forModel

fun forModel(preset: String, numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): TurboQuantPreset

Look up a preset by name and apply model dimensions.

This is the primary entry point for skainet-transformers and other consumers that want to enable TurboQuant with a single call.

Example:

val preset = TurboQuantPresets.forModel(
    "balanced", numLayers = 32, numHeads = 32, headDim = 128, maxSeqLen = 4096
)
val cache = KvCacheStore.fromPreset(preset)

Parameters

preset

One of "safe-lowbit", "balanced", "experimental-max"

numLayers

Number of transformer layers

numHeads

Number of KV heads per layer

headDim

Dimension per head

maxSeqLen

Maximum sequence length
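
To give a feel for what the four dimensions imply, the sketch below counts the scalar values a KV cache holds for them and converts that count to bytes at a chosen storage width. Both helpers (kvCacheElements, kvCacheBytes) are hypothetical and not part of the TurboQuant API. The bit widths passed in are illustrative assumptions, since the width each preset uses is not stated here, and the count assumes one key and one value vector per head, layer, and position.

```kotlin
// Hypothetical helper, not part of the TurboQuant API: counts the scalar
// values a KV cache holds for the given model dimensions.
fun kvCacheElements(numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): Long =
    2L * numLayers * numHeads * headDim * maxSeqLen // 2 = one key tensor + one value tensor

// Bytes needed at a given storage width, rounded up to a whole byte.
// bitsPerElement is an assumed input, not a value taken from any preset.
fun kvCacheBytes(elements: Long, bitsPerElement: Int): Long =
    (elements * bitsPerElement + 7) / 8

fun main() {
    // Same dimensions as the forModel example above.
    val elements = kvCacheElements(numLayers = 32, numHeads = 32, headDim = 128, maxSeqLen = 4096)
    println("elements     = $elements")
    println("16-bit bytes = ${kvCacheBytes(elements, 16)}")
    println("4-bit bytes  = ${kvCacheBytes(elements, 4)}")
}
```

The arithmetic uses Long throughout because the byte count for these dimensions already exceeds the Int range at 16 bits per element.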

Throws

If preset is not one of the names listed above.
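
Because an unknown name makes the call fail, a caller that takes the preset name from configuration or user input may prefer to check it first. The sketch below is one way to do that, using only the three names listed for preset. The helper resolvePresetName and the choice of "balanced" as the fallback are assumptions for illustration, not library behavior, and the comparison is exact because it is not stated whether the library normalizes case.

```kotlin
// The three preset names listed in the documentation above.
val knownPresets = setOf("safe-lowbit", "balanced", "experimental-max")

// Hypothetical helper, not part of the TurboQuant API: returns the requested
// name if it is a documented preset, otherwise the fallback.
fun resolvePresetName(requested: String, fallback: String = "balanced"): String =
    if (requested in knownPresets) requested else fallback

fun main() {
    println(resolvePresetName("experimental-max"))
    println(resolvePresetName("ultra")) // not a documented name, so the fallback is used
}
```

The returned string can then be passed as the preset argument to forModel.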