forModel
fun forModel(preset: String, numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): TurboQuantPreset
Look up a preset by name and apply model dimensions.
This is the primary entry point for skainet-transformers and other consumers that want to enable TurboQuant with a single call.
Example:
val preset = TurboQuantPresets.forModel("balanced", numLayers = 32, numHeads = 32, headDim = 128, maxSeqLen = 4096)
val cache = KvCacheStore.fromPreset(preset)
Parameters
preset
Name of the preset to look up. One of "safe-lowbit", "balanced", "experimental-max"
numLayers
Number of transformer layers
numHeads
Number of KV heads per layer
headDim
Dimension per head
maxSeqLen
Maximum sequence length
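To give a feel for why the four dimensions matter, the sketch below counts the elements a full KV cache would hold and converts that count to bytes at a given bit width. Both kvCacheElements and kvCacheBytes are hypothetical helpers, not part of the documented API, and the bit widths are illustrative only, since this page does not say how many bits each preset uses.

```kotlin
// Hypothetical helper, not part of the documented API: counts the key and
// value elements a full KV cache holds for the given model dimensions.
fun kvCacheElements(numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): Long =
    2L * numLayers * numHeads * headDim * maxSeqLen // 2 = one key tensor + one value tensor

// Approximate payload size in bytes at a given bit width per element.
// Ignores any per-block scales or metadata a real quantizer would add.
fun kvCacheBytes(elements: Long, bitsPerElement: Int): Long =
    elements * bitsPerElement / 8

fun main() {
    val elements = kvCacheElements(numLayers = 32, numHeads = 32, headDim = 128, maxSeqLen = 4096)
    println(elements)                   // 1073741824
    println(kvCacheBytes(elements, 16)) // 2147483648 (2 GiB at 16 bits, illustrative)
    println(kvCacheBytes(elements, 4))  // 536870912 (512 MiB at 4 bits, illustrative)
}
```

With the dimensions from the example call, an unquantized 16-bit cache would need about 2 GiB for the raw key and value data, which is the cost a low-bit preset is meant to reduce.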
Throws
if the preset name is unknown
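The lookup-then-apply behavior described here can be sketched as follows. This is a minimal stand-in, not the library's implementation: SketchPreset, knownPresets, and sketchForModel are hypothetical names, the real TurboQuantPreset fields are not shown on this page, and the use of IllegalArgumentException for an unknown name is an assumption.

```kotlin
// Hypothetical stand-in type; the real TurboQuantPreset fields are not documented here.
data class SketchPreset(
    val name: String,
    val numLayers: Int,
    val numHeads: Int,
    val headDim: Int,
    val maxSeqLen: Int,
)

// Preset names taken from the parameter description above.
val knownPresets = setOf("safe-lowbit", "balanced", "experimental-max")

fun sketchForModel(
    preset: String,
    numLayers: Int,
    numHeads: Int,
    headDim: Int,
    maxSeqLen: Int,
): SketchPreset {
    // Assumption: an unknown name is rejected with IllegalArgumentException.
    require(preset in knownPresets) { "Unknown preset '$preset'; expected one of $knownPresets" }
    return SketchPreset(preset, numLayers, numHeads, headDim, maxSeqLen)
}

fun main() {
    val ok = sketchForModel("balanced", numLayers = 32, numHeads = 32, headDim = 128, maxSeqLen = 4096)
    println(ok.name) // balanced

    // A misspelled or unsupported name fails instead of silently falling back.
    val failure = runCatching { sketchForModel("fastest", 32, 32, 128, 4096) }
    println(failure.isFailure) // true
}
```

Callers that take the preset name from user configuration may want to wrap the real call in a similar runCatching or try/catch so that a typo surfaces as a clear error message.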