TurboQuantPresets

Named preset configurations for TurboQuant KV-cache compression.

The presets reflect the practical observation that output quality is often more sensitive to key precision than to value precision, which is why the conservative preset keeps keys at higher precision than values.

Available presets:

  • safe-lowbit: Q8_0 keys + TurboQuant-4 values (conservative)

  • balanced: TurboQuant-4 keys + TurboQuant-4 values

  • experimental-max: TurboQuant-3 keys + TurboQuant-3 values (aggressive)
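
To make the trade-off between the presets concrete, the following sketch estimates the KV-cache payload for each one. It is not the library's implementation: `CacheFormat`, `PresetSketch`, and `approxCacheBytes` are hypothetical names, the bit widths are nominal values read off the format names, and per-block metadata such as quantization scales is ignored. It also assumes `numHeads` counts the heads whose keys and values are cached.

```kotlin
// Hypothetical stand-ins for illustration; not the library's actual types.
enum class CacheFormat(val nominalBits: Int) {
    Q8_0(8),          // nominal width; real block formats also store scale metadata
    TURBOQUANT_4(4),  // assumed 4 bits per element
    TURBOQUANT_3(3)   // assumed 3 bits per element
}

data class PresetSketch(val name: String, val keys: CacheFormat, val values: CacheFormat)

// Key/value formats as given in the preset list above.
val presetSketches = listOf(
    PresetSketch("safe-lowbit", CacheFormat.Q8_0, CacheFormat.TURBOQUANT_4),
    PresetSketch("balanced", CacheFormat.TURBOQUANT_4, CacheFormat.TURBOQUANT_4),
    PresetSketch("experimental-max", CacheFormat.TURBOQUANT_3, CacheFormat.TURBOQUANT_3)
)

// Approximate payload in bytes for keys plus values, ignoring per-block metadata.
fun approxCacheBytes(p: PresetSketch, numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): Long {
    val elementsPerTensor = numLayers.toLong() * numHeads * headDim * maxSeqLen
    val totalBits = elementsPerTensor * (p.keys.nominalBits + p.values.nominalBits)
    return totalBits / 8
}

fun main() {
    for (p in presetSketches) {
        val bytes = approxCacheBytes(p, numLayers = 32, numHeads = 8, headDim = 128, maxSeqLen = 4096)
        println("${p.name}: ${bytes / (1024 * 1024)} MiB")
    }
    // Prints:
    // safe-lowbit: 192 MiB
    // balanced: 128 MiB
    // experimental-max: 96 MiB
}
```

Under these assumptions, an unquantized 16-bit cache with the same dimensions would take 512 MiB, so the three presets correspond to roughly 2.7x, 4x, and 5.3x compression before metadata overhead.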

Properties

List all available preset names.

Functions

fun balanced(numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): TurboQuantPreset

Balanced preset: TurboQuant-4 for both keys and values.

fun experimentalMax(numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): TurboQuantPreset

Experimental maximum-compression preset: TurboQuant-3 for both keys and values.

fun forModel(preset: String, numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): TurboQuantPreset

Look up a preset by name and apply model dimensions.
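
A name-based lookup of this kind might be sketched as follows. The block is self-contained and uses hypothetical stand-ins (`SketchPreset`, `SketchPresets`, and the `names` property) instead of the real library types, whose fields and package are not shown on this page. In particular, throwing `IllegalArgumentException` for an unknown name is an assumption; the actual function may handle that case differently.

```kotlin
// Hypothetical stand-in for the library's TurboQuantPreset; the fields are assumed.
data class SketchPreset(
    val keyFormat: String,
    val valueFormat: String,
    val numLayers: Int,
    val numHeads: Int,
    val headDim: Int,
    val maxSeqLen: Int
)

object SketchPresets {
    // Preset name -> (key format, value format), as listed in the overview.
    private val formats = mapOf(
        "safe-lowbit" to ("Q8_0" to "TurboQuant-4"),
        "balanced" to ("TurboQuant-4" to "TurboQuant-4"),
        "experimental-max" to ("TurboQuant-3" to "TurboQuant-3")
    )

    // Hypothetical property name; mirrors "list all available preset names".
    val names: List<String> get() = formats.keys.toList()

    fun forModel(preset: String, numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): SketchPreset {
        // Assumption: unknown names are rejected; the real library may behave differently.
        val (keyFormat, valueFormat) = formats[preset]
            ?: throw IllegalArgumentException("Unknown preset '$preset'; expected one of $names")
        return SketchPreset(keyFormat, valueFormat, numLayers, numHeads, headDim, maxSeqLen)
    }
}

fun main() {
    val preset = SketchPresets.forModel("safe-lowbit", numLayers = 32, numHeads = 8, headDim = 128, maxSeqLen = 4096)
    println("${preset.keyFormat} keys, ${preset.valueFormat} values")
    // Prints: Q8_0 keys, TurboQuant-4 values
}
```

Looking presets up by string is convenient when the name comes from a configuration file or command-line flag; when the choice is fixed at compile time, the dedicated functions such as `balanced(...)` avoid the possibility of a misspelled name.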

fun safeLowbit(numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): TurboQuantPreset

Safe low-bit preset: Q8_0 for keys, TurboQuant-4 for values.