resolve
fun resolve(annotation: KvCache, numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): KvCacheStore(source)
Resolve a KvCache annotation to a KvCacheStore.
Parameters
annotation
The @KvCache annotation values
numLayers
Number of transformer layers
numHeads
Number of KV heads per layer
headDim
Dimension per head
maxSeqLen
Maximum sequence length (overridden by annotation if > 0)
fun resolve(preset: String, numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): KvCacheStore(source)
Resolve a preset name string to a KvCacheStore.
Convenience for when you have the preset name but not the full annotation.
Parameters
preset
"dense", "safe-lowbit", "balanced", or "experimental-max"
numLayers
Number of transformer layers
numHeads
Number of KV heads per layer
headDim
Dimension per head
maxSeqLen
Maximum sequence length