resolve

fun resolve(annotation: KvCache, numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): KvCacheStore

Resolve a KvCache annotation to a KvCacheStore.

Parameters

annotation

The @KvCache annotation values

numLayers

Number of transformer layers

numHeads

Number of KV heads per layer

headDim

Dimension per head

maxSeqLen

Maximum sequence length. If the annotation specifies a value greater than 0, the annotation's value is used instead.
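
The override rule for maxSeqLen can be sketched in isolation. This is only an illustration of the stated rule, not the library's implementation: effectiveMaxSeqLen and both of its parameter names are hypothetical, and the name of the corresponding field on @KvCache is not given in this page.

```kotlin
// Hypothetical helper, not part of the documented API. It mirrors only the
// documented rule: an annotation value greater than 0 replaces the argument.
fun effectiveMaxSeqLen(annotationMaxSeqLen: Int, argumentMaxSeqLen: Int): Int =
    if (annotationMaxSeqLen > 0) annotationMaxSeqLen else argumentMaxSeqLen

fun main() {
    // Annotation leaves the length unset (0), so the argument is kept.
    println(effectiveMaxSeqLen(annotationMaxSeqLen = 0, argumentMaxSeqLen = 2048))    // prints 2048
    // Annotation sets a positive length, so it takes precedence.
    println(effectiveMaxSeqLen(annotationMaxSeqLen = 4096, argumentMaxSeqLen = 2048)) // prints 4096
}
```

Under this reading, the maxSeqLen argument acts as a fallback for annotations that leave the length unset.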


fun resolve(preset: String, numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): KvCacheStore

Resolve a preset name string to a KvCacheStore.

Convenience overload for callers that have only the preset name rather than a full @KvCache annotation.

Parameters

preset

One of "dense", "safe-lowbit", "balanced", or "experimental-max"

numLayers

Number of transformer layers

numHeads

Number of KV heads per layer

headDim

Dimension per head

maxSeqLen

Maximum sequence length
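
Because preset is a plain String, a typo is not caught at compile time, and this page does not say how resolve treats an unrecognized name. A minimal sketch of a caller-side guard follows; knownPresets and requireKnownPreset are hypothetical names, and the receiver and argument values in the commented-out call are assumptions.

```kotlin
// The four preset names listed for the preset parameter.
val knownPresets = setOf("dense", "safe-lowbit", "balanced", "experimental-max")

// Hypothetical guard, not part of the documented API: fail early on an
// unrecognized name with a message that lists the accepted values.
fun requireKnownPreset(preset: String): String {
    require(preset in knownPresets) {
        "Unknown KV cache preset '$preset'; expected one of $knownPresets"
    }
    return preset
}

fun main() {
    val preset = requireKnownPreset("balanced")
    // With the library on the classpath, the call might then look like this
    // (receiver and sizes are assumptions, shown for illustration only):
    // val store = resolve(preset, numLayers = 32, numHeads = 8, headDim = 128, maxSeqLen = 4096)
    println(preset) // prints balanced
}
```

Passing a name outside the documented set, such as requireKnownPreset("fast"), throws IllegalArgumentException before resolve is reached.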