Companion

Functions

fun dense(numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): KvCacheConfig

Uncompressed FP32 cache (baseline).

fun q8(numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): KvCacheConfig

Q8_0-compressed cache for both K and V.
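To give a rough sense of what the two factories trade off, the sketch below estimates the cache footprint for each layout. It is not part of this API: `kvElements`, `denseKvBytes`, and `q8KvBytes` are hypothetical helper names, and the Q8_0 figure assumes the ggml-style block layout (32 one-byte quantized values plus a 2-byte FP16 scale per block), which this page does not confirm for this library.

```kotlin
// Hypothetical helpers for illustration only; not part of the documented API.

// Total number of cached scalars: one K and one V tensor per layer,
// each of shape numHeads x maxSeqLen x headDim.
fun kvElements(numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): Long =
    2L * numLayers * numHeads * headDim * maxSeqLen

// Uncompressed FP32 cache: 4 bytes per scalar.
fun denseKvBytes(numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): Long =
    kvElements(numLayers, numHeads, headDim, maxSeqLen) * 4L

// Q8_0 cache, ASSUMING the ggml-style layout: blocks of 32 int8 values
// plus one FP16 scale, i.e. 34 bytes per 32 scalars.
fun q8KvBytes(numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): Long {
    require(headDim % 32 == 0) { "this sketch assumes headDim is a multiple of the block size (32)" }
    val blocks = kvElements(numLayers, numHeads, headDim, maxSeqLen) / 32L
    return blocks * 34L
}

fun main() {
    val dense = denseKvBytes(numLayers = 32, numHeads = 32, headDim = 128, maxSeqLen = 4096)
    val q8 = q8KvBytes(numLayers = 32, numHeads = 32, headDim = 128, maxSeqLen = 4096)
    println("dense FP32: $dense bytes")
    println("Q8_0:       $q8 bytes")
    println("ratio:      ${dense.toDouble() / q8}")
}
```

Under these assumptions, a model with 32 layers, 32 heads, a head dimension of 128, and a 4096-token window needs 4 GiB for the FP32 cache and about 1.06 GiB for the Q8_0 cache, a reduction of roughly 3.76x. The actual sizes depend on how the library lays out its blocks and whether `numHeads` here refers to query heads or key/value heads.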