q8

fun q8(numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): KvCacheConfig

Returns a KvCacheConfig whose key (K) and value (V) caches are both stored in Q8_0-compressed form.
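
To give a rough sense of what a Q8_0 cache costs in memory, the sketch below estimates the combined K and V footprint from the same four parameters. It is not part of this library: `estimateQ8KvCacheBytes` and both constants are hypothetical, and the block layout (32 8-bit values sharing one 2-byte scale, as in GGML's Q8_0 format) is an assumption about how this cache stores its data. It also assumes `numHeads` counts the heads actually held in the cache.

```kotlin
// Hypothetical helper, not part of the library. ASSUMES a GGML-style Q8_0
// layout: each block holds 32 int8 values plus one 2-byte fp16 scale.
const val Q8_0_BLOCK_ELEMS = 32L // assumed number of values per block
const val Q8_0_BLOCK_BYTES = 34L // assumed: 32 bytes of int8 data + 2-byte scale

fun estimateQ8KvCacheBytes(numLayers: Int, numHeads: Int, headDim: Int, maxSeqLen: Int): Long {
    // Under the assumed layout, each head vector is split into whole blocks.
    require(headDim % Q8_0_BLOCK_ELEMS == 0L) { "headDim must be a multiple of 32" }
    // Element count of one tensor (K or V) across all layers, heads, and positions.
    val elemsPerTensor = numLayers.toLong() * numHeads * headDim * maxSeqLen
    val bytesPerTensor = elemsPerTensor / Q8_0_BLOCK_ELEMS * Q8_0_BLOCK_BYTES
    return 2 * bytesPerTensor // K and V are both compressed
}

fun main() {
    val bytes = estimateQ8KvCacheBytes(numLayers = 32, numHeads = 32, headDim = 128, maxSeqLen = 4096)
    println("Estimated Q8_0 KV cache: ${bytes / (1024 * 1024)} MiB") // prints 1088 MiB
}
```

Under these assumptions each cached value costs 8.5 bits, so the estimate is a little over half the size of a 16-bit cache of the same shape. The same arguments would be passed to the real function as `q8(numLayers = 32, numHeads = 32, headDim = 128, maxSeqLen = 4096)`.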