Q4_KBlockTensorData
class Q4_KBlockTensorData(initialShape: Shape, data: ByteArray) : Q4_KTensorData, PackedBlockStorage
Implementation of Q4_KTensorData backed by a packed byte array.
Memory layout per block (144 bytes):
bytes 0..1: f16 d (little-endian)
bytes 2..3: f16 dMin (little-endian)
bytes 4..15: packed 12-bit scale/min indices (12 bytes)
bytes 16..143: 4-bit quantized codes (128 bytes, 2 codes per byte)
Scale packing: Each sub-block uses 12 bits (6 for scaleIdx, 6 for minIdx). 8 sub-blocks × 12 bits = 96 bits = 12 bytes.
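The layout above can be illustrated with a minimal sketch that locates a block in the packed array and decodes its two f16 header fields. The constant and function names below (BLOCK_BYTES, halfToFloat, readHalfLE, blockD, blockDMin) are hypothetical helpers written for this illustration and are not part of the documented API. The 256-elements-per-block figure follows from 128 code bytes holding 2 codes each.

```kotlin
// Sizes and offsets taken from the per-block layout described above.
const val BLOCK_BYTES = 144
const val SCALES_OFFSET = 4          // start of the 12 packed scale/min bytes
const val CODES_OFFSET = 16          // start of the 128 code bytes
const val ELEMENTS_PER_BLOCK = 256   // 128 bytes * 2 codes per byte

/** Decodes an IEEE 754 half-precision value given as its 16 raw bits. */
fun halfToFloat(bits: Int): Float {
    val sign = (bits ushr 15) and 0x1
    val exp = (bits ushr 10) and 0x1F
    val mant = bits and 0x3FF
    val magnitude = when (exp) {
        // Zero and subnormals: mant * 2^-24.
        0 -> mant * (1.0f / (1 shl 24))
        // All-ones exponent: infinity or NaN.
        0x1F -> if (mant == 0) Float.POSITIVE_INFINITY else Float.NaN
        // Normal numbers: rebias the exponent (15 -> 127) and widen the mantissa.
        else -> Float.fromBits(((exp + 112) shl 23) or (mant shl 13))
    }
    return if (sign == 1) -magnitude else magnitude
}

/** Reads a little-endian f16 at the given byte offset. */
fun readHalfLE(data: ByteArray, offset: Int): Float {
    val lo = data[offset].toInt() and 0xFF
    val hi = data[offset + 1].toInt() and 0xFF
    return halfToFloat((hi shl 8) or lo)
}

/** Block scale d: bytes 0..1 of the block. */
fun blockD(data: ByteArray, blockIdx: Int): Float =
    readHalfLE(data, blockIdx * BLOCK_BYTES)

/** Block minimum scale dMin: bytes 2..3 of the block. */
fun blockDMin(data: ByteArray, blockIdx: Int): Float =
    readHalfLE(data, blockIdx * BLOCK_BYTES + 2)
```

Under this layout, a tensor of N elements would occupy (N / 256) * 144 bytes, assuming N is a multiple of the block size.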
Parameters
initialShape
the logical shape of the tensor (in elements, not blocks)
packedData
the raw packed block data
Properties
Functions
Dequantize a single block to float values.
Get the minimum scale factor (dMin) for a block.
Get the minimum value for a specific sub-block within a block.
Get the scale for a specific sub-block within a block.
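As a rough sketch of how the sub-block accessors and block dequantization might fit together, the code below unpacks one 12-bit scale/min field and dequantizes one sub-block of 32 codes. Several points are assumptions, since this page does not specify them: that sub-block j occupies bits 12j..12j+11 of the 12 scale bytes read as a little-endian bit string, with scaleIdx in the low 6 bits; that the low nibble of each code byte comes first; and that values are reconstructed as d * scaleIdx * code - dMin * minIdx, the usual convention for Q4_K-style formats. The actual class may differ on any of these. All function names here are hypothetical.

```kotlin
const val SUB_BLOCKS = 8
const val CODES_PER_SUB_BLOCK = 32   // 256 elements per block / 8 sub-blocks

/**
 * Returns (scaleIdx, minIdx) for one sub-block.
 * ASSUMPTION: fields are packed back to back, least significant bit first,
 * with scaleIdx in the low 6 bits and minIdx in the high 6 bits.
 */
fun subBlockIndices(data: ByteArray, scalesOffset: Int, subBlock: Int): Pair<Int, Int> {
    require(subBlock in 0 until SUB_BLOCKS) { "subBlock out of range: $subBlock" }
    val bitPos = subBlock * 12
    val byteIdx = scalesOffset + bitPos / 8
    val shift = bitPos % 8               // always 0 or 4
    val b0 = data[byteIdx].toInt() and 0xFF
    val b1 = data[byteIdx + 1].toInt() and 0xFF
    val field = (((b1 shl 8) or b0) ushr shift) and 0xFFF
    return Pair(field and 0x3F, field ushr 6)
}

/**
 * Dequantizes the 32 codes (16 bytes) of one sub-block.
 * ASSUMPTION: value = d * scaleIdx * code - dMin * minIdx, low nibble first.
 */
fun dequantizeSubBlock(
    d: Float, dMin: Float, scaleIdx: Int, minIdx: Int,
    data: ByteArray, codesOffset: Int
): FloatArray {
    val scale = d * scaleIdx      // effective scale for this sub-block
    val min = dMin * minIdx       // effective minimum for this sub-block
    val out = FloatArray(CODES_PER_SUB_BLOCK)
    for (k in 0 until CODES_PER_SUB_BLOCK / 2) {
        val b = data[codesOffset + k].toInt() and 0xFF
        out[2 * k] = scale * (b and 0x0F) - min
        out[2 * k + 1] = scale * (b ushr 4) - min
    }
    return out
}
```

Dequantizing a whole block would then repeat this for all 8 sub-blocks, advancing the code offset by 16 bytes each time.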