Q4_KTensorData

Tensor data interface for Q4_K quantized format.

Q4_K block format (256 elements per block, 144 bytes per block):

  • 2 bytes: f16 d (main scale)

  • 2 bytes: f16 dMin (minimum scale)

  • 12 bytes: packed scales (8 sub-blocks × 12 bits each = 96 bits = 12 bytes)

  • 128 bytes: 4-bit quantized codes (256 elements / 2 = 128 bytes)
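The byte counts above can be turned into offsets within a block. The following is a minimal sketch, assuming the four fields are stored in the order listed and that blocks are laid out back to back in the packed data; all constant and function names are hypothetical and not part of the documented API.

```kotlin
// Sizes taken from the block format description.
const val Q4K_BLOCK_ELEMENTS = 256
const val Q4K_D_BYTES = 2        // f16 d
const val Q4K_DMIN_BYTES = 2     // f16 dMin
const val Q4K_SCALES_BYTES = 12  // 8 sub-blocks x 12 bits
const val Q4K_CODES_BYTES = Q4K_BLOCK_ELEMENTS / 2 // two 4-bit codes per byte

// Total block size: 2 + 2 + 12 + 128 = 144 bytes.
const val Q4K_BLOCK_BYTES =
    Q4K_D_BYTES + Q4K_DMIN_BYTES + Q4K_SCALES_BYTES + Q4K_CODES_BYTES

// Field offsets within one block (assumption: fields appear in the listed order).
const val Q4K_D_OFFSET = 0
const val Q4K_DMIN_OFFSET = Q4K_D_OFFSET + Q4K_D_BYTES
const val Q4K_SCALES_OFFSET = Q4K_DMIN_OFFSET + Q4K_DMIN_BYTES
const val Q4K_CODES_OFFSET = Q4K_SCALES_OFFSET + Q4K_SCALES_BYTES

// Byte offset of a block in the packed data (assumption: contiguous blocks).
fun q4kBlockOffset(blockIdx: Int): Int = blockIdx * Q4K_BLOCK_BYTES

// Packed size expected for a given number of blocks.
fun q4kPackedSize(blockCount: Int): Int = blockCount * Q4K_BLOCK_BYTES
```

Deriving the offsets from the sizes, rather than hard-coding them, keeps the 144-byte total and the field positions consistent with each other.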

Each sub-block (32 elements):

  • 6-bit scale index (0..63)

  • 6-bit min index (0..63)

  • scale = d * (scaleIdx / 63)

  • min = dMin * (minIdx / 63)

Dequantization: output[i] = code[i] * scale + min, where scale and min are those of the sub-block containing element i
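The sub-block formulas can be written out directly. This is a sketch of the arithmetic only, with hypothetical function names; it takes the already-decoded d, dMin, and 6-bit indices as inputs and does not show how they are unpacked from the block.

```kotlin
// scale = d * (scaleIdx / 63), with scaleIdx in 0..63
fun q4kSubBlockScale(d: Float, scaleIdx: Int): Float {
    require(scaleIdx in 0..63) { "scaleIdx must be a 6-bit value" }
    return d * (scaleIdx / 63f)
}

// min = dMin * (minIdx / 63), with minIdx in 0..63
fun q4kSubBlockMin(dMin: Float, minIdx: Int): Float {
    require(minIdx in 0..63) { "minIdx must be a 6-bit value" }
    return dMin * (minIdx / 63f)
}

// output = code * scale + min, with code in 0..15
fun q4kDequantizeCode(code: Int, scale: Float, min: Float): Float {
    require(code in 0..15) { "code must be a 4-bit value" }
    return code * scale + min
}
```

With these definitions, an index of 63 reproduces d (or dMin) exactly and an index of 0 yields zero.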

Inheritors

Types

object Companion

Properties

abstract val blockCount: Int

Number of Q4_K blocks in the tensor.

abstract val packedData: ByteArray

Raw packed data containing all blocks.

Functions

abstract fun getBlockD(blockIdx: Int): Float

Get the main scale factor (d) for a block.

abstract fun getBlockDMin(blockIdx: Int): Float

Get the minimum scale factor (dMin) for a block.

abstract fun getCode(blockIdx: Int, elementIdx: Int): Int

Get the 4-bit quantized code (0..15) of the element at elementIdx (0..255) within a block.
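Since two 4-bit codes share one byte, reading a code means selecting a nibble. The sketch below assumes that the even-indexed element sits in the low nibble and the odd-indexed element in the high nibble; the page does not state the nibble order, so treat that ordering and the function name as hypothetical.

```kotlin
// Reads one 4-bit code from a packed code array (two codes per byte).
// Assumption: even elementIdx -> low nibble, odd elementIdx -> high nibble.
fun q4kExtractCode(codes: ByteArray, elementIdx: Int): Int {
    require(elementIdx in 0 until codes.size * 2) { "elementIdx out of range" }
    // Mask to 0..255 so the sign bit of the Byte does not leak into the result.
    val packed = codes[elementIdx / 2].toInt() and 0xFF
    return if (elementIdx % 2 == 0) packed and 0x0F else packed ushr 4
}
```

An implementation that uses the opposite nibble order would only need the two branches swapped.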

abstract fun getSubBlockMin(blockIdx: Int, subBlockIdx: Int): Float

Get the minimum value for a specific sub-block within a block.

abstract fun getSubBlockScale(blockIdx: Int, subBlockIdx: Int): Float

Get the scale for a specific sub-block within a block.


Dequantize Q4_K tensor data to a FloatArray: output[i] = code[i] * scale + min
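A full dequantization pass can be sketched in terms of the accessors listed on this page. The interface below is a hypothetical stand-in that mirrors the documented member names so the example is self-contained; it is not the library's declaration, and the sketch assumes that getSubBlockScale and getSubBlockMin return the final scale and min values rather than the raw 6-bit indices.

```kotlin
// Hypothetical stand-in mirroring the documented members.
interface Q4KSource {
    val blockCount: Int
    fun getCode(blockIdx: Int, elementIdx: Int): Int
    fun getSubBlockScale(blockIdx: Int, subBlockIdx: Int): Float
    fun getSubBlockMin(blockIdx: Int, subBlockIdx: Int): Float
}

// Illustrative only: applies output[i] = code[i] * scale + min per sub-block.
fun dequantizeSketch(src: Q4KSource): FloatArray {
    val elementsPerBlock = 256
    val subBlocks = 8
    val elementsPerSubBlock = 32
    val out = FloatArray(src.blockCount * elementsPerBlock)
    for (b in 0 until src.blockCount) {
        for (s in 0 until subBlocks) {
            // Scale and min are shared by all 32 elements of the sub-block.
            val scale = src.getSubBlockScale(b, s)
            val min = src.getSubBlockMin(b, s)
            for (j in 0 until elementsPerSubBlock) {
                val e = s * elementsPerSubBlock + j
                out[b * elementsPerBlock + e] = src.getCode(b, e) * scale + min
            }
        }
    }
    return out
}
```

Fetching scale and min once per sub-block, instead of once per element, avoids 31 redundant lookups for every 32 outputs.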