Q8_0TensorData

interface Q8_0TensorData : TensorData<DType, Byte>

Tensor data interface for Q8_0 quantized format.

Q8_0 block format (32 elements per block, 34 bytes per block):

2 bytes: f16 scale
32 bytes: int8 quantized codes

Dequantization: output[i] = code[i] * scale, where scale is the scale of the block that contains element i.
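The block layout and dequantization rule above can be sketched as a small decoder over raw bytes. This is an illustrative sketch, not the library's implementation: the names halfToFloat and dequantizeBlock are hypothetical, and it assumes the f16 scale is stored little-endian at the start of each block, which this page does not state.

```kotlin
import kotlin.math.pow

const val BLOCK_ELEMENTS = 32   // elements per Q8_0 block
const val BLOCK_BYTES = 34      // 2-byte f16 scale + 32 int8 codes

// Convert an IEEE 754 half-precision bit pattern (low 16 bits) to Float.
fun halfToFloat(bits: Int): Float {
    val sign = (bits ushr 15) and 0x1
    val exp = (bits ushr 10) and 0x1F
    val mant = bits and 0x3FF
    val magnitude: Float = when (exp) {
        0 -> mant * 2.0f.pow(-24)                                  // zero or subnormal
        0x1F -> if (mant == 0) Float.POSITIVE_INFINITY else Float.NaN
        else -> (1.0f + mant / 1024.0f) * 2.0f.pow(exp - 15)       // normal number
    }
    return if (sign == 1) -magnitude else magnitude
}

// Dequantize one block: output[i] = code[i] * scale.
// Assumption: the scale is little-endian and precedes the codes.
fun dequantizeBlock(packed: ByteArray, blockIdx: Int): FloatArray {
    val base = blockIdx * BLOCK_BYTES
    val lo = packed[base].toInt() and 0xFF
    val hi = packed[base + 1].toInt() and 0xFF
    val scale = halfToFloat((hi shl 8) or lo)
    return FloatArray(BLOCK_ELEMENTS) { i -> packed[base + 2 + i].toInt() * scale }
}
```

The codes are signed bytes, so they are widened with toInt() without masking; only the two scale bytes are masked to avoid sign extension.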

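This interface exposes block scales and codes so that matmul kernels can operate on the quantized data directly, without dequantizing the whole tensor first. A block occupies 34 bytes for 32 elements, versus 128 bytes for the same values stored as 32-bit floats, which reduces memory use and can reduce compute during inference.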
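To illustrate what operating on quantized data directly can look like, the sketch below computes a dot product of two Q8_0 vectors by accumulating integer code products per block and applying the two scales once per block. It is an assumption-laden example: Q8Blocks is a hypothetical stand-in that mirrors the members documented on this page, ArrayQ8Blocks is a hypothetical test implementation, and q8Dot is not a library function.

```kotlin
// Hypothetical stand-in mirroring the documented members; not the library type.
interface Q8Blocks {
    val blockCount: Int
    fun getBlockScale(blockIdx: Int): Float
    fun getCode(blockIdx: Int, elementIdx: Int): Byte
}

// Hypothetical array-backed implementation, used only for illustration.
class ArrayQ8Blocks(
    private val scales: FloatArray,
    private val codes: ByteArray
) : Q8Blocks {
    override val blockCount: Int get() = scales.size
    override fun getBlockScale(blockIdx: Int): Float = scales[blockIdx]
    override fun getCode(blockIdx: Int, elementIdx: Int): Byte =
        codes[blockIdx * 32 + elementIdx]
}

// Dot product without materializing a dequantized FloatArray:
// sum over blocks of (scaleA * scaleB * sum_i codeA[i] * codeB[i]).
fun q8Dot(a: Q8Blocks, b: Q8Blocks): Float {
    require(a.blockCount == b.blockCount) { "block counts must match" }
    var total = 0.0f
    for (blk in 0 until a.blockCount) {
        var acc = 0 // integer accumulator; at most 32 * 128 * 128 in magnitude
        for (i in 0 until 32) {
            acc += a.getCode(blk, i) * b.getCode(blk, i)
        }
        total += acc * a.getBlockScale(blk) * b.getBlockScale(blk)
    }
    return total
}
```

Only one floating-point multiply-add per block touches the scales; the inner loop stays in integer arithmetic, which is the usual motivation for this structure.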

Inheritors

Q8_0BlockTensorData

Types

Companion

object Companion

Properties

blockCount

abstract val blockCount: Int

Number of Q8_0 blocks in the tensor.

packedData

abstract val packedData: ByteArray

Raw packed data containing all blocks.
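The relationship between element count, blockCount, and the size of packedData follows from the 32-element, 34-byte block format. The helpers below are a hypothetical sketch: the function names are made up, and it assumes the element count is a multiple of 32 and that blocks are stored back to back with no padding, neither of which this page confirms.

```kotlin
const val Q8_0_BLOCK_ELEMENTS = 32
const val Q8_0_BLOCK_BYTES = 34

// Number of blocks needed for a tensor with the given element count.
// Assumption: partial blocks are not allowed.
fun q8BlockCount(elementCount: Int): Int {
    require(elementCount >= 0 && elementCount % Q8_0_BLOCK_ELEMENTS == 0) {
        "element count must be a non-negative multiple of $Q8_0_BLOCK_ELEMENTS"
    }
    return elementCount / Q8_0_BLOCK_ELEMENTS
}

// Expected size in bytes of the packed data for that many elements.
fun q8PackedSize(elementCount: Int): Int =
    q8BlockCount(elementCount) * Q8_0_BLOCK_BYTES

// Byte offset of a block's first byte, assuming contiguous blocks.
fun q8BlockOffset(blockIdx: Int): Int = blockIdx * Q8_0_BLOCK_BYTES
```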

Functions

getBlockScale

abstract fun getBlockScale(blockIdx: Int): Float

Get the scale factor for a specific block.

getCode

abstract fun getCode(blockIdx: Int, elementIdx: Int): Byte

Get the quantized code at position elementIdx (0..31) within a block.

toFloatArray

fun Q8_0TensorData.toFloatArray(): FloatArray

Dequantize Q8_0 tensor data to a FloatArray: output[i] = code[i] * scale, where scale is the scale of the block that contains element i.
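A full dequantization of this kind could be written against the accessor functions alone. The sketch below is not the library's toFloatArray; Q8Source is a hypothetical stand-in for the documented members, SampleQ8 is a hypothetical implementation for demonstration, and the output is assumed to contain exactly blockCount * 32 elements in block order.

```kotlin
// Hypothetical stand-in mirroring the documented members; not the library type.
interface Q8Source {
    val blockCount: Int
    fun getBlockScale(blockIdx: Int): Float
    fun getCode(blockIdx: Int, elementIdx: Int): Byte
}

// Dequantize every block: output[blk * 32 + i] = code[i] * scale(blk).
fun Q8Source.dequantizeAll(): FloatArray {
    val out = FloatArray(blockCount * 32)
    for (blk in 0 until blockCount) {
        val scale = getBlockScale(blk) // read the scale once per block
        for (i in 0 until 32) {
            out[blk * 32 + i] = getCode(blk, i).toInt() * scale
        }
    }
    return out
}

// Hypothetical sample data: code i - 16 at each position, one scale per block.
class SampleQ8(private val scales: FloatArray) : Q8Source {
    override val blockCount: Int get() = scales.size
    override fun getBlockScale(blockIdx: Int): Float = scales[blockIdx]
    override fun getCode(blockIdx: Int, elementIdx: Int): Byte =
        (elementIdx - 16).toByte()
}
```

For example, SampleQ8(floatArrayOf(0.5f, 2.0f)).dequantizeAll() yields 64 floats, with the first block scaled by 0.5 and the second by 2.0.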