PackedBlockStorage

Shared contract for all packed/quantized block tensor storage formats.

Instead of each quantization format (Q4_K, Q8_0, Ternary, …) inventing its own loader, planner, and backend handling path, all packed formats implement this interface. Backends and planners can dispatch on encoding without knowing every possible quantization scheme.

Individual formats still expose format-specific accessors (sub-block scales, code extraction, etc.) through their own sub-interfaces.
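As a rough illustration of dispatching on encoding, the sketch below uses simplified stand-ins rather than the real types: `Encoding`, `PackedBlocks`, `RawPacked`, `selectKernel`, and the kernel names are all hypothetical and only mirror the properties documented on this page.

```kotlin
// Hypothetical stand-ins; the real TensorEncoding and PackedBlockStorage have more members.
enum class Encoding { Q4_K, Q8_0, TERNARY }

interface PackedBlocks {
    val encoding: Encoding
    val blockCount: Int
    val blockSize: Int
    val packedData: ByteArray
}

class RawPacked(
    override val encoding: Encoding,
    override val blockCount: Int,
    override val blockSize: Int,
    override val packedData: ByteArray,
) : PackedBlocks

// A backend picks a kernel from the encoding alone; it never inspects the concrete class.
fun selectKernel(storage: PackedBlocks): String = when (storage.encoding) {
    Encoding.Q8_0 -> "matmul_q8_0"
    Encoding.Q4_K -> "matmul_q4_k"
    Encoding.TERNARY -> "dequantize_then_matmul_f32" // generic fallback path
}

fun main() {
    val weights = RawPacked(Encoding.Q8_0, blockCount = 2, blockSize = 32, packedData = ByteArray(68))
    println(selectKernel(weights))
}
```

Because the `when` is exhaustive over the enum, adding a new encoding forces every dispatch site to handle it at compile time, which is one plausible reason to key dispatch on an encoding value rather than on concrete classes.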

Properties

abstract val blockCount: Int

Number of blocks in this storage.

abstract val blockSize: Int

Number of logical elements per block.

open val elementCount: Long

Total number of logical (dequantized) elements in the tensor.

abstract val encoding: TensorEncoding

The physical encoding describing the block layout.

abstract val packedData: ByteArray

Raw packed byte data containing all blocks.

Physical byte size of the packed data.
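For fixed-size blocks, the physical size follows from the block count and the bytes per block. The helper below is a hypothetical sketch, not part of this interface. The example figures (34 bytes per 32 elements, 144 bytes per 256 elements) are the GGML layouts for Q8_0 and Q4_K; whether this library uses the same layouts, and whether it pads a partial final block, are assumptions.

```kotlin
// Hypothetical helper: bytes needed to store `elementCount` elements in fixed-size blocks.
fun packedByteSize(elementCount: Long, blockSize: Int, bytesPerBlock: Int): Long {
    require(elementCount >= 0 && blockSize > 0 && bytesPerBlock > 0)
    // Ceiling division: a partial final block still occupies a whole block (assumed).
    val blockCount = (elementCount + blockSize - 1) / blockSize
    return blockCount * bytesPerBlock
}

fun main() {
    println(packedByteSize(4096, blockSize = 32, bytesPerBlock = 34))   // Q8_0-style blocks
    println(packedByteSize(4096, blockSize = 256, bytesPerBlock = 144)) // Q4_K-style super-blocks
}
```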

abstract val shape: Shape

The logical shape of the tensor, expressed in elements rather than blocks.

Functions

abstract fun dequantizeBlock(blockIdx: Int, output: FloatArray, outputOffset: Int = 0)

Dequantize the block at index blockIdx to float values, writing them into output starting at outputOffset.
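A minimal sketch of what a block dequantizer with this signature might look like. The layout is a toy invented for illustration (a little-endian float32 scale followed by blockSize signed 8-bit codes), not any real format; `ToyInt8Blocks` is a hypothetical class that does not implement the actual interface.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Toy layout (an assumption, not a real format): each block is a little-endian
// float32 scale followed by blockSize signed 8-bit codes.
class ToyInt8Blocks(val blockSize: Int, val packedData: ByteArray) {
    private val bytesPerBlock = 4 + blockSize
    val blockCount: Int get() = packedData.size / bytesPerBlock

    fun dequantizeBlock(blockIdx: Int, output: FloatArray, outputOffset: Int = 0) {
        require(blockIdx in 0 until blockCount) { "blockIdx out of range: $blockIdx" }
        require(outputOffset >= 0 && outputOffset + blockSize <= output.size) { "output too small" }
        val base = blockIdx * bytesPerBlock
        val scale = ByteBuffer.wrap(packedData, base, 4).order(ByteOrder.LITTLE_ENDIAN).getFloat()
        for (i in 0 until blockSize) {
            // Bytes are signed in Kotlin, so toInt() yields a value in -128..127.
            output[outputOffset + i] = scale * packedData[base + 4 + i].toInt()
        }
    }
}

fun main() {
    // One block: scale 0.5 followed by the codes 2, -4, 6, 0.
    val data = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN)
        .putFloat(0.5f).put(byteArrayOf(2, -4, 6, 0)).array()
    val out = FloatArray(6)
    ToyInt8Blocks(blockSize = 4, packedData = data).dequantizeBlock(0, out, outputOffset = 1)
    println(out.joinToString())
}
```

Here outputOffset lets the caller decode into a slice of a larger buffer without allocating a temporary array per block.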

Dequantize the entire tensor to a FloatArray. Default implementation calls dequantizeBlock for each block.
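The per-block default described above could be sketched as follows. `BlockDequantizer`, `dequantizeAll`, and `ConstantBlocks` are hypothetical names, and the sketch assumes every block is full, so that block i maps to the output range starting at i * blockSize; the actual default may treat a partial final block differently.

```kotlin
// Hypothetical stand-in exposing only the members the loop needs.
interface BlockDequantizer {
    val blockCount: Int
    val blockSize: Int
    fun dequantizeBlock(blockIdx: Int, output: FloatArray, outputOffset: Int = 0)

    // Sketch of a per-block default: block i fills the blockSize slots starting at i * blockSize.
    fun dequantizeAll(): FloatArray {
        val output = FloatArray(blockCount * blockSize)
        for (blockIdx in 0 until blockCount) {
            dequantizeBlock(blockIdx, output, blockIdx * blockSize)
        }
        return output
    }
}

// Trivial implementation for demonstration: every element of block i decodes to i.
class ConstantBlocks(override val blockCount: Int, override val blockSize: Int) : BlockDequantizer {
    override fun dequantizeBlock(blockIdx: Int, output: FloatArray, outputOffset: Int) {
        output.fill(blockIdx.toFloat(), outputOffset, outputOffset + blockSize)
    }
}

fun main() {
    println(ConstantBlocks(blockCount = 3, blockSize = 2).dequantizeAll().joinToString())
}
```

A format with a faster bulk path (for example, vectorized decoding across blocks) could override the whole-tensor method while keeping the per-block function for random access.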

open fun toTensorStorage(logicalType: LogicalDType = LogicalDType.FLOAT32, placement: Placement = Placement.CPU_HEAP): TensorStorage

Convert this packed storage to a TensorStorage descriptor.