ScalarQuantizer

Scalar quantization and codebook lookup for TurboQuant.

After random rotation spreads quantization error uniformly across elements, scalar quantization maps each element independently to an N-bit signed integer code. This is simpler and faster than vector quantization, and the rotation preprocessing keeps reconstruction quality good.

The quantizer uses a uniform symmetric scheme:

1. Compute the per-group scale: scale = max(abs(group)) / (2^(bits-1) - 1)
2. Quantize: code = round(value / scale), clamped to [-(2^(bits-1) - 1), 2^(bits-1) - 1]
3. Dequantize: value ≈ code * scale

Groups of 32 elements share a single FP16 scale factor.
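
The scheme above can be sketched in a few lines of Kotlin. This is an illustration only, not the library's implementation: the names sketchQuantize, sketchDequantize and SKETCH_GROUP_SIZE are hypothetical. It also assumes one code per byte (no bit packing), keeps scales as 32-bit floats rather than FP16, and substitutes a scale of 1 for all-zero groups to avoid division by zero.

```kotlin
import kotlin.math.abs
import kotlin.math.roundToInt

// Hypothetical constant mirroring GROUP_SIZE; not part of the documented API.
const val SKETCH_GROUP_SIZE = 32

// Hypothetical helper: returns (codes, scales) for a uniform symmetric quantizer.
// Assumes one signed code per byte, so bits is limited to 2..8 here.
fun sketchQuantize(input: FloatArray, bits: Int): Pair<ByteArray, FloatArray> {
    require(bits in 2..8) { "this sketch stores one code per byte" }
    val qMax = (1 shl (bits - 1)) - 1 // 2^(bits-1) - 1
    val groups = (input.size + SKETCH_GROUP_SIZE - 1) / SKETCH_GROUP_SIZE
    val codes = ByteArray(input.size)
    val scales = FloatArray(groups)
    for (g in 0 until groups) {
        val start = g * SKETCH_GROUP_SIZE
        val end = minOf(start + SKETCH_GROUP_SIZE, input.size)
        // Step 1: per-group scale from the largest magnitude in the group.
        var maxAbs = 0f
        for (i in start until end) maxAbs = maxOf(maxAbs, abs(input[i]))
        // Assumption: an all-zero group uses scale 1 to avoid dividing by zero.
        val scale = if (maxAbs == 0f) 1f else maxAbs / qMax
        scales[g] = scale
        // Step 2: round to the nearest code and clamp to the symmetric range.
        for (i in start until end) {
            codes[i] = (input[i] / scale).roundToInt().coerceIn(-qMax, qMax).toByte()
        }
    }
    return codes to scales
}

// Step 3: each value is reconstructed as code * scale of its group.
fun sketchDequantize(codes: ByteArray, scales: FloatArray): FloatArray =
    FloatArray(codes.size) { i -> codes[i].toFloat() * scales[i / SKETCH_GROUP_SIZE] }
```

With bits = 4 the code range is [-7, 7], so the element with the largest magnitude in a group maps to ±7 and, apart from floating-point rounding, every other element is reconstructed to within half a scale step.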

Properties

GROUP_SIZE

const val GROUP_SIZE: Int = 32

Number of elements per quantization group.

Functions

dequantize

fun dequantize(quantized: QuantizedVector): FloatArray

Dequantize codes back to float values using stored scales.

dequantizeInto

fun dequantizeInto(codes: ByteArray, scales: FloatArray, output: FloatArray, offset: Int = 0)

Dequantize codes directly into an existing, caller-supplied output array instead of allocating a new one.
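
Writing into a preallocated buffer can be useful when many vectors are dequantized in a loop. The sketch below shows the idea with a hypothetical stand-in, sketchDequantizeInto, not the library function. It assumes one signed code per byte, groups of 32 elements, and that offset is the first index written in output; the documentation above does not confirm any of these details.

```kotlin
// Hypothetical constant mirroring GROUP_SIZE; not part of the documented API.
const val INTO_SKETCH_GROUP_SIZE = 32

// Hypothetical stand-in for dequantizeInto.
// Assumption: `offset` is the index in `output` where the first value is written.
fun sketchDequantizeInto(
    codes: ByteArray,
    scales: FloatArray,
    output: FloatArray,
    offset: Int = 0,
) {
    require(offset >= 0 && offset + codes.size <= output.size) { "output too small" }
    for (i in codes.indices) {
        // Each element uses the scale of the group it belongs to.
        output[offset + i] = codes[i].toFloat() * scales[i / INTO_SKETCH_GROUP_SIZE]
    }
}

// Usage: fill the second half of a reusable buffer without allocating.
fun sketchReuseBuffer(): FloatArray {
    val buffer = FloatArray(8)
    val codes = byteArrayOf(3, -2, 0, 7)
    val scales = floatArrayOf(0.5f)
    sketchDequantizeInto(codes, scales, buffer, offset = 4)
    return buffer
}
```

The first four entries of the buffer are left untouched, and only the range starting at offset is overwritten.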

quantize

fun quantize(input: FloatArray, bits: Int): QuantizedVector

Quantize a float vector to integer codes with per-group scales.