ScalarQuantizer

Scalar quantization and codebook lookup for TurboQuant.

Once a random rotation has spread quantization error evenly across elements, scalar quantization maps each element independently to an N-bit integer code. This is simpler and faster than vector quantization, and the rotation preprocessing is what lets it still achieve good quality.

The quantizer uses a uniform symmetric scheme:

  • Compute per-group scale = max(abs(group)) / ((2^(bits-1)) - 1)

  • Quantize: code = round(value / scale), clamped to the range [-(2^(bits-1) - 1), 2^(bits-1) - 1]

  • Dequantize: value ≈ code * scale

Groups of 32 elements share a single FP16 scale factor.
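
The three steps above can be sketched in Kotlin as follows. This is a minimal illustration, not the library's implementation: the names quantizeSketch, dequantizeSketch, maxCode and SKETCH_GROUP_SIZE are hypothetical, codes are stored one per byte without bit packing, scales are kept as Float rather than FP16, and the handling of an all-zero group is an assumption.

```kotlin
import kotlin.math.abs
import kotlin.math.roundToInt

// Hypothetical constant for this sketch; mirrors the documented group size of 32.
const val SKETCH_GROUP_SIZE = 32

/** Largest code magnitude in a symmetric N-bit scheme, e.g. 7 for 4 bits. */
fun maxCode(bits: Int): Int = (1 shl (bits - 1)) - 1

/** Quantizes a vector; returns unpacked codes (one per byte) and one scale per group. */
fun quantizeSketch(values: FloatArray, bits: Int): Pair<ByteArray, FloatArray> {
    require(bits in 2..8) { "bits must be in 2..8 for one-code-per-byte storage" }
    val qMax = maxCode(bits)
    val groupCount = (values.size + SKETCH_GROUP_SIZE - 1) / SKETCH_GROUP_SIZE
    val codes = ByteArray(values.size)
    val scales = FloatArray(groupCount)
    for (g in 0 until groupCount) {
        val start = g * SKETCH_GROUP_SIZE
        val end = minOf(start + SKETCH_GROUP_SIZE, values.size)
        // Step 1: per-group scale = max(abs(group)) / (2^(bits-1) - 1)
        var maxAbs = 0f
        for (i in start until end) maxAbs = maxOf(maxAbs, abs(values[i]))
        val scale = maxAbs / qMax
        scales[g] = scale
        // Step 2: round and clamp each element to the symmetric code range.
        for (i in start until end) {
            // Assumption: an all-zero group (scale 0) maps to zero codes.
            val code = if (scale == 0f) 0
                       else (values[i] / scale).roundToInt().coerceIn(-qMax, qMax)
            codes[i] = code.toByte()
        }
    }
    return codes to scales
}

/** Step 3: value ≈ code * scale, using the scale of the group the element belongs to. */
fun dequantizeSketch(codes: ByteArray, scales: FloatArray): FloatArray =
    FloatArray(codes.size) { i -> codes[i].toInt() * scales[i / SKETCH_GROUP_SIZE] }
```

With bits = 4 the largest code is 7, so the element with the largest magnitude in a group maps to ±7 and every other element is reconstructed to within roughly half a scale step.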

Properties

const val GROUP_SIZE: Int = 32

Number of elements per quantization group.

Functions


Dequantize codes back to float values using stored scales.

fun dequantizeInto(codes: ByteArray, scales: FloatArray, output: FloatArray, offset: Int = 0)

Dequantize codes directly into an existing output array instead of allocating a new one.
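
A rough sketch of how such a function might behave is shown below. It is a hypothetical stand-in that only mirrors the documented parameter list: it assumes one signed code per byte (the real code packing is not described here), a group size of 32, and that offset is the index in output where writing starts. The names dequantizeIntoSketch and GROUP_SIZE_SKETCH are illustrative.

```kotlin
// Hypothetical constant for this sketch; mirrors the documented GROUP_SIZE of 32.
const val GROUP_SIZE_SKETCH = 32

/**
 * Writes code * scale for every code into [output], starting at [offset].
 * Assumes unpacked codes (one per byte) and one scale per group of 32 codes.
 */
fun dequantizeIntoSketch(
    codes: ByteArray,
    scales: FloatArray,
    output: FloatArray,
    offset: Int = 0
) {
    require(offset >= 0 && offset + codes.size <= output.size) { "output is too small" }
    require(scales.size * GROUP_SIZE_SKETCH >= codes.size) { "not enough scales" }
    for (i in codes.indices) {
        // Element i belongs to group i / 32 and uses that group's scale.
        output[offset + i] = codes[i].toInt() * scales[i / GROUP_SIZE_SKETCH]
    }
}
```

Under these assumptions a caller can allocate one buffer up front and reuse it across calls, for example passing offset = 2 to write the dequantized values starting at the third slot of the buffer.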

Link copied to clipboard

Quantize a float vector to integer codes with per-group scales.