JvmTurboQuantKernels

JVM SIMD-optimized kernels for TurboQuant operations.

Uses the Java Vector API (jdk.incubator.vector) for CPU SIMD acceleration of TurboQuant encode/decode paths. Falls back to scalar code for non-aligned tails.
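The SIMD-main-loop-plus-scalar-tail pattern described above can be sketched as follows. This is a hypothetical illustration (not the actual kernel source), shown for the abs-max reduction; note that compiling and running it requires `--add-modules jdk.incubator.vector`:

```java
import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

public class VectorAbsMax {
    static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

    // Hypothetical sketch: vectorized main loop over full lanes,
    // then a scalar loop over the non-aligned tail.
    static float absMax(float[] data) {
        int upper = SPECIES.loopBound(data.length);
        FloatVector acc = FloatVector.zero(SPECIES);
        int i = 0;
        for (; i < upper; i += SPECIES.length()) {
            acc = acc.max(FloatVector.fromArray(SPECIES, data, i).abs());
        }
        float max = acc.reduceLanes(VectorOperators.MAX);
        for (; i < data.length; i++) {        // scalar tail
            max = Math.max(max, Math.abs(data[i]));
        }
        return max;
    }

    public static void main(String[] args) {
        float[] x = new float[37];            // length not a multiple of lane count
        x[20] = -8f;
        System.out.println(absMax(x));
    }
}
```

`SPECIES.loopBound` rounds the length down to a multiple of the lane count, which is what forces the scalar tail for the remaining elements.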

These kernels optimize the hot paths:

  • Per-group abs-max computation (for scale calculation)

  • Vectorized quantization (float → code)

  • Vectorized dequantization (code → float)

  • Walsh-Hadamard transform butterfly stages

Usage: Called by the CPU backend when TurboQuant-encoded K/V is detected in the attention path.

Functions

fun absMax(data: FloatArray, offset: Int, length: Int): Float

Find the maximum absolute value in a float array segment. SIMD-accelerated with scalar tail.
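A scalar reference for the `absMax` semantics (this is the behavior the vectorized path must match, and what the scalar tail computes):

```java
public class AbsMaxReference {
    // Scalar reference: maximum of |data[i]| over [offset, offset + length).
    static float absMax(float[] data, int offset, int length) {
        float max = 0f;
        for (int i = offset; i < offset + length; i++) {
            float v = Math.abs(data[i]);
            if (v > max) max = v;
        }
        return max;
    }

    public static void main(String[] args) {
        float[] x = {0.5f, -3.25f, 2.0f, -0.75f};
        System.out.println(absMax(x, 0, x.length)); // prints 3.25
    }
}
```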

fun dequantize(codes: ByteArray, scales: FloatArray, output: FloatArray, offset: Int = 0)

SIMD-accelerated dequantization.
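A scalar sketch of the `dequantize` semantics. The group layout (uniform groups, symmetric signed-byte codes multiplied by a per-group scale) is an assumption for illustration; the documented signature does not spell it out:

```java
public class DequantizeReference {
    // Assumed scheme: output[offset + i] = codes[i] * scales[group(i)],
    // with uniform group size codes.length / scales.length.
    static void dequantize(byte[] codes, float[] scales, float[] output, int offset) {
        int group = codes.length / scales.length; // assumed uniform groups
        for (int i = 0; i < codes.length; i++) {
            output[offset + i] = codes[i] * scales[i / group];
        }
    }

    public static void main(String[] args) {
        byte[] codes = {4, -8, 2, -2};
        float[] scales = {0.5f, 0.25f};          // one scale per group of 2
        float[] out = new float[4];
        dequantize(codes, scales, out, 0);
        System.out.println(out[0] + " " + out[3]); // prints 2.0 -0.5
    }
}
```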


SIMD-accelerated scalar quantization with per-group scales.
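The exact signature is not shown on this page, so the following is a hedged scalar sketch of the usual symmetric int8 scheme with per-group scales (scale = absMax / 127, code = round(x / scale)); the real kernel vectorizes the inner loops:

```java
public class QuantizeReference {
    // Hypothetical sketch: per-group symmetric quantization to int8.
    static void quantize(float[] data, int group, byte[] codes, float[] scales) {
        for (int g = 0; g * group < data.length; g++) {
            int start = g * group;
            float max = 0f;                       // per-group abs-max
            for (int i = start; i < start + group; i++) {
                max = Math.max(max, Math.abs(data[i]));
            }
            float scale = max == 0f ? 1f : max / 127f;
            scales[g] = scale;
            for (int i = start; i < start + group; i++) {
                codes[i] = (byte) Math.round(data[i] / scale);
            }
        }
    }

    public static void main(String[] args) {
        float[] x = {1.0f, -0.5f, 0.25f, -1.0f};
        byte[] codes = new byte[4];
        float[] scales = new float[1];
        quantize(x, 4, codes, scales);
        System.out.println(codes[0] + " " + codes[3]); // prints 127 -127
    }
}
```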


SIMD-accelerated Walsh-Hadamard transform butterfly stage.
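For reference, one butterfly stage of the in-place Walsh-Hadamard transform replaces each pair (a, b) at stride `half` with (a + b, a - b); running stages for half = 1, 2, 4, …, n/2 yields the full unnormalized transform. A scalar sketch (the kernel vectorizes the inner loop):

```java
public class HadamardButterfly {
    // One WHT butterfly stage, in place, for pairs separated by `half`.
    static void butterflyStage(float[] data, int half) {
        for (int base = 0; base < data.length; base += 2 * half) {
            for (int i = base; i < base + half; i++) {
                float a = data[i];
                float b = data[i + half];
                data[i] = a + b;
                data[i + half] = a - b;
            }
        }
    }

    public static void main(String[] args) {
        float[] x = {1f, 0f, 0f, 0f};             // delta input
        for (int half = 1; half < x.length; half *= 2) {
            butterflyStage(x, half);
        }
        // WHT of a delta is constant across all outputs.
        System.out.println(x[0] + " " + x[1] + " " + x[2] + " " + x[3]);
    }
}
```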