Q8_0TensorData
Tensor data interface for Q8_0 quantized format.
Q8_0 block format (32 elements per block, 34 bytes per block):
2 bytes: f16 scale
32 bytes: int8 quantized codes
Dequantization: output[i] = code[i] * scale
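The block layout and dequantization rule above can be sketched in a few lines of Python. This is an illustration only, not this interface's implementation: the function name `dequantize_q8_0` is hypothetical, and the sketch assumes the f16 scale is stored little-endian at the start of each block, which the description above does not state.

```python
import struct

BLOCK_ELEMS = 32   # quantized elements per block
BLOCK_BYTES = 34   # 2-byte f16 scale + 32 int8 codes

def dequantize_q8_0(data):
    """Expand packed Q8_0 blocks into a flat list of Python floats.

    Assumption: little-endian f16 scale followed by 32 signed bytes.
    """
    if len(data) % BLOCK_BYTES != 0:
        raise ValueError("buffer length must be a multiple of 34 bytes")
    out = []
    for offset in range(0, len(data), BLOCK_BYTES):
        # "e" = IEEE 754 half-precision float, "32b" = 32 signed bytes
        fields = struct.unpack_from("<e32b", data, offset)
        scale, codes = fields[0], fields[1:]
        out.extend(code * scale for code in codes)
    return out

# Build one block by hand: scale 0.5, codes -16..15
block = struct.pack("<e32b", 0.5, *range(-16, 16))
values = dequantize_q8_0(block)
```

Here one 34-byte block expands to 32 values, each equal to its int8 code multiplied by the shared block scale.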
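This interface enables quantized matmul directly on the packed blocks, without first dequantizing the whole tensor. At 34 bytes per 32 elements (8.5 bits per element), the data takes roughly a quarter of the memory of f32 storage, and the scale can be applied once per block rather than once per element, which saves both memory and compute during inference.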
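As a rough sketch of what operating on the quantized data directly can look like, the dot product below accumulates code * x within each block and multiplies by the block scale once, so no dequantized copy of the row is ever built. The name `q8_0_dot` is hypothetical, the little-endian scale layout is an assumption, and the sketch makes no claim about how this interface's own kernels are written.

```python
import struct

BLOCK_ELEMS = 32   # quantized elements per block
BLOCK_BYTES = 34   # 2-byte f16 scale + 32 int8 codes

def q8_0_dot(row, x):
    """Dot product of a packed Q8_0 row with a float vector x.

    Assumption: little-endian f16 scale followed by 32 signed bytes.
    """
    n_blocks, remainder = divmod(len(row), BLOCK_BYTES)
    if remainder != 0 or len(x) != n_blocks * BLOCK_ELEMS:
        raise ValueError("row and vector sizes do not match")
    total = 0.0
    for b in range(n_blocks):
        fields = struct.unpack_from("<e32b", row, b * BLOCK_BYTES)
        scale, codes = fields[0], fields[1:]
        base = b * BLOCK_ELEMS
        # Accumulate over the raw int8 codes, then scale once per block
        acc = sum(code * x[base + i] for i, code in enumerate(codes))
        total += scale * acc
    return total

# Two blocks: scale 0.5 with all codes 2, scale 0.25 with all codes 4
row = (struct.pack("<e32b", 0.5, *([2] * 32))
       + struct.pack("<e32b", 0.25, *([4] * 32)))
x = [1.0] * 64
result = q8_0_dot(row, x)
```

A matmul is this dot product repeated for every row and column pair. Because the scale is factored out of the inner sum, the inner loop touches only the int8 codes.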