Q8_0BlockTensorData
Implementation of Q8_0TensorData backed by a packed byte array.
Memory layout per block (34 bytes, encoding 32 elements):
bytes 0..1: f16 scale (little-endian)
bytes 2..33: 32 int8 quantized codes
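The layout above can be illustrated with a small decoding sketch. It is written in Python for brevity and is not this class's implementation. The names `decode_block`, `BLOCK_SIZE`, and `BYTES_PER_BLOCK` are hypothetical, and the dequantization rule `value = scale * code` is an assumption taken from the usual Q8_0 convention; the text above does not state it.

```python
import struct

BLOCK_SIZE = 32                    # int8 codes per block (from the layout above)
BYTES_PER_BLOCK = 2 + BLOCK_SIZE   # 2-byte f16 scale + 32 codes = 34 bytes

def decode_block(packed, block_index):
    """Dequantize one block, ASSUMING value = scale * code."""
    offset = block_index * BYTES_PER_BLOCK
    # "<e"  : little-endian IEEE 754 half-precision float (bytes 0..1)
    # "32b" : 32 signed 8-bit integers (bytes 2..33)
    scale, *codes = struct.unpack_from("<e32b", packed, offset)
    return [scale * code for code in codes]

# Build one block by hand: scale 0.5, codes -16..15.
block = struct.pack("<e32b", 0.5, *range(-16, 16))
values = decode_block(block, 0)
```

Here `values` holds 32 floats, starting at -8.0 (0.5 × -16) and ending at 7.5 (0.5 × 15).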
Parameters
initialShape
the logical shape of the tensor (in elements, not blocks)
packedData
the raw packed block data, one 34-byte block after another in the layout described above
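Because `initialShape` counts elements while `packedData` is organized in blocks, it can help to see how the two sizes might relate. The sketch below (Python, with the hypothetical helper `expected_packed_size`) assumes that blocks tile the flattened element count and that a trailing partial block still occupies a full 34 bytes. Neither assumption is confirmed here, and the actual class may instead require the element count to be a multiple of 32.

```python
import math

BLOCK_SIZE = 32        # elements per block
BYTES_PER_BLOCK = 34   # 2-byte f16 scale + 32 int8 codes

def expected_packed_size(shape):
    """Bytes needed for a tensor of the given logical shape (in elements).

    ASSUMPTION: blocks tile the flattened tensor, and a partial final
    block is padded to a full block.
    """
    num_elements = math.prod(shape)
    num_blocks = -(-num_elements // BLOCK_SIZE)  # ceiling division
    return num_blocks * BYTES_PER_BLOCK

size = expected_packed_size((4, 64))  # 256 elements -> 8 blocks
```

Under these assumptions, a shape of (4, 64) needs 8 blocks, or 272 bytes.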