
QuantizedTensorFactoryJvm

object QuantizedTensorFactoryJvm

JVM extensions for QuantizedTensorFactory that produce MemorySegment-backed quantized tensor data for SIMD-friendly access patterns.

Usage:

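val arena = Arena.ofShared()
val rawTensor: Tensor<Int8, Byte> = ... // loaded with RAW_BYTES policy
// Both converters are members of the object and take the raw tensor as their first argument
val q8Data = QuantizedTensorFactoryJvm.toQ8_0MemSeg(rawTensor, logicalShape, arena)
val q4Data = QuantizedTensorFactoryJvm.toQ4_0MemSeg(rawTensor, logicalShape, arena)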
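The usage above creates a shared Arena, so memory allocated from it stays valid only until that arena is closed. The following is a minimal sketch of that lifecycle using only the JDK foreign-memory API (assuming JDK 22 or later). The helper names copyToArena and arenaLifecycleDemo and the 32-byte alignment are illustrative assumptions, not part of this library, and the sketch does not claim to show how the library allocates its segments.

```kotlin
import java.lang.foreign.Arena
import java.lang.foreign.MemorySegment
import java.lang.foreign.ValueLayout

// Hypothetical helper: copies heap bytes into off-heap memory owned by `arena`.
fun copyToArena(bytes: ByteArray, arena: Arena): MemorySegment {
    // 32-byte alignment is an illustrative choice for SIMD loads, not a library requirement.
    val segment = arena.allocate(bytes.size.toLong(), 32L)
    segment.copyFrom(MemorySegment.ofArray(bytes))
    return segment
}

// Reads one byte while the arena is open, then closes it and reports segment liveness.
fun arenaLifecycleDemo(): Pair<Byte, Boolean> {
    val arena = Arena.ofShared()
    val segment = copyToArena(byteArrayOf(1, 2, 3, 4), arena)
    val third = segment.get(ValueLayout.JAVA_BYTE, 2L)
    arena.close() // frees the off-heap memory; later reads through `segment` would throw
    return third to segment.scope().isAlive
}
```

In practice this suggests keeping the arena open for as long as any tensor data produced with it is in use, and closing it once when the model is unloaded.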

Properties

SUPPORTED_MEMSEG_TYPES

val SUPPORTED_MEMSEG_TYPES: Set<GGMLQuantizationType>

Quantization types that support MemorySegment-backed tensor data.

Functions

supportsMemSegQuantized

fun supportsMemSegQuantized(quantType: GGMLQuantizationType): Boolean

Check if a quantization type supports MemorySegment-backed tensor data.
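A check like this is typically used to choose a load path per tensor. The sketch below shows that pattern with stand-in types so that it runs on its own: QuantKind, memSegKinds, supportsMemSeg and choosePath are hypothetical names, and the contents of the set are an assumption inferred from the two converters documented on this page, not the confirmed value of SUPPORTED_MEMSEG_TYPES.

```kotlin
// Stand-in for GGMLQuantizationType; the real enum has more entries.
enum class QuantKind { F32, Q4_0, Q8_0, Q6_K }

// Assumed contents, inferred from the Q4_0 and Q8_0 converters on this page.
val memSegKinds: Set<QuantKind> = setOf(QuantKind.Q4_0, QuantKind.Q8_0)

// Mirrors the shape of supportsMemSegQuantized: a simple set membership test.
fun supportsMemSeg(kind: QuantKind): Boolean = kind in memSegKinds

// Picks a load path per tensor: MemorySegment-backed when supported, heap otherwise.
fun choosePath(kind: QuantKind): String =
    if (supportsMemSeg(kind)) "memseg" else "heap"
```

With the real API, the same guard would call QuantizedTensorFactoryJvm.supportsMemSegQuantized(quantType) before invoking one of the converters.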

toQ4_0MemSeg

fun toQ4_0MemSeg(rawTensor: Tensor<Int8, Byte>, logicalShape: Shape, arena: Arena): Q4MemorySegmentTensorData

Convert a raw byte tensor to Q4_0 MemorySegment-backed data.
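For orientation, the GGML Q4_0 format stores 32 elements per 18-byte block: a 2-byte fp16 scale followed by 16 bytes of packed 4-bit values, each decoded as (nibble - 8) * scale. The sketch below computes the expected byte size for an element count and decodes one block from a MemorySegment. It assumes the segment holds blocks in this reference layout; whether Q4MemorySegmentTensorData keeps exactly this layout is not stated here. The function and constant names are illustrative, and Float.float16ToFloat requires JDK 20 or later.

```kotlin
import java.lang.foreign.MemorySegment
import java.lang.foreign.ValueLayout
import java.nio.ByteOrder

const val Q4_0_BLOCK_ELEMS = 32 // elements per block
const val Q4_0_BLOCK_BYTES = 18 // 2-byte fp16 scale + 16 bytes of packed nibbles

// Little-endian, unaligned 16-bit read for the fp16 scale.
val Q4_SCALE_LAYOUT = ValueLayout.JAVA_SHORT_UNALIGNED.withOrder(ByteOrder.LITTLE_ENDIAN)

// Number of raw bytes needed to store `elementCount` Q4_0 elements.
fun q4_0ByteSize(elementCount: Long): Long {
    require(elementCount % Q4_0_BLOCK_ELEMS == 0L) { "element count must be a multiple of 32" }
    return elementCount / Q4_0_BLOCK_ELEMS * Q4_0_BLOCK_BYTES
}

// Decodes block `blockIndex` into 32 floats.
fun dequantizeQ4_0Block(segment: MemorySegment, blockIndex: Long): FloatArray {
    val base = blockIndex * Q4_0_BLOCK_BYTES
    val scale = java.lang.Float.float16ToFloat(segment.get(Q4_SCALE_LAYOUT, base))
    val out = FloatArray(Q4_0_BLOCK_ELEMS)
    for (j in 0 until 16) {
        val packed = segment.get(ValueLayout.JAVA_BYTE, base + 2 + j).toInt() and 0xFF
        out[j] = ((packed and 0x0F) - 8) * scale  // low nibble: elements 0..15
        out[j + 16] = ((packed ushr 4) - 8) * scale // high nibble: elements 16..31
    }
    return out
}
```

A size check such as q4_0ByteSize(logicalShapeVolume) == rawByteCount is a cheap way to catch a logical shape that does not match the raw bytes before converting.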

toQ8_0MemSeg

fun toQ8_0MemSeg(rawTensor: Tensor<Int8, Byte>, logicalShape: Shape, arena: Arena): Q8MemorySegmentTensorData

Convert a raw byte tensor to Q8_0 MemorySegment-backed data.
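The GGML Q8_0 format stores 32 elements per 34-byte block: a 2-byte fp16 scale followed by 32 signed 8-bit values, each decoded as value * scale. The sketch below computes the expected byte size and a dot product of one block against float inputs, the kind of sequential access a MemorySegment-backed layout is meant to serve. It assumes the segment holds blocks in this reference layout; whether Q8MemorySegmentTensorData keeps exactly this layout is not stated here. The names are illustrative, the loop is scalar rather than vectorized, and Float.float16ToFloat requires JDK 20 or later.

```kotlin
import java.lang.foreign.MemorySegment
import java.lang.foreign.ValueLayout
import java.nio.ByteOrder

const val Q8_0_BLOCK_ELEMS = 32 // elements per block
const val Q8_0_BLOCK_BYTES = 34 // 2-byte fp16 scale + 32 signed bytes

// Little-endian, unaligned 16-bit read for the fp16 scale.
val Q8_SCALE_LAYOUT = ValueLayout.JAVA_SHORT_UNALIGNED.withOrder(ByteOrder.LITTLE_ENDIAN)

// Number of raw bytes needed to store `elementCount` Q8_0 elements.
fun q8_0ByteSize(elementCount: Long): Long {
    require(elementCount % Q8_0_BLOCK_ELEMS == 0L) { "element count must be a multiple of 32" }
    return elementCount / Q8_0_BLOCK_ELEMS * Q8_0_BLOCK_BYTES
}

// Dot product of block `blockIndex` with x[xOffset until xOffset + 32].
fun dotQ8_0Block(segment: MemorySegment, blockIndex: Long, x: FloatArray, xOffset: Int): Float {
    val base = blockIndex * Q8_0_BLOCK_BYTES
    val scale = java.lang.Float.float16ToFloat(segment.get(Q8_SCALE_LAYOUT, base))
    var acc = 0.0f
    for (j in 0 until Q8_0_BLOCK_ELEMS) {
        val q = segment.get(ValueLayout.JAVA_BYTE, base + 2 + j) // signed 8-bit quant
        acc += q * x[xOffset + j]
    }
    return acc * scale // apply the per-block scale once, after accumulation
}
```

Applying the scale once per block, rather than per element, keeps the inner loop to a multiply-accumulate over contiguous bytes.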