StreamingGgufParametersLoader

class StreamingGgufParametersLoader(sourceProvider: () -> RandomAccessSource, onProgress: (current: Long, total: Long, message: String?) -> Unit = { _, _, _ -> }) : ParametersLoader

Streaming GGUF parameters loader — the recommended path for loading GGUF models.

Unlike GgufParametersLoader (which uses the legacy GGUFReader and rejects quantized types), this loader:

- Uses StreamingGGUFReader for memory-efficient parsing
- Supports quantized types (Q4_K, Q8_0) as packed TensorData
- Loads tensor data on demand without heap-loading the full file
- Preserves quantized layout through the loading pipeline

For F32 and I32 tensors, data is returned as standard dense arrays. For quantized tensors, data is returned as packed block storage (e.g., Q4_KBlockTensorData, Q8_0BlockTensorData).
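
To illustrate why keeping quantized tensors packed matters, the sketch below compares the dense F32 size of a tensor with its packed size. The block geometry (32 values in 34 bytes for Q8_0, 256 values in 144 bytes for Q4_K) comes from the ggml quantization formats, not from this library. QuantFormat, packedBytes, and denseF32Bytes are hypothetical names used only for illustration.

```kotlin
// Block geometry of two ggml quantization formats.
// These names are illustrative and are NOT part of the skainet API.
enum class QuantFormat(val valuesPerBlock: Int, val bytesPerBlock: Int) {
    Q8_0(32, 34),    // 2-byte f16 scale + 32 one-byte values
    Q4_K(256, 144)   // 4 bytes of f16 scale/min + 12 bytes of sub-block scales + 128 bytes of 4-bit values
}

// Size in bytes of a tensor kept in packed block form.
fun packedBytes(format: QuantFormat, elementCount: Long): Long {
    require(elementCount % format.valuesPerBlock == 0L) {
        "element count must be a multiple of ${format.valuesPerBlock}"
    }
    return elementCount / format.valuesPerBlock * format.bytesPerBlock
}

// Size in bytes of the same tensor expanded to dense 32-bit floats.
fun denseF32Bytes(elementCount: Long): Long = elementCount * 4

fun main() {
    val elements = 4096L * 4096L                       // a 4096 x 4096 weight matrix
    println(denseF32Bytes(elements))                   // 67108864
    println(packedBytes(QuantFormat.Q8_0, elements))   // 17825792
    println(packedBytes(QuantFormat.Q4_K, elements))   // 9437184
}
```

For this matrix, the packed Q4_K form is roughly one seventh of the dense F32 size, which is the saving a loader gives up if it dequantizes during loading.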

Constructors

StreamingGgufParametersLoader

constructor(sourceProvider: () -> RandomAccessSource, onProgress: (current: Long, total: Long, message: String?) -> Unit = { _, _, _ -> })

Functions

load

open suspend override fun <T : DType, V> load(ctx: ExecutionContext, dtype: KClass<T>, onTensorLoaded: (String, Tensor<T, V>) -> Unit)
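
The callback shapes in the constructor and in load can be sketched with stand-in types. This is a minimal sketch, not the library's implementation: SketchSource and SketchLoader are hypothetical, the tensor names and sizes are made up, and reporting progress per tensor and closing the source after loading are assumptions. The real load is a suspend function and must be called from a coroutine. The stand-in is a plain function so the sketch has no dependencies.

```kotlin
// Stand-in for RandomAccessSource; NOT the library's interface.
interface SketchSource {
    fun close()
}

// Stand-in that mirrors the documented constructor shape:
// a source factory plus a progress callback that defaults to a no-op.
class SketchLoader(
    private val sourceProvider: () -> SketchSource,
    private val onProgress: (current: Long, total: Long, message: String?) -> Unit = { _, _, _ -> }
) {
    // Hypothetical tensor directory: name to element count.
    private val directory = listOf("token_embd.weight" to 8, "output.weight" to 4)

    // Hands each tensor to the callback as soon as it is available,
    // instead of returning one large collection at the end.
    fun load(onTensorLoaded: (String, FloatArray) -> Unit) {
        val source = sourceProvider()   // the source is only opened when load runs
        try {
            val total = directory.size.toLong()
            directory.forEachIndexed { index, (name, elementCount) ->
                onTensorLoaded(name, FloatArray(elementCount))
                onProgress(index + 1L, total, name)   // assumption: progress counts tensors
            }
        } finally {
            source.close()              // assumption: the loader owns the source it opened
        }
    }
}

fun main() {
    val loaded = mutableMapOf<String, FloatArray>()
    val loader = SketchLoader(
        sourceProvider = { object : SketchSource { override fun close() {} } },
        onProgress = { current, total, message -> println("[$current/$total] $message") }
    )
    loader.load { name, data -> loaded[name] = data }
    println(loaded.keys)
}
```

Omitting onProgress falls back to the no-op default, matching the default value shown in the constructor signature.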