StreamingGgufParametersLoader
class StreamingGgufParametersLoader(sourceProvider: () -> RandomAccessSource, onProgress: (current: Long, total: Long, message: String?) -> Unit = { _, _, _ -> }) : ParametersLoader
Streaming GGUF parameters loader — the recommended path for loading GGUF models.
Unlike GgufParametersLoader (which uses the legacy GGUFReader and rejects quantized types), this loader:
Uses StreamingGGUFReader for memory-efficient parsing
Supports quantized types (Q4_K, Q8_0) as packed TensorData
Loads tensor data on-demand without heap-loading the full file
Preserves quantized layout through the loading pipeline
For F32 and I32 tensors, data is returned as standard dense arrays. For quantized tensors, data is returned as packed block storage (e.g., Q4_KBlockTensorData, Q8_0BlockTensorData).
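To give a feel for why keeping quantized tensors in packed block storage matters, the sketch below compares the packed size of a tensor with its dense F32 size. The block geometry (32 elements in 34 bytes for Q8_0, 256 elements in 144 bytes for Q4_K) comes from the ggml reference layouts, not from this page, and `QuantLayout`, `packedSizeBytes`, and `denseF32SizeBytes` are hypothetical helper names that are not part of this library.

```kotlin
// Hypothetical helper types for illustration only; not part of this library's API.
// Block geometry follows the ggml reference layouts (an assumption relative to this page).
enum class QuantLayout(val elementsPerBlock: Int, val bytesPerBlock: Int) {
    // 2-byte f16 scale + 32 signed 8-bit values
    Q8_0(32, 34),

    // 4 bytes of f16 super-block scales + 12 bytes of sub-block scales + 128 bytes of 4-bit values
    Q4_K(256, 144),
}

// Size in bytes of a tensor stored as packed quantized blocks.
fun packedSizeBytes(elementCount: Long, layout: QuantLayout): Long {
    require(elementCount >= 0) { "elementCount must be non-negative" }
    require(elementCount % layout.elementsPerBlock == 0L) {
        "elementCount must be a multiple of ${layout.elementsPerBlock} for ${layout.name}"
    }
    return elementCount / layout.elementsPerBlock * layout.bytesPerBlock
}

// Size in bytes of the same tensor expanded to dense 32-bit floats.
fun denseF32SizeBytes(elementCount: Long): Long = elementCount * 4
```

Under these assumptions, a 4096 x 4096 weight matrix takes 64 MiB as dense F32, 17 MiB as Q8_0, and 9 MiB as Q4_K, which is the saving that is lost if quantized data is expanded during loading.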
Constructors
constructor(sourceProvider: () -> RandomAccessSource, onProgress: (current: Long, total: Long, message: String?) -> Unit = { _, _, _ -> })
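As a minimal sketch of how the `onProgress` parameter might be used, the snippet below defines a callback with the documented `(Long, Long, String?) -> Unit` shape. The names `formatProgress` and `logProgress` are illustrative, and `openModelSource()` in the commented wiring is a hypothetical placeholder for whatever produces a `RandomAccessSource` in your application. The meaning of `current` and `total` (bytes, tensors, or something else) is not stated on this page, so the sketch only reports them as-is.

```kotlin
// Formats one progress update as a single log line.
fun formatProgress(current: Long, total: Long, message: String?): String {
    // Guard against a zero or unknown total to avoid division by zero.
    val percent = if (total > 0) current * 100 / total else 0L
    val suffix = message?.let { " $it" } ?: ""
    return "[$percent%] $current/$total$suffix"
}

// A callback matching the documented onProgress parameter type.
val logProgress: (Long, Long, String?) -> Unit = { current, total, message ->
    println(formatProgress(current, total, message))
}

// Hypothetical wiring; openModelSource() is a placeholder, not part of this page:
// val loader = StreamingGgufParametersLoader(
//     sourceProvider = { openModelSource() },
//     onProgress = logProgress,
// )
```

Because `onProgress` defaults to a no-op lambda, passing only `sourceProvider` is enough when progress reporting is not needed.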