StreamingSafeTensorsReader
Streaming SafeTensors reader that parses metadata without loading tensor data.
Memory usage is proportional to header size (typically <1 MB), not file size. Individual tensors can be loaded on demand via loadTensorData.
SafeTensors format:
8 bytes: header size (little-endian u64)
N bytes: JSON header with tensor metadata
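Remaining: raw tensor data at the byte offsets given in the header (offsets are relative to the start of this data region, not the start of the file)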
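The layout above can be illustrated with a minimal sketch that reads only the size prefix and the JSON header from a stream. This is not the reader's actual implementation: `RawHeader`, `readRawHeader`, and the `maxHeaderBytes` cap are hypothetical names chosen for illustration, and JSON parsing is left out.

```kotlin
import java.io.ByteArrayInputStream
import java.io.DataInputStream
import java.io.InputStream
import java.nio.ByteBuffer
import java.nio.ByteOrder

/** Hypothetical holder: the raw JSON header plus the file position where tensor data begins. */
data class RawHeader(val json: String, val dataStart: Long)

/** Reads the 8-byte little-endian size prefix, then exactly that many header bytes. */
fun readRawHeader(input: InputStream, maxHeaderBytes: Long = 100L * 1024 * 1024): RawHeader {
    val data = DataInputStream(input)

    // First 8 bytes: header size as a little-endian u64.
    val sizeBytes = ByteArray(8)
    data.readFully(sizeBytes)
    val headerSize = ByteBuffer.wrap(sizeBytes).order(ByteOrder.LITTLE_ENDIAN).getLong()

    // A negative value means the u64 exceeded Long.MAX_VALUE; the upper cap is an assumed sanity limit.
    require(headerSize in 0L..maxHeaderBytes) { "Implausible header size: $headerSize" }

    // Next N bytes: the JSON header. No tensor data is read.
    val headerBytes = ByteArray(headerSize.toInt())
    data.readFully(headerBytes)
    return RawHeader(headerBytes.decodeToString(), 8L + headerSize)
}

fun main() {
    // Build a tiny in-memory file: size prefix, JSON header, one F32 value.
    val json = """{"w":{"dtype":"F32","shape":[1],"data_offsets":[0,4]}}"""
    val jsonBytes = json.toByteArray(Charsets.UTF_8)
    val file = ByteBuffer.allocate(8 + jsonBytes.size + 4)
        .order(ByteOrder.LITTLE_ENDIAN)
        .putLong(jsonBytes.size.toLong())
        .put(jsonBytes)
        .putFloat(1.0f)
        .array()

    val header = readRawHeader(ByteArrayInputStream(file))
    println(header.json)
    println("Tensor data starts at byte ${header.dataStart}")
}
```

Because the function stops after the header, its memory use depends on the header size only, which is the property described above.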
Usage:
StreamingSafeTensorsReader.open(source).use { reader ->
    // Access metadata immediately - only header loaded
    println("Tensors: ${reader.tensors.size}")
    println("Metadata: ${reader.metadata}")
    // Load specific tensor when needed
    val weights = reader.loadTensorData("model.embed_tokens.weight")
}
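To show what an on-demand load involves at the byte level, here is a rough sketch that seeks to a single tensor's byte range and reads only that range. It is an illustration under assumptions, not how loadTensorData is necessarily implemented: `readTensorBytes` is a hypothetical helper, `dataStart` is assumed to be 8 plus the header size, and `begin`/`end` are assumed to come from the tensor's entry in the header.

```kotlin
import java.io.File
import java.io.RandomAccessFile

/**
 * Hypothetical helper: reads the bytes in [begin, end) of the data region,
 * where dataStart is the absolute file position at which that region begins.
 */
fun readTensorBytes(file: File, dataStart: Long, begin: Long, end: Long): ByteArray {
    require(begin in 0L..end) { "Invalid offsets: [$begin, $end)" }
    val length = end - begin
    require(length <= Int.MAX_VALUE) { "Tensor too large for a single ByteArray" }

    RandomAccessFile(file, "r").use { raf ->
        require(dataStart + end <= raf.length()) { "Offsets exceed file length" }
        val out = ByteArray(length.toInt())
        // Header offsets are relative to the data region, so add dataStart.
        raf.seek(dataStart + begin)
        raf.readFully(out)
        return out
    }
}

fun main() {
    // Fake file: 10 placeholder bytes standing in for the size prefix and header,
    // followed by 8 bytes of "tensor data" with values 1..8.
    val file = File.createTempFile("sketch", ".bin")
    file.deleteOnExit()
    file.writeBytes(ByteArray(10) + byteArrayOf(1, 2, 3, 4, 5, 6, 7, 8))

    val bytes = readTensorBytes(file, dataStart = 10, begin = 4, end = 8)
    println(bytes.toList()) // prints [5, 6, 7, 8]
}
```

Only the requested range is allocated and read, so loading one tensor does not pull the rest of the file into memory.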