StreamingSafeTensorsReader

Streaming SafeTensors reader that parses metadata without loading tensor data.

Memory usage is proportional to the header size (typically under 1 MB), not the file size. Individual tensors can be loaded on demand via loadTensorData.

SafeTensors format:

  • 8 bytes: header size (little-endian u64)

  • N bytes: JSON header with tensor metadata

  • Remaining: raw tensor data at specified offsets
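The layout above can be illustrated with a minimal sketch that reads only the length prefix and the JSON header from a stream. It is not the reader's actual implementation. The names HeaderLayout, readHeaderLayout, and buildSampleFile are hypothetical and exist only for this example, and the sample header assumes the usual SafeTensors entry fields (dtype, shape, data_offsets).

```kotlin
import java.io.ByteArrayInputStream
import java.io.DataInputStream
import java.io.InputStream
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical holder for the parsed layout; not part of the library.
data class HeaderLayout(val headerSize: Long, val headerJson: String, val dataOffset: Long)

// Reads the 8-byte little-endian length prefix and the JSON header.
// The tensor data that follows is never read.
fun readHeaderLayout(input: InputStream): HeaderLayout {
    val data = DataInputStream(input)
    val prefix = ByteArray(8)
    data.readFully(prefix)
    val headerSize = ByteBuffer.wrap(prefix).order(ByteOrder.LITTLE_ENDIAN).long
    require(headerSize in 0L..Int.MAX_VALUE.toLong()) { "Unreasonable header size: $headerSize" }
    val headerBytes = ByteArray(headerSize.toInt())
    data.readFully(headerBytes)
    // Tensor data begins right after the prefix and the header.
    return HeaderLayout(headerSize, String(headerBytes, Charsets.UTF_8), 8 + headerSize)
}

// Builds a tiny in-memory file: one F32 tensor "w" with two elements (8 bytes of data).
fun buildSampleFile(): ByteArray {
    val json = """{"w":{"dtype":"F32","shape":[2],"data_offsets":[0,8]}}"""
        .toByteArray(Charsets.UTF_8)
    val buf = ByteBuffer.allocate(8 + json.size + 8).order(ByteOrder.LITTLE_ENDIAN)
    buf.putLong(json.size.toLong()).put(json).putFloat(1.0f).putFloat(2.0f)
    return buf.array()
}

fun main() {
    val layout = readHeaderLayout(ByteArrayInputStream(buildSampleFile()))
    println("header size = ${layout.headerSize}, data begins at byte ${layout.dataOffset}")
    println(layout.headerJson)
}
```

Because only 8 + N bytes are consumed, the cost of opening a file in this sketch depends on the header length rather than on how much tensor data follows.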

Usage:

StreamingSafeTensorsReader.open(source).use { reader ->
    // Access metadata immediately - only header loaded
    println("Tensors: ${reader.tensors.size}")
    println("Metadata: ${reader.metadata}")

    // Load specific tensor when needed
    val weights = reader.loadTensorData("model.embed_tokens.weight")
}

Types

object Companion

Properties

Byte offset where tensor data begins

Header size in bytes

Custom metadata from the header's __metadata__ field

Parsed tensor metadata (without actual tensor data)

Functions

open override fun close()

Load tensor data by name.

Load tensor data for a specific tensor.

fun loadTensorData(tensor: StreamingSafeTensorInfo, buffer: ByteArray, offset: Int = 0): Int

Load tensor data into an existing buffer.
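To illustrate why a buffer-filling overload is useful, here is a minimal sketch of copying a byte range from a file into a caller-supplied array. It is not the library's implementation: readRange is a hypothetical helper, and its return value (the number of bytes copied) is only an assumption about what an Int result of this kind could represent.

```kotlin
import java.io.File
import java.io.RandomAccessFile

// Hypothetical helper, not the library's implementation: copies `length` bytes that start
// at absolute file position `position` into `buffer`, beginning at index `offset`.
// Returns the number of bytes copied (an assumption made for this sketch).
fun readRange(file: File, position: Long, length: Int, buffer: ByteArray, offset: Int = 0): Int {
    require(offset >= 0 && length >= 0 && length <= buffer.size - offset) {
        "Buffer too small: need $length bytes at offset $offset, have ${buffer.size}"
    }
    RandomAccessFile(file, "r").use { raf ->
        raf.seek(position)
        raf.readFully(buffer, offset, length) // throws EOFException if the file is too short
    }
    return length
}

fun main() {
    val file = File.createTempFile("range-demo", ".bin")
    try {
        file.writeBytes(byteArrayOf(10, 11, 12, 13, 14, 15))
        val scratch = ByteArray(8) // can be reused across reads to avoid repeated allocation
        val copied = readRange(file, position = 2, length = 3, buffer = scratch, offset = 1)
        println("copied $copied bytes: ${scratch.toList()}")
    } finally {
        file.delete()
    }
}
```

Reusing one scratch buffer across many reads keeps peak memory bounded by the largest tensor rather than by the sum of all tensors loaded.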