LlamaWeightLoader

constructor(sourceProvider: () -> Source, loadTensorData: Boolean = true, quantPolicy: QuantPolicy = QuantPolicy.RAW_BYTES)

Primary constructor for sequential Source-based loading. Loads the entire file into memory, so it is suitable for models under 2 GB.
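To illustrate the provider-lambda pattern behind sequential loading, here is a minimal sketch that does not use the library itself. It assumes `java.io.InputStream` as a stand-in for `Source`; `loadAllBytes` and `demoBytes` are hypothetical names introduced only for this example.

```kotlin
import java.io.ByteArrayInputStream
import java.io.InputStream

// Hypothetical stand-in for the sequential path: the provider is invoked
// to open a fresh stream, which is then read fully into memory and closed.
fun loadAllBytes(sourceProvider: () -> InputStream): ByteArray =
    sourceProvider().use { it.readBytes() }

// Usage: an in-memory stream stands in for a model file on disk.
val demoBytes: ByteArray = loadAllBytes { ByteArrayInputStream(byteArrayOf(1, 2, 3, 4)) }
```

Because the whole payload ends up in a single in-memory buffer, memory use grows with file size, which is consistent with the size guidance above.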


constructor(randomAccessProvider: () -> RandomAccessSource, quantPolicy: QuantPolicy = QuantPolicy.RAW_BYTES)

Secondary constructor for streaming RandomAccessSource-based loading. Parses only the metadata up front (~1 MB of memory) and loads tensors on demand. Suitable for models of any size, including those of 100+ GB.
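The on-demand idea can be sketched without the library as follows. This is an illustration under stated assumptions, not the loader's implementation: `java.io.RandomAccessFile` stands in for `RandomAccessSource`, and `TensorEntry`, `LazyTensorReader`, and `demoLazyRead` are hypothetical names. Only a small index is kept in memory, and a tensor's bytes are read when that tensor is requested.

```kotlin
import java.io.File
import java.io.RandomAccessFile

// Hypothetical index entry: where one tensor's bytes live in the file.
data class TensorEntry(val name: String, val offset: Long, val length: Int)

// Stand-in for the streaming path: holds only the index, and reads each
// tensor's byte range from the file at the moment it is asked for.
class LazyTensorReader(
    private val open: () -> RandomAccessFile,
    private val index: Map<String, TensorEntry>,
) {
    fun read(name: String): ByteArray {
        val entry = index[name] ?: error("Unknown tensor: $name")
        return open().use { file ->
            file.seek(entry.offset)
            ByteArray(entry.length).also { file.readFully(it) }
        }
    }
}

// Usage: write six bytes to a temp file, then read only bytes 2..4.
fun demoLazyRead(): ByteArray {
    val tmp = File.createTempFile("tensors", ".bin")
    try {
        tmp.writeBytes(byteArrayOf(10, 11, 12, 13, 14, 15))
        val index = mapOf("b" to TensorEntry("b", offset = 2, length = 3))
        return LazyTensorReader({ RandomAccessFile(tmp, "r") }, index).read("b")
    } finally {
        tmp.delete()
    }
}
```

In this sketch the resident memory depends on the index and the tensor currently being read, not on the total file size, which is the property that makes the streaming constructor suitable for very large models.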