Gemma3nWeightLoader
Adapter that loads Gemma 3n weights from GGUF files.
Key differences from LlamaWeightLoader:
Architecture validation: accepts architecture names with the "gemma3n", "gemma3", or "gemma" prefix
Variable intermediate (FFN) sizes per layer
Per-layer embedding support
Hybrid attention metadata extraction
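The prefix-based architecture check described above can be pictured with a minimal sketch. The names acceptedPrefixes and isAcceptedArchitecture are hypothetical and are not part of Gemma3nWeightLoader; the sketch assumes the architecture string comes from the GGUF general.architecture metadata key and that validation is a simple startsWith test, which this page does not confirm.

```kotlin
// Hypothetical sketch: not the loader's actual implementation.
// Prefixes taken from the list above; "gemma" alone already covers the other two,
// so the longer entries are kept only to mirror the documented list.
val acceptedPrefixes = listOf("gemma3n", "gemma3", "gemma")

// Returns true when the architecture string starts with any accepted prefix.
fun isAcceptedArchitecture(architecture: String): Boolean =
    acceptedPrefixes.any { prefix -> architecture.startsWith(prefix) }

fun main() {
    println(isAcceptedArchitecture("gemma3n")) // accepted
    println(isAcceptedArchitecture("llama"))   // rejected
}
```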
Constructors
Primary constructor for sequential Source-based loading. Loads the entire file into memory, so it is suitable for models under 2 GB.
Secondary constructor for streaming RandomAccessSource-based loading. Parses only the metadata (about 1 MB of memory) and loads tensors on demand. Suitable for models of any size, including 100+ GB.
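As a rough illustration of how a caller might pick between the two constructors, the sketch below chooses a loading strategy from the file size. LoadingStrategy, SEQUENTIAL_LIMIT_BYTES, and chooseStrategy are hypothetical names introduced here; the 2 GB cutoff comes from the description above, and interpreting it as 2 GiB is an assumption.

```kotlin
// Hypothetical sketch: these names do not exist in the library.
enum class LoadingStrategy { SEQUENTIAL, STREAMING }

// Assumed cutoff: 2 GiB, based on the "under 2 GB" guidance above.
const val SEQUENTIAL_LIMIT_BYTES: Long = 2L * 1024 * 1024 * 1024

// Small files can be read fully into memory; larger ones should stream tensors on demand.
fun chooseStrategy(fileSizeBytes: Long): LoadingStrategy =
    if (fileSizeBytes < SEQUENTIAL_LIMIT_BYTES) LoadingStrategy.SEQUENTIAL
    else LoadingStrategy.STREAMING

fun main() {
    println(chooseStrategy(500L * 1024 * 1024))        // SEQUENTIAL
    println(chooseStrategy(100L * 1024 * 1024 * 1024)) // STREAMING
}
```

A caller would then construct the loader with the Source-based constructor in the first case and the RandomAccessSource-based constructor in the second.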
Functions
Loads weights and invokes onTensorLoaded for each required tensor. Returns the parsed metadata.
Loads weights using the streaming API: parses only the metadata and loads tensors on demand. Requires the randomAccessProvider constructor.
Convenience helper that collects tensors into a map alongside metadata.
Loads weights into a map using the streaming API. Requires the randomAccessProvider constructor.
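The relationship between the callback-style functions and the map-collecting helpers can be sketched as follows. Everything here is a hypothetical stand-in: SketchMetadata, SketchTensor, loadWeightsSketch, and loadWeightsToMapSketch are not the library's types or signatures, and the tensor names are only illustrative GGUF-style names. The sketch shows one plausible way a map helper could be layered on a per-tensor callback.

```kotlin
// Hypothetical stand-ins; the real metadata and tensor types are not shown on this page.
data class SketchMetadata(val architecture: String, val layerCount: Int)
data class SketchTensor(val name: String, val values: List<Float>)

// Callback-style load: invokes onTensorLoaded once per tensor, then returns metadata.
fun loadWeightsSketch(onTensorLoaded: (SketchTensor) -> Unit): SketchMetadata {
    // Illustrative in-memory data standing in for tensors read from a GGUF file.
    val tensors = listOf(
        SketchTensor("token_embd.weight", listOf(0.1f, 0.2f)),
        SketchTensor("blk.0.attn_q.weight", listOf(0.3f, 0.4f)),
    )
    tensors.forEach(onTensorLoaded)
    return SketchMetadata(architecture = "gemma3n", layerCount = 1)
}

// Map-style helper: collects every tensor by name and returns it alongside the metadata.
fun loadWeightsToMapSketch(): Pair<SketchMetadata, Map<String, SketchTensor>> {
    val collected = LinkedHashMap<String, SketchTensor>()
    val metadata = loadWeightsSketch { tensor -> collected[tensor.name] = tensor }
    return metadata to collected
}

fun main() {
    val (metadata, tensors) = loadWeightsToMapSketch()
    println("${metadata.architecture}: ${tensors.keys}")
}
```

The callback form lets a caller consume each tensor as it arrives without holding all of them at once, while the map form trades that for convenience.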