Package-level declarations

Types

Parser for HuggingFace Gemma 3n config.json files.

GGUF tensor naming for Gemma 3n models.

data class Gemma3nLayerWeights<T : DType>(
    val inputLayernorm: Tensor<T, Float>,
    val wq: Tensor<T, Float>,
    val wk: Tensor<T, Float>,
    val wv: Tensor<T, Float>,
    val wo: Tensor<T, Float>,
    val postAttentionLayernorm: Tensor<T, Float>,
    val gateProj: Tensor<T, Float>,
    val upProj: Tensor<T, Float>,
    val downProj: Tensor<T, Float>,
    val perLayerInput: Tensor<T, Float>?,
    val perLayerOutput: Tensor<T, Float>?
)

Weights for a single Gemma 3n transformer layer.

data class Gemma3nModelMetadata(
    val architecture: String,
    val embeddingLength: Int,
    val perLayerEmbeddingLength: Int,
    val contextLength: Int,
    val blockCount: Int,
    val headCount: Int,
    val kvHeadCount: Int,
    val feedForwardLengths: List<Int>,
    val headDim: Int,
    val vocabSize: Int,
    val slidingWindow: Int,
    val ropeBaseLocal: Float,
    val ropeBaseGlobal: Float,
    val kvSharedLayers: Int,
    val layerPattern: List<String>
)

Metadata for Gemma 3n models extracted from GGUF files.
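Several of these fields are typically used together at inference time. The following is a minimal sketch, not the library's code: `MetadataSketch` is a hypothetical stand-in that mirrors a subset of the fields above, the string `"sliding_attention"` is an assumed value for `layerPattern` entries (borrowed from the HuggingFace layer-type naming), and the numbers in `main` are illustrative rather than real model hyperparameters.

```kotlin
// Hypothetical stand-in mirroring a subset of Gemma3nModelMetadata; not the library class.
data class MetadataSketch(
    val blockCount: Int,
    val headCount: Int,
    val kvHeadCount: Int,
    val ropeBaseLocal: Float,
    val ropeBaseGlobal: Float,
    val layerPattern: List<String>,
)

// Number of query heads that share each key/value head (grouped-query attention).
fun queryHeadsPerKvHead(m: MetadataSketch): Int {
    require(m.kvHeadCount > 0 && m.headCount % m.kvHeadCount == 0) {
        "headCount must be a positive multiple of kvHeadCount"
    }
    return m.headCount / m.kvHeadCount
}

// Assumption: "sliding_attention" marks local layers; any other value is treated as global.
// The modulo lets layerPattern be either one entry per layer or a shorter repeating pattern.
fun ropeBaseForLayer(m: MetadataSketch, layer: Int): Float {
    require(m.layerPattern.isNotEmpty()) { "layerPattern must not be empty" }
    val kind = m.layerPattern[layer % m.layerPattern.size]
    return if (kind == "sliding_attention") m.ropeBaseLocal else m.ropeBaseGlobal
}

fun main() {
    val m = MetadataSketch(
        blockCount = 4,
        headCount = 8,
        kvHeadCount = 2,
        ropeBaseLocal = 10_000f,
        ropeBaseGlobal = 1_000_000f,
        layerPattern = listOf("sliding_attention", "full_attention"),
    )
    println(queryHeadsPerKvHead(m))
    println((0 until m.blockCount).map { ropeBaseForLayer(m, it) })
}
```

Whether the real loader stores one `layerPattern` entry per block or a repeating pattern is not stated here, so the sketch accepts both.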

data class Gemma3nRuntimeWeights<T : DType>(
    val metadata: Gemma3nModelMetadata,
    val tokenEmbedding: Tensor<T, Float>,
    val ropeFreqReal: Tensor<T, Float>?,
    val ropeFreqImag: Tensor<T, Float>?,
    val layers: List<Gemma3nLayerWeights<T>>,
    val finalNorm: Tensor<T, Float>,
    val lmHead: Tensor<T, Float>,
    val quantTypes: Map<String, GGMLQuantizationType> = emptyMap()
)

Complete runtime weights for a Gemma 3n model.

Loads Gemma 3n weights from HuggingFace SafeTensors format.

Tensor name constants for Gemma 3n GGUF format.
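As a rough illustration of what such constants can look like, the sketch below builds names in the `blk.<layer>.<role>.weight` style commonly used in llama.cpp-style GGUF files. `GgufNamesSketch` and its members are hypothetical; the constants actually defined in this package may use different names or cover more tensors.

```kotlin
// Hypothetical helper, not the library's constants object.
// Names follow the common llama.cpp-style GGUF convention.
object GgufNamesSketch {
    const val TOKEN_EMBEDDING = "token_embd.weight"
    const val OUTPUT_NORM = "output_norm.weight"

    // Per-layer tensors are prefixed with "blk.<index>.".
    fun attnNorm(layer: Int) = "blk.$layer.attn_norm.weight"
    fun attnQ(layer: Int) = "blk.$layer.attn_q.weight"
    fun attnK(layer: Int) = "blk.$layer.attn_k.weight"
    fun attnV(layer: Int) = "blk.$layer.attn_v.weight"
    fun ffnDown(layer: Int) = "blk.$layer.ffn_down.weight"
}

fun main() {
    println(GgufNamesSketch.TOKEN_EMBEDDING)
    println(GgufNamesSketch.attnQ(3))
}
```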

Adapter that loads Gemma 3n weights from GGUF files.

Maps raw weights to the runtime structure, validating tensor shapes along the way.

data class Gemma3nWeights<T : DType, V>(val metadata: Gemma3nModelMetadata, val tensors: Map<String, Tensor<T, V>>, val quantTypes: Map<String, GGMLQuantizationType> = emptyMap())

Raw weights loaded from GGUF, keyed by tensor name, before mapping to the runtime structure.
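Moving from a name-keyed map of raw tensors to typed runtime fields usually means looking each tensor up by name and checking its shape. The sketch below shows one way this could work; it is not the library's mapper. `TensorSketch`, `requireTensor`, and `optionalTensor` are hypothetical, and the `[out, in]` shape ordering and tensor names are assumptions.

```kotlin
// Hypothetical stand-in for a tensor: only the shape matters for validation.
data class TensorSketch(val shape: List<Int>)

// Looks up a required tensor by name and checks its shape, failing with a descriptive message.
fun requireTensor(
    tensors: Map<String, TensorSketch>,
    name: String,
    expected: List<Int>,
): TensorSketch {
    val t = tensors[name] ?: throw IllegalArgumentException("Missing tensor: $name")
    require(t.shape == expected) { "Tensor $name has shape ${t.shape}, expected $expected" }
    return t
}

// Optional tensors (for example nullable fields such as perLayerInput) are
// validated only when present; absence is not an error.
fun optionalTensor(
    tensors: Map<String, TensorSketch>,
    name: String,
    expected: List<Int>,
): TensorSketch? = if (name in tensors) requireTensor(tensors, name, expected) else null

fun main() {
    val embeddingLength = 8
    val headCount = 2
    val headDim = 4
    val raw = mapOf(
        "blk.0.attn_norm.weight" to TensorSketch(listOf(embeddingLength)),
        "blk.0.attn_q.weight" to TensorSketch(listOf(headCount * headDim, embeddingLength)),
    )

    val norm = requireTensor(raw, "blk.0.attn_norm.weight", listOf(embeddingLength))
    val wq = requireTensor(raw, "blk.0.attn_q.weight", listOf(headCount * headDim, embeddingLength))
    val perLayerInput = optionalTensor(raw, "blk.0.hypothetical_optional.weight", listOf(embeddingLength))
    println("norm=${norm.shape} wq=${wq.shape} perLayerInput=$perLayerInput")
}
```

Failing early with the tensor name and both shapes in the message makes a mismatched or truncated checkpoint much easier to diagnose than a later arithmetic error.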

Type of attention layer in Gemma 3n.

Functions

suspend fun loadGemma3nRuntimeWeights(ctx: ExecutionContext, sourceProvider: () -> Source, quantPolicy: QuantPolicy = QuantPolicy.RAW_BYTES, allowQuantized: Boolean = false): Gemma3nRuntimeWeights<FP32>

Backward-compatible overload defaulting to FP32.

suspend fun <T : DType> loadGemma3nRuntimeWeights(ctx: ExecutionContext, sourceProvider: () -> Source, dtype: KClass<T>, quantPolicy: QuantPolicy = QuantPolicy.RAW_BYTES, allowQuantized: Boolean = false): Gemma3nRuntimeWeights<T>

Convenience loader: reads weights from a GGUF source and maps them into the runtime structure.
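The loaders on this page come in pairs: a generic overload that takes `dtype: KClass<T>` and a backward-compatible overload fixed to `FP32`. The sketch below illustrates that pairing pattern only. `DTypeSketch`, `FP32Sketch`, `FP16Sketch`, `LoadedSketch`, and `loadSketch` are hypothetical stand-ins, the functions are not `suspend`, and nothing here performs real I/O.

```kotlin
import kotlin.reflect.KClass

// Hypothetical stand-ins for the library's DType hierarchy and result type.
interface DTypeSketch
object FP32Sketch : DTypeSketch
object FP16Sketch : DTypeSketch

data class LoadedSketch<T : DTypeSketch>(val dtype: KClass<T>, val source: String)

// Generic loader: the caller names the element type explicitly.
fun <T : DTypeSketch> loadSketch(source: String, dtype: KClass<T>): LoadedSketch<T> =
    LoadedSketch(dtype, source)

// Backward-compatible overload: call sites that omit dtype keep compiling
// and receive an FP32-typed result by delegating to the generic loader.
fun loadSketch(source: String): LoadedSketch<FP32Sketch> =
    loadSketch(source, FP32Sketch::class)

fun main() {
    val legacy = loadSketch("model.gguf")
    val half = loadSketch("model.gguf", FP16Sketch::class)
    println(legacy.dtype == FP32Sketch::class)
    println(half.dtype == FP16Sketch::class)
}
```

Keeping the non-generic overload as a thin delegate means both entry points share one code path, so behaviour cannot drift between them.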

suspend fun loadGemma3nRuntimeWeightsDequantized(ctx: ExecutionContext, sourceProvider: () -> Source): Gemma3nRuntimeWeights<FP32>

Backward-compatible overload defaulting to FP32.

suspend fun <T : DType> loadGemma3nRuntimeWeightsDequantized(ctx: ExecutionContext, sourceProvider: () -> Source, dtype: KClass<T>): Gemma3nRuntimeWeights<T>

Load Gemma 3n runtime weights with dequantization.

Backward-compatible overload defaulting to FP32.

Load Gemma 3n runtime weights using the streaming API, with dequantization.

Backward-compatible overload defaulting to FP32.

Load Gemma 3n runtime weights from SafeTensors format.

suspend fun loadGemma3nRuntimeWeightsStreaming(ctx: ExecutionContext, randomAccessProvider: () -> RandomAccessSource, quantPolicy: QuantPolicy = QuantPolicy.RAW_BYTES, allowQuantized: Boolean = false): Gemma3nRuntimeWeights<FP32>

Backward-compatible overload defaulting to FP32.

suspend fun <T : DType> loadGemma3nRuntimeWeightsStreaming(ctx: ExecutionContext, randomAccessProvider: () -> RandomAccessSource, dtype: KClass<T>, quantPolicy: QuantPolicy = QuantPolicy.RAW_BYTES, allowQuantized: Boolean = false): Gemma3nRuntimeWeights<T>

Load Gemma 3n runtime weights using the streaming API.