loadLlamaRuntimeWeights

suspend fun <T : DType> loadLlamaRuntimeWeights(ctx: ExecutionContext, sourceProvider: () -> Source, dtype: KClass<T>, quantPolicy: QuantPolicy = QuantPolicy.RAW_BYTES, allowQuantized: Boolean = false): LlamaRuntimeWeights<T>

Convenience loader: reads weights from a GGUF source and maps them into the runtime weight structure.

suspend fun loadLlamaRuntimeWeights(ctx: ExecutionContext, sourceProvider: () -> Source, quantPolicy: QuantPolicy = QuantPolicy.RAW_BYTES, allowQuantized: Boolean = false): LlamaRuntimeWeights<FP32>

Backward-compatible overload that omits the dtype parameter and defaults to FP32.
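
The relationship between the two overloads can be sketched as below. This is only an illustration under stated assumptions: every type declared here (DType, FP32, FP16, ExecutionContext, Source, QuantPolicy, LlamaRuntimeWeights) is a hypothetical stand-in so the snippet compiles on its own, the loader bodies are stubs that do no GGUF parsing, and the file name "model.gguf" is made up. Only the parameter lists follow the signatures documented above; the forwarding in the second overload is a guess at how such an overload is typically written, not the library's confirmed implementation.

```kotlin
import kotlin.reflect.KClass

// Hypothetical stand-ins so the sketch is self-contained; the real library types will differ.
interface DType
object FP32 : DType
object FP16 : DType
class ExecutionContext
class Source(val name: String)
enum class QuantPolicy { RAW_BYTES }
class LlamaRuntimeWeights<T : DType>(val dtype: KClass<T>, val sourceName: String)

// Stub with the documented parameter list; a real loader would read GGUF tensors here.
suspend fun <T : DType> loadLlamaRuntimeWeights(
    ctx: ExecutionContext,
    sourceProvider: () -> Source,
    dtype: KClass<T>,
    quantPolicy: QuantPolicy = QuantPolicy.RAW_BYTES,
    allowQuantized: Boolean = false,
): LlamaRuntimeWeights<T> {
    val source = sourceProvider() // the provider is invoked to obtain the source
    return LlamaRuntimeWeights(dtype, source.name)
}

// Overload without dtype: assumed here to forward to the generic loader with FP32.
suspend fun loadLlamaRuntimeWeights(
    ctx: ExecutionContext,
    sourceProvider: () -> Source,
    quantPolicy: QuantPolicy = QuantPolicy.RAW_BYTES,
    allowQuantized: Boolean = false,
): LlamaRuntimeWeights<FP32> =
    loadLlamaRuntimeWeights(ctx, sourceProvider, FP32::class, quantPolicy, allowQuantized)

suspend fun main() {
    val ctx = ExecutionContext()
    val provider = { Source("model.gguf") } // hypothetical file name

    // No dtype argument: resolves to the FP32 overload.
    val fp32Weights = loadLlamaRuntimeWeights(ctx, provider)

    // Explicit dtype: resolves to the generic overload.
    val fp16Weights = loadLlamaRuntimeWeights(ctx, provider, FP16::class)

    println("${fp32Weights.dtype.simpleName} ${fp16Weights.dtype.simpleName}")
}
```

Because allowQuantized: Boolean is the last parameter, sourceProvider cannot be passed as a trailing lambda outside the parentheses; pass it positionally or by name (sourceProvider = { ... }).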