loadLlamaRuntimeWeights
suspend fun <T : DType> loadLlamaRuntimeWeights(ctx: ExecutionContext, sourceProvider: () -> Source, dtype: KClass<T>, quantPolicy: QuantPolicy = QuantPolicy.RAW_BYTES, allowQuantized: Boolean = false): LlamaRuntimeWeights<T>
Convenience loader: reads weights from a GGUF source and maps them into the runtime weight structure.
suspend fun loadLlamaRuntimeWeights(ctx: ExecutionContext, sourceProvider: () -> Source, quantPolicy: QuantPolicy = QuantPolicy.RAW_BYTES, allowQuantized: Boolean = false): LlamaRuntimeWeights<FP32>
Backward-compatible overload defaulting to FP32.
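The sketch below illustrates how the two overloads might be called and how they could relate to each other. Every type in it (DType, FP32, QuantPolicy, ExecutionContext, Source, LlamaRuntimeWeights) is a hypothetical stand-in declared only so the snippet compiles on its own; the real library definitions are not shown in this section. The function bodies are placeholders, and the FP32 overload forwarding to the generic one is an assumption, not a confirmed implementation detail.

```kotlin
import kotlin.reflect.KClass

// Hypothetical stand-ins for the library types (assumptions, not the real definitions).
interface DType
object FP32 : DType
enum class QuantPolicy { RAW_BYTES }
class ExecutionContext
class Source(val bytes: ByteArray)
data class LlamaRuntimeWeights<T : DType>(val dtype: KClass<T>, val byteCount: Int)

// Stub with the documented signature. The body is a placeholder:
// the real loader parses GGUF data, this one only records the byte count.
suspend fun <T : DType> loadLlamaRuntimeWeights(
    ctx: ExecutionContext,
    sourceProvider: () -> Source,
    dtype: KClass<T>,
    quantPolicy: QuantPolicy = QuantPolicy.RAW_BYTES,
    allowQuantized: Boolean = false,
): LlamaRuntimeWeights<T> {
    val source = sourceProvider() // the provider is invoked to obtain the source
    return LlamaRuntimeWeights(dtype, source.bytes.size)
}

// Assumed relationship: the FP32 overload forwards to the generic one.
suspend fun loadLlamaRuntimeWeights(
    ctx: ExecutionContext,
    sourceProvider: () -> Source,
    quantPolicy: QuantPolicy = QuantPolicy.RAW_BYTES,
    allowQuantized: Boolean = false,
): LlamaRuntimeWeights<FP32> =
    loadLlamaRuntimeWeights(ctx, sourceProvider, FP32::class, quantPolicy, allowQuantized)

suspend fun main() {
    val ctx = ExecutionContext()
    val provider = { Source(ByteArray(16)) }

    // Generic overload: the element type is passed explicitly.
    val explicit = loadLlamaRuntimeWeights(ctx, provider, FP32::class)

    // Backward-compatible overload: no dtype argument, FP32 is implied.
    val implicit = loadLlamaRuntimeWeights(ctx, provider)

    check(explicit == implicit) // both calls yield FP32 weights in this sketch
    println(implicit)
}
```

Because sourceProvider is a function rather than a Source instance, the caller decides how the source is opened and the loader decides when to open it.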