loadLlamaRuntimeWeightsDequantized
suspend fun <T : DType> loadLlamaRuntimeWeightsDequantized(ctx: ExecutionContext, sourceProvider: () -> Source, dtype: KClass<T>): LlamaRuntimeWeights<T>
Convenience helper that forces dequantization to FP32 (where supported) and fails if any unsupported quantization types remain.
suspend fun loadLlamaRuntimeWeightsDequantized(ctx: ExecutionContext, sourceProvider: () -> Source): LlamaRuntimeWeights<FP32>
Backward-compatible overload defaulting to FP32.
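The relationship between the two overloads can be illustrated with a minimal, self-contained sketch. Everything here apart from the two function signatures is an assumption: DType, FP32, ExecutionContext, Source and LlamaRuntimeWeights are hypothetical stand-ins so the snippet compiles on its own, and the function bodies are placeholders, not the library's implementation. The sketch only shows how a two-argument overload could delegate to the generic one with FP32::class.

```kotlin
import kotlin.reflect.KClass

// Hypothetical stand-ins for the library's types (assumptions, not the real definitions).
sealed interface DType
object FP32 : DType
class ExecutionContext
class Source
class LlamaRuntimeWeights<T : DType>(val dtype: KClass<T>)

// Placeholder with the documented generic signature. A real implementation
// would read tensors from the source and dequantize them to the target dtype.
suspend fun <T : DType> loadLlamaRuntimeWeightsDequantized(
    ctx: ExecutionContext,
    sourceProvider: () -> Source,
    dtype: KClass<T>,
): LlamaRuntimeWeights<T> {
    sourceProvider() // obtain the source; actual loading is omitted in this sketch
    return LlamaRuntimeWeights(dtype)
}

// One plausible shape for the FP32 overload: delegate with FP32::class.
suspend fun loadLlamaRuntimeWeightsDequantized(
    ctx: ExecutionContext,
    sourceProvider: () -> Source,
): LlamaRuntimeWeights<FP32> =
    loadLlamaRuntimeWeightsDequantized(ctx, sourceProvider, FP32::class)

suspend fun main() {
    val ctx = ExecutionContext()
    // Both calls request FP32 weights; the second spells the dtype out explicitly.
    val implicit = loadLlamaRuntimeWeightsDequantized(ctx, { Source() })
    val explicit = loadLlamaRuntimeWeightsDequantized(ctx, { Source() }, FP32::class)
    check(implicit.dtype == explicit.dtype)
}
```

Both functions are suspend functions, so they must be called from a coroutine or another suspend function. The source is passed as a provider lambda rather than as an opened Source, which presumably lets the loader decide when to open it.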