loadLlamaRuntimeWeightsDequantized

suspend fun <T : DType> loadLlamaRuntimeWeightsDequantized(ctx: ExecutionContext, sourceProvider: () -> Source, dtype: KClass<T>): LlamaRuntimeWeights<T>

Convenience helper that forces dequantization to FP32 (where supported) and fails if any unsupported quantization types remain.


suspend fun loadLlamaRuntimeWeightsDequantized(ctx: ExecutionContext, sourceProvider: () -> Source): LlamaRuntimeWeights<FP32>

Backward-compatible overload that omits the dtype parameter and defaults to FP32.
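
A minimal sketch of how the two overloads might be called. So that it compiles on its own, every type here (`DType`, `FP32`, `ExecutionContext`, `Source`, `LlamaRuntimeWeights`) and both function bodies are hypothetical stand-ins that only mirror the documented signatures. The real implementations read and dequantize weights, and how a real `Source` or `ExecutionContext` is constructed is not shown on this page.

```kotlin
import kotlin.reflect.KClass

// Hypothetical stand-ins: names match the documented signatures,
// but the contents are placeholders, not the library's definitions.
interface DType
object FP32 : DType
class ExecutionContext
class Source
class LlamaRuntimeWeights<T : DType>(val dtype: KClass<T>)

// Stub with the documented generic signature. The real function loads
// weights from the source; this one only records the requested dtype.
suspend fun <T : DType> loadLlamaRuntimeWeightsDequantized(
    ctx: ExecutionContext,
    sourceProvider: () -> Source,
    dtype: KClass<T>,
): LlamaRuntimeWeights<T> = LlamaRuntimeWeights(dtype)

// Stub for the two-argument overload, delegating with FP32 as the default.
suspend fun loadLlamaRuntimeWeightsDequantized(
    ctx: ExecutionContext,
    sourceProvider: () -> Source,
): LlamaRuntimeWeights<FP32> =
    loadLlamaRuntimeWeightsDequantized(ctx, sourceProvider, FP32::class)

suspend fun main() {
    val ctx = ExecutionContext()
    // A lambda is passed rather than a Source instance, matching
    // the `() -> Source` parameter type.
    val provider = { Source() }

    // Explicit dtype via the generic overload.
    val explicit = loadLlamaRuntimeWeightsDequantized(ctx, provider, FP32::class)

    // Two-argument overload; the result type is LlamaRuntimeWeights<FP32>.
    val defaulted = loadLlamaRuntimeWeightsDequantized(ctx, provider)

    println(explicit.dtype == defaulted.dtype)
}
```

Both functions are `suspend`, so they must be called from a coroutine or another suspending function. The sketch uses `suspend fun main()` to avoid depending on a coroutine library.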