matmulQ4_K
fun matmulQ4_K(input: Tensor<FP32, Float>, weights: Q4_KTensorData, ctx: ExecutionContext): Tensor<FP32, Float>
Matrix multiplication with Q4_K quantized weights.
Return
FP32 output tensor of shape [batch, outputDim] or [outputDim]
Parameters
input
FP32 input tensor of shape [batch, inputDim] or [inputDim]
weights
Q4_K quantized weight data of shape [inputDim, outputDim]
ctx
ExecutionContext for creating the output tensor
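Example
The shape contract above can be illustrated with a plain FP32 reference computation. This is only a sketch: referenceMatmul is a hypothetical helper, not part of this API, and it assumes the weights have already been dequantized to FP32 and stored row-major as [inputDim, outputDim]. It does not model the actual Q4_K block layout or use Tensor, Q4_KTensorData, or ExecutionContext.

```kotlin
// Hypothetical reference helper, not part of the library API.
// Assumption: weights are already dequantized to FP32 and laid out row-major
// as [inputDim, outputDim]; the packed Q4_K format is not modeled here.
fun referenceMatmul(
    input: FloatArray,    // [batch, inputDim], row-major
    weights: FloatArray,  // [inputDim, outputDim], row-major
    batch: Int,
    inputDim: Int,
    outputDim: Int,
): FloatArray {
    require(input.size == batch * inputDim) { "input must hold batch * inputDim values" }
    require(weights.size == inputDim * outputDim) { "weights must hold inputDim * outputDim values" }

    // Output is [batch, outputDim], row-major, zero-initialized.
    val output = FloatArray(batch * outputDim)
    for (b in 0 until batch) {
        for (i in 0 until inputDim) {
            val x = input[b * inputDim + i]
            // Accumulate the contribution of input element i to every output column.
            for (o in 0 until outputDim) {
                output[b * outputDim + o] += x * weights[i * outputDim + o]
            }
        }
    }
    return output
}
```

Under these assumptions, an unbatched input of shape [inputDim] can be treated as batch = 1, with the result read as [outputDim].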