matmulQ4_K

Matrix multiplication with Q4_K quantized weights.

Returns an FP32 output tensor of shape [batch, outputDim] or [outputDim].

input

FP32 input tensor of shape [batch, inputDim] or [inputDim]

weights

Q4_K quantized weight data of shape [inputDim, outputDim]

ctx

ExecutionContext for creating the output tensor
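
The shape contract above can be illustrated with a plain FP32 reference. This is only a sketch, not the library's implementation: `matmulReference` is a hypothetical helper, the weights are assumed to be already dequantized to FP32 and stored row-major as [inputDim, outputDim], and a 1-D input is assumed to produce a 1-D output. Q4_K decoding and the ExecutionContext are left out entirely.

```typescript
// Hypothetical FP32 reference for the shape semantics only.
// Assumption: `weights` is already dequantized and row-major [inputDim, outputDim].
function matmulReference(
  input: Float32Array,
  inputShape: number[], // [batch, inputDim] or [inputDim]
  weights: Float32Array,
  inputDim: number,
  outputDim: number,
): { data: Float32Array; shape: number[] } {
  const is1D = inputShape.length === 1;
  const batch = is1D ? 1 : inputShape[0];
  const lastDim = inputShape[inputShape.length - 1];

  if (lastDim !== inputDim) {
    throw new Error(`input last dimension ${lastDim} does not match inputDim ${inputDim}`);
  }
  if (input.length !== batch * inputDim) {
    throw new Error("input data length does not match its shape");
  }
  if (weights.length !== inputDim * outputDim) {
    throw new Error("weights length does not match [inputDim, outputDim]");
  }

  const out = new Float32Array(batch * outputDim); // zero-initialized
  for (let b = 0; b < batch; b++) {
    for (let k = 0; k < inputDim; k++) {
      const x = input[b * inputDim + k];
      // Accumulate x * (row k of the weight matrix) into output row b.
      for (let o = 0; o < outputDim; o++) {
        out[b * outputDim + o] += x * weights[k * outputDim + o];
      }
    }
  }

  // Assumed convention: a 1-D input yields a 1-D output.
  return { data: out, shape: is1D ? [outputDim] : [batch, outputDim] };
}

// Usage: inputDim = 2, outputDim = 3, weights = [[1, 2, 3], [4, 5, 6]].
const w = new Float32Array([1, 2, 3, 4, 5, 6]);
const single = matmulReference(new Float32Array([1, 1]), [2], w, 2, 3);
const batched = matmulReference(new Float32Array([1, 0, 0, 1]), [2, 2], w, 2, 3);
console.log(single.shape, Array.from(single.data));
console.log(batched.shape, Array.from(batched.data));
```

A reference like this can be handy for checking the quantized path's output within a tolerance, since Q4_K is lossy and exact equality with an FP32 result should not be expected.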