matmulQ8_0
fun matmulQ8_0(input: Tensor&lt;FP32, Float&gt;, weights: Q8_0TensorData, ctx: ExecutionContext): Tensor&lt;FP32, Float&gt;
Matrix multiplication with Q8_0 quantized weights.
Return
FP32 output tensor of shape [batch, outputDim] or [outputDim]
Parameters
input
FP32 input tensor of shape [batch, inputDim] or [inputDim]
weights
Q8_0 quantized weight data of shape [inputDim, outputDim]
ctx
ExecutionContext for creating the output tensor
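To illustrate what a Q8_0 matmul involves, here is a minimal, self-contained sketch assuming the common ggml-style Q8_0 layout: weights are stored in blocks of 32 int8 values sharing one FP32 scale, and each value dequantizes as scale * q. The names below (Q8Block, quantizeBlock, dotQ8) are hypothetical and not part of this library's API; the library's Tensor, Q8_0TensorData, and ExecutionContext types are not reproduced here.

```kotlin
import kotlin.math.abs
import kotlin.math.round

// Block size assumed for the Q8_0 format: 32 int8 quants per FP32 scale.
const val QK8_0 = 32

// One quantized block: a single FP32 scale plus 32 int8 quants.
// Illustrative type, not the library's actual storage layout.
class Q8Block(val scale: Float, val quants: ByteArray)

// Quantize 32 floats starting at `offset`: scale = max|x| / 127, q = round(x / scale).
fun quantizeBlock(x: FloatArray, offset: Int): Q8Block {
    var amax = 0f
    for (i in 0 until QK8_0) amax = maxOf(amax, abs(x[offset + i]))
    val scale = if (amax == 0f) 1f else amax / 127f
    val quants = ByteArray(QK8_0) { i ->
        round(x[offset + i] / scale).toInt().coerceIn(-127, 127).toByte()
    }
    return Q8Block(scale, quants)
}

// Dot product of an FP32 input row with one quantized weight column:
// accumulate q * x within each block, then apply the block's scale once.
fun dotQ8(x: FloatArray, blocks: List<Q8Block>): Float {
    var sum = 0f
    for ((b, blk) in blocks.withIndex()) {
        var acc = 0f
        for (i in 0 until QK8_0) acc += blk.quants[i] * x[b * QK8_0 + i]
        sum += blk.scale * acc
    }
    return sum
}

fun main() {
    val inputDim = 64
    val x = FloatArray(inputDim) { (it - 31) / 17f }   // FP32 input row
    val w = FloatArray(inputDim) { (it % 7 - 3) / 5f } // one weight column

    // Quantize the weight column block by block, as Q8_0TensorData would store it.
    val blocks = (0 until inputDim step QK8_0).map { quantizeBlock(w, it) }

    val exact = x.indices.sumOf { (x[it] * w[it]).toDouble() }.toFloat()
    val approx = dotQ8(x, blocks)
    println("exact=$exact quantized=$approx")
}
```

A full matmul repeats this dot product for every output column (and every batch row); deferring the scale multiplication to once per block is what makes the quantized inner loop cheap.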