Mathematical Theory Reference
This section contains mathematical definitions and theoretical foundations for SKaiNET operators.
Operator Theory
Matrix Multiplication Theory
The theory and intuition below are the single source of truth for matmul
and are also embedded into the generated operator reference at
TensorOps.matmul.
Mathematical Definition
Given two matrices \(A \in \mathbb{R}^{m \times k}\) and \(B \in \mathbb{R}^{k \times n}\), the matrix product \(C = AB\) is defined as:

\[
C_{ij} = \sum_{l=1}^{k} A_{il} B_{lj}
\]

Where \(C \in \mathbb{R}^{m \times n}\), \(i\) ranges over rows \(1..m\), \(j\) over columns \(1..n\), and \(l\) is the summation index over the shared dimension \(k\).
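The definition translates directly into a triple loop over \(i\), \(j\), and \(l\). The sketch below is plain Python for illustration only, not the SKaiNET implementation:

```python
def matmul(A, B):
    """Naive matrix product C = AB, following C_ij = sum_l A_il * B_lj."""
    m, k = len(A), len(A[0])
    k2, n = len(B), len(B[0])
    assert k == k2, "inner dimensions must match"
    C = [[0.0] * n for _ in range(m)]
    for i in range(m):          # rows of A
        for j in range(n):      # columns of B
            for l in range(k):  # shared dimension
                C[i][j] += A[i][l] * B[l][j]
    return C

matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])  # → [[19.0, 22.0], [43.0, 50.0]]
```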
Intuition and Properties
Matrix multiplication composes two linear transformations: each output element is the dot product of a row of \(A\) with a column of \(B\). It is the core primitive behind fully-connected layers, attention projections, and any linear map in a neural network’s forward pass.
Key properties:
- Associativity: \((AB)C = A(BC)\)
- Distributivity: \(A(B + C) = AB + AC\)
- Non-commutativity: in general \(AB \neq BA\)
- Identity: \(AI = IA = A\)
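Non-commutativity is easy to verify concretely: multiplying by a permutation matrix on the left versus the right gives different results. A minimal check in plain Python (illustrative only, not the SKaiNET API):

```python
def matmul(A, B):
    """Naive product of two list-of-lists matrices."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [3, 4]]
P = [[0, 1], [1, 0]]  # permutation matrix

print(matmul(A, P))  # [[2, 1], [4, 3]] -- AP swaps the columns of A
print(matmul(P, A))  # [[3, 4], [1, 2]] -- PA swaps the rows of A
```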
Complexity:
- Standard algorithm: \(O(mnk)\)
- Strassen’s algorithm: \(O(n^{2.807})\) for square matrices
- Current theoretical best: \(O(n^{2.373})\)
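The sub-cubic bounds come from algebraic identities that trade multiplications for additions. At the \(2 \times 2\) base case, Strassen's scheme uses seven multiplications instead of eight; applied recursively to block matrices, this yields the \(O(n^{2.807})\) bound. A one-level sketch (illustrative, not the SKaiNET implementation):

```python
def strassen_2x2(A, B):
    """One level of Strassen's algorithm on 2x2 matrices:
    7 multiplications (m1..m7) instead of the naive 8."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]])  # → [[19, 22], [43, 50]]
```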
Applications
- Neural network forward pass computations
- Linear transformations in computer graphics
- Solving systems of linear equations
- Principal component analysis (PCA)