Mathematical Theory Reference
This section contains mathematical definitions and theoretical foundations for SKaiNET operators.
Operator Theory
Matrix Multiplication Theory
The theory and intuition below are the single source of truth for matmul
and are also embedded into the generated operator reference at
TensorOps.matmul.
Mathematical Definition
Given two matrices \(A \in \mathbb{R}^{m \times k}\) and \(B \in \mathbb{R}^{k \times n}\), the matrix product \(C = AB\) is defined as:

\[
C_{ij} = \sum_{l=1}^{k} A_{il} B_{lj}
\]

Where \(C \in \mathbb{R}^{m \times n}\), \(i\) ranges over rows \(1..m\), \(j\) over columns \(1..n\), and \(l\) is the summation index over the shared dimension \(k\).
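The definition translates directly into a triple loop over \(i\), \(j\), and \(l\). The sketch below is plain Python for illustration only, not the SKaiNET implementation:

```python
def matmul(A, B):
    """Naive matrix product C = AB, following C_ij = sum_l A_il * B_lj."""
    m, k = len(A), len(A[0])
    k2, n = len(B), len(B[0])
    assert k == k2, "inner dimensions must match"
    C = [[0.0] * n for _ in range(m)]
    for i in range(m):          # rows of A
        for j in range(n):      # columns of B
            for l in range(k):  # shared dimension
                C[i][j] += A[i][l] * B[l][j]
    return C

matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])  # → [[19.0, 22.0], [43.0, 50.0]]
```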
Intuition and Properties
Matrix multiplication composes two linear transformations: each output element is the dot product of a row of \(A\) with a column of \(B\). It is the core primitive behind fully-connected layers, attention projections, and any linear map in a neural network’s forward pass.
Key properties:
- Associativity: \((AB)C = A(BC)\)
- Distributivity: \(A(B + C) = AB + AC\)
- Non-commutativity: in general \(AB \neq BA\)
- Identity: \(AI = IA = A\)
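Non-commutativity is easy to verify concretely: multiplying by a permutation matrix on the left versus the right gives different results. A minimal check in plain Python (illustrative only, not the SKaiNET API):

```python
def matmul(A, B):
    """Naive product of two list-of-lists matrices."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [3, 4]]
P = [[0, 1], [1, 0]]  # permutation matrix

print(matmul(A, P))  # [[2, 1], [4, 3]] -- AP swaps the columns of A
print(matmul(P, A))  # [[3, 4], [1, 2]] -- PA swaps the rows of A
```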
Complexity:
- Standard algorithm: \(O(mnk)\)
- Strassen’s algorithm: \(O(n^{2.807})\) for square matrices
- Current theoretical best: \(O(n^{2.373})\)
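The sub-cubic bounds come from algebraic identities that trade multiplications for additions. At the \(2 \times 2\) base case, Strassen's scheme uses seven multiplications instead of eight; applied recursively to block matrices, this yields the \(O(n^{2.807})\) bound. A one-level sketch (illustrative, not the SKaiNET implementation):

```python
def strassen_2x2(A, B):
    """One level of Strassen's algorithm on 2x2 matrices:
    7 multiplications (m1..m7) instead of the naive 8."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]])  # → [[19, 22], [43, 50]]
```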
Applications
- Neural network forward pass computations
- Linear transformations in computer graphics
- Solving systems of linear equations
- Principal component analysis (PCA)