Non-Commutative Property (AB ≠ BA in general)
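A minimal sketch of why multiplication order matters, in plain Python. The helper `matmul` and the matrices `A` and `B` are illustrative names, not taken from any library: multiplying by a permutation matrix on the right swaps columns, while multiplying on the left swaps rows, so the two products differ.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows (illustrative helper)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]  # permutation matrix

AB = matmul(A, B)  # right-multiplying by B swaps the columns of A
BA = matmul(B, A)  # left-multiplying by B swaps the rows of A

print(AB)
print(BA)
print(AB == BA)
```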
Sigma representation
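In sigma notation, each entry of the product is C[i][j] = Σ_k A[i][k] · B[k][j], where k runs over the shared inner dimension. A direct translation into three nested loops might look like the sketch below; the function name `matmul_sigma` is an assumption made for this example.

```python
def matmul_sigma(a, b):
    """Compute C[i][j] = sum over k of a[i][k] * b[k][j] for list-of-rows matrices."""
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("inner dimensions must match")
    c = [[0] * p for _ in range(n)]
    for i in range(n):          # row of the result
        for j in range(p):      # column of the result
            for k in range(m):  # the summation index in the sigma
                c[i][j] += a[i][k] * b[k][j]
    return c


print(matmul_sigma([[1, 2, 3], [4, 5, 6]], [[7, 8], [9, 10], [11, 12]]))
```

This uses n · m · p scalar multiplications, which is the baseline that faster algorithms try to beat.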
Parenthesization (choosing the multiplication order) of a matrix chain is an important optimization for computation time
IBM researcher Shmuel Winograd proved that multiplying two 2x2 matrices with 6 or fewer multiplications is impossible, so Strassen's 7 multiplications are optimal
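Because matrix multiplication is associative, a chain such as A1 A2 A3 can be grouped in different ways, and the number of scalar multiplications can differ widely between groupings. The classic dynamic-programming approach to picking the cheapest grouping can be sketched as follows; `matrix_chain_order` and the `dims` encoding are assumptions for this example (matrix i has shape `dims[i] x dims[i + 1]`).

```python
def matrix_chain_order(dims):
    """Return (minimum scalar multiplications, parenthesization) for a matrix chain.

    Matrix i (0-based) has shape dims[i] x dims[i + 1].
    """
    n = len(dims) - 1
    if n < 1:
        raise ValueError("need at least one matrix")
    cost = [[0] * n for _ in range(n)]   # cost[i][j]: cheapest way to multiply A_i..A_j
    split = [[0] * n for _ in range(n)]  # split[i][j]: best position of the outermost split
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = float("inf")
            for k in range(i, j):
                c = cost[i][k] + cost[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                if c < cost[i][j]:
                    cost[i][j] = c
                    split[i][j] = k

    def render(i, j):
        if i == j:
            return f"A{i + 1}"
        k = split[i][j]
        return f"({render(i, k)} {render(k + 1, j)})"

    return cost[0][n - 1], render(0, n - 1)


# A1: 10x100, A2: 100x5, A3: 5x50
print(matrix_chain_order([10, 100, 5, 50]))
```

For these shapes, (A1 A2) A3 costs 7,500 scalar multiplications while A1 (A2 A3) costs 75,000, a tenfold difference from grouping alone.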
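For reference, Strassen's scheme for the 2x2 case uses 7 scalar multiplications instead of the naive 8. A minimal sketch on plain numbers is below; the function name `strassen_2x2` is an assumption, and in practice the same formulas are applied recursively to matrix blocks rather than to scalars.

```python
def strassen_2x2(a, b):
    """Multiply two 2x2 matrices using Strassen's 7 products (m1..m7)."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b

    # Exactly seven multiplications
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # The result entries need only additions and subtractions of the products
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```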
Matrix Multiplication Algorithms

Strangely, Matrix Multiplications on GPUs Run Faster When Given "Predictable" Data! [short]
Great minds discuss flops per watt.
https://www.thonking.ai/p/strangely-matrix-multiplications
New Breakthrough Brings Matrix Multiplication Closer to Ideal | Quanta Magazine
By eliminating a hidden inefficiency, computer scientists have come up with a new way to multiply large matrices that’s faster than ever.
https://www.quantamagazine.org/new-breakthrough-brings-matrix-multiplication-closer-to-ideal-20240307

MatMul-free LLM
The method replaces MatMul with ternary accumulation, where the weights are restricted to -1, 0, or +1 so that each multiplication becomes an addition, a subtraction, or a skip, combined with a GLU.
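To illustrate the idea only, here is a minimal sketch of a matrix-vector product with ternary weights that uses no multiplications at all. It is not the paper's implementation (which also covers details such as quantization and hardware-efficient kernels); `ternary_matvec` and its argument layout are assumptions for this example.

```python
def ternary_matvec(x, w):
    """Compute y[j] = sum_i x[i] * w[i][j] for w[i][j] in {-1, 0, +1} without multiplying.

    x is a list of floats; w is a list of rows, one row per input element.
    """
    out = [0.0] * len(w[0])
    for xi, row in zip(x, w):
        for j, wij in enumerate(row):
            if wij == 1:
                out[j] += xi   # +1 weight: accumulate
            elif wij == -1:
                out[j] -= xi   # -1 weight: subtract
            elif wij != 0:
                raise ValueError("weights must be -1, 0, or +1")
            # 0 weight: skip entirely
    return out


print(ternary_matvec([2.0, -1.0, 0.5], [[1, 0], [-1, 1], [0, -1]]))
```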
Scalable MatMul-free Language Modeling
Matrix multiplication (MatMul) typically dominates the overall computational cost of large language models (LLMs). This cost only grows as LLMs scale to larger embedding dimensions and context...
https://arxiv.org/abs/2406.02528


Seonglae Cho