Kolmogorov-Arnold Networks
Effectively prevents catastrophic forgetting and is a candidate replacement for the MLP blocks of a transformer. Kolmogorov-Arnold Networks are smaller and more interpretable, but their complex activation functions demand more computational resources. It turns out that you can write a Kolmogorov-Arnold Network as an MLP, with some repeats and shifts before the ReLU.
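A minimal sketch of that rewrite, assuming piecewise-linear splines on a fixed grid (the layer and variable names are illustrative, not from any particular repository): each input is repeated once per grid point, shifted by that grid point, and passed through ReLU, so the final linear layer can mix the hinges into a learnable piecewise-linear activation per edge.

```python
import torch
import torch.nn as nn

class PiecewiseLinearKANLayer(nn.Module):
    """KAN edge functions rewritten as an MLP: repeat each input once per
    grid point, shift, apply ReLU, then take a learned linear combination."""
    def __init__(self, in_dim, out_dim, grid):
        super().__init__()
        self.register_buffer("grid", grid)          # knot positions, shape (k,)
        self.linear = nn.Linear(in_dim * grid.numel(), out_dim)

    def forward(self, x):                           # x: (batch, in_dim)
        # "repeats": one copy of every input per grid point
        x = x.unsqueeze(-1).expand(-1, -1, self.grid.numel())
        # "shift before ReLU": each copy becomes a hinge at one knot
        x = torch.relu(x - self.grid)               # (batch, in_dim, k)
        # the linear layer mixes the hinges into piecewise-linear activations
        return self.linear(x.flatten(1))

layer = PiecewiseLinearKANLayer(4, 8, torch.linspace(-1, 1, 5))
print(layer(torch.randn(2, 4)).shape)               # torch.Size([2, 8])
```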
The performance issue of the original implementation comes from expanding all intermediate variables in order to apply a different activation function to each one. This repository closes that efficiency gap by formulating all activation functions as linear combinations of a fixed set of basis functions.
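A sketch of the linear-combination idea, using a Gaussian radial basis as a stand-in for the B-splines real implementations use (all names here are illustrative): the shared basis is evaluated once per input, and every learned activation then collapses into a single einsum instead of per-edge expansions.

```python
import torch
import torch.nn as nn

class BasisKANLayer(nn.Module):
    def __init__(self, in_dim, out_dim, n_basis=8):
        super().__init__()
        self.register_buffer("centers", torch.linspace(-1, 1, n_basis))
        # one coefficient per (output, input, basis function)
        self.coef = nn.Parameter(torch.randn(out_dim, in_dim, n_basis) * 0.1)

    def forward(self, x):                           # x: (batch, in_dim)
        # evaluate the shared basis once for every input...
        phi = torch.exp(-((x.unsqueeze(-1) - self.centers) ** 2) / 0.1)
        # ...then each learned activation is just a linear combination,
        # so the whole layer reduces to one contraction
        return torch.einsum("bik,oik->bo", phi, self.coef)

layer = BasisKANLayer(4, 8)
print(layer(torch.randn(2, 4)).shape)               # torch.Size([2, 8])
```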
Wav-KAN: Wavelet Kolmogorov-Arnold Networks, which use wavelets (e.g., Mexican-hat or Morlet mother wavelets) as the learnable activation basis instead of splines.
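A hedged sketch of the wavelet idea, assuming a Mexican-hat mother wavelet with a learnable scale and translation per edge; this illustrates the concept and is not Wav-KAN's actual code.

```python
import torch
import torch.nn as nn

class WaveletKANLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(out_dim, in_dim))
        self.shift = nn.Parameter(torch.zeros(out_dim, in_dim))
        self.weight = nn.Parameter(torch.randn(out_dim, in_dim) * 0.1)

    def forward(self, x):                           # x: (batch, in_dim)
        # dilate and translate the input per (output, input) edge
        z = (x.unsqueeze(1) - self.shift) / self.scale    # (batch, out, in)
        # Mexican-hat mother wavelet: (1 - z^2) * exp(-z^2 / 2)
        psi = (1 - z ** 2) * torch.exp(-0.5 * z ** 2)
        # weighted sum over inputs gives each output unit
        return (self.weight * psi).sum(-1)

layer = WaveletKANLayer(4, 8)
print(layer(torch.randn(2, 4)).shape)               # torch.Size([2, 8])
```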
MLPs' downstream-task performance is generally better, while KANs are better at representing symbolic formulas.
Kolmogorov–Arnold Transformer (KAT), which replaces the transformer's MLP blocks with Group-Rational KAN layers: channels are divided into groups, and each group shares the coefficients of a rational activation function.
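A minimal sketch of a group-rational activation under that description; the polynomial degrees and the positive-denominator parameterization are illustrative choices, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class GroupRationalActivation(nn.Module):
    def __init__(self, channels, groups=4, p_deg=3, q_deg=2):
        super().__init__()
        assert channels % groups == 0
        self.groups = groups
        # numerator and denominator coefficients, shared within each group
        self.p = nn.Parameter(torch.randn(groups, p_deg + 1) * 0.1)
        self.q = nn.Parameter(torch.randn(groups, q_deg) * 0.1)

    def forward(self, x):                           # x: (batch, channels)
        b, c = x.shape
        x = x.view(b, self.groups, c // self.groups)
        # numerator polynomial P(x), one coefficient set per group
        num = sum(self.p[:, i].view(1, -1, 1) * x ** i
                  for i in range(self.p.shape[1]))
        # denominator Q(x) kept positive so the rational stays finite
        den = 1 + sum(self.q[:, i].view(1, -1, 1).abs() * x.abs() ** (i + 1)
                      for i in range(self.q.shape[1]))
        return (num / den).view(b, c)

act = GroupRationalActivation(channels=16, groups=4)
print(act(torch.randn(2, 16)).shape)                # torch.Size([2, 16])
```

Sharing one rational function per group (rather than one per channel) is what keeps the parameter and compute overhead close to a plain MLP block.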