SpikeBrain

The QKV structure, residual connections, and FFN remain the same as in a standard Transformer; the "spiking neuron" is essentially activation quantization plus adaptive sparsity. The "event-driven" computation does not actually execute asynchronously on GPUs; it is only simulated with ordinary synchronous computation. Efficiency does improve, but the gain comes from Linear Attention, MoE, and Quantization rather than from "Spiking" itself. The same trend toward more efficient sequence models has already been demonstrated in other research (FlashAttention, RWKV, Mamba, Hyena, etc.). The adaptive threshold is simply mean-based scaling, not a true simulation of the brain's homeostatic firing.

Continuous → Discrete Event

Event-driven computation: "compute only when there is input"
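
A minimal sketch of what "only calculates when there is input" means for a single matrix-vector product. The function names and the `ops` counter are hypothetical and purely illustrative; this is not code from the SpikeBrain paper. It shows that skipping zero inputs gives the same result as the dense product, so any saving depends on the hardware actually being able to skip the zeros.

```python
def dense_matvec(weights, x):
    """Dense product: every input contributes a multiply-add, zeros included."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def event_driven_matvec(weights, spikes):
    """Accumulate only the columns whose input is non-zero (an "event").

    Returns the output vector and the number of multiply-adds performed.
    """
    out = [0] * len(weights)
    ops = 0
    for j, s in enumerate(spikes):
        if s == 0:
            continue  # no event on this input, so no work is done
        for i, row in enumerate(weights):
            out[i] += row[j] * s
            ops += 1
    return out, ops


weights = [[1, 2, 3, 4], [5, 6, 7, 8]]
spikes = [0, 2, 0, 1]  # integer spike counts, half of them zero
result, ops = event_driven_matvec(weights, spikes)
print(result, ops)                    # [8, 20] 4
print(dense_matvec(weights, spikes))  # [8, 20], using 8 multiply-adds
```

On a GPU the dense version is typically what runs, which is the sense in which the event-driven behaviour is a simulation rather than asynchronous execution.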

Adaptive Threshold Neuron
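
A rough sketch of the "mean-based scaling" reading of the adaptive threshold, assuming the threshold is the mean absolute activation divided by a hypothetical constant `k`, and that each activation is rounded to an integer spike count. The function name and the exact formula are assumptions for illustration, not taken from the paper's code.

```python
def adaptive_spike_counts(activations, k=1.0):
    """Quantize activations to integer spike counts with a mean-based threshold.

    Assumption: threshold = mean(|x|) / k. Small activations round to zero
    (sparsity); larger ones become integer multiples of the threshold
    (quantization).
    """
    mean_abs = sum(abs(a) for a in activations) / len(activations)
    threshold = mean_abs / k
    if threshold == 0:
        return [0] * len(activations), 0.0
    # Python's round() rounds exact halves to the nearest even integer.
    counts = [round(a / threshold) for a in activations]
    return counts, threshold


acts = [0.25, -1.5, 4.0, 0.0, 4.25]
counts, threshold = adaptive_spike_counts(acts)
print(threshold)                        # 2.0
print(counts)                           # [0, -1, 2, 0, 2]
print([c * threshold for c in counts])  # [0.0, -2.0, 4.0, 0.0, 4.0]
```

Seen this way the "neuron" is a per-tensor scale plus rounding: there is no membrane state, leak, or firing history involved.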

Linear Attention
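
A minimal sketch of generic causal linear attention, to show where the efficiency comes from. It uses the common `elu(x) + 1` feature map as an assumption; SpikeBrain's own attention variant may differ, and all names here are illustrative. The point is that a fixed-size running state replaces the growing attention matrix, so cost per token does not depend on sequence length.

```python
import math


def feature_map(x):
    """elu(x) + 1: a positive feature map often used for linear attention."""
    return [xi + 1.0 if xi > 0 else math.exp(xi) for xi in x]


def causal_linear_attention(queries, keys, values):
    """Process tokens one at a time with a constant-size state.

    state[i][j] accumulates phi(k)[i] * v[j] over past tokens;
    norm[i] accumulates phi(k)[i] for the normalizer.
    """
    d_k, d_v = len(keys[0]), len(values[0])
    state = [[0.0] * d_v for _ in range(d_k)]
    norm = [0.0] * d_k
    outputs = []
    for q, k, v in zip(queries, keys, values):
        fq, fk = feature_map(q), feature_map(k)
        for i in range(d_k):
            norm[i] += fk[i]
            for j in range(d_v):
                state[i][j] += fk[i] * v[j]
        denom = sum(fq[i] * norm[i] for i in range(d_k))
        outputs.append(
            [sum(fq[i] * state[i][j] for i in range(d_k)) / denom
             for j in range(d_v)]
        )
    return outputs


queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.5]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
for row in causal_linear_attention(queries, keys, values):
    print(row)
```

Nothing in this recurrence involves spikes, which matches the note's claim that the speedup is attributable to the attention mechanism rather than to the spiking formulation.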

https://arxiv.org/pdf/2509.05276
