The QKV structure, residual connections, and FFN remain the same; the "spiking neuron" is essentially activation quantization plus adaptive sparsity. The "event-driven" approach does not actually execute asynchronously on GPUs; it is only simulated in software. Efficiency does improve, but the gain comes from linear attention, MoE, and quantization, not from "spiking" itself. The same efficiency trend has already been demonstrated in other work (FlashAttention, RWKV, Mamba, Hyena, etc.). The adaptive threshold is simply mean-based scaling, not a true simulation of the brain's homeostatic firing.
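To make the "quantization plus adaptive sparsity" reading concrete, here is a minimal sketch in plain Python. It assumes the threshold is a scaled mean of absolute activations and that a "spike count" is the activation divided by that threshold, truncated toward zero and clipped. The function name `adaptive_spike_counts` and the parameters `scale` and `max_count` are hypothetical, chosen only for illustration; they are not taken from any specific model's code.

```python
def adaptive_spike_counts(activations, scale=1.0, max_count=4):
    """Map real-valued activations to signed integer "spike counts".

    Assumption (illustrative only): threshold = scale * mean(|x|).
    Dividing by the threshold and truncating is ordinary uniform
    quantization; values below the threshold become 0, which is
    where the sparsity comes from.
    """
    mean_abs = sum(abs(a) for a in activations) / len(activations)
    threshold = scale * mean_abs
    if threshold == 0:
        # All-zero input: nothing can cross the threshold.
        return [0] * len(activations)

    counts = []
    for a in activations:
        level = int(a / threshold)                      # truncate toward zero
        level = max(-max_count, min(max_count, level))  # clip to a fixed range
        counts.append(level)
    return counts


activations = [0.1, -0.2, 1.5, 0.05, -1.8, 0.15]
counts = adaptive_spike_counts(activations)
sparsity = counts.count(0) / len(counts)
print(counts)    # [0, 0, 2, 0, -2, 0]
print(sparsity)  # fraction of entries that produced no "spike"
```

Nothing in this sketch involves membrane dynamics or firing-rate homeostasis: the threshold is a per-input rescaling, and the output is a low-bit integer tensor with many zeros.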
Continuous → Discrete Event
Event-driven computation: "compute only when there is input"
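The gap between the event-driven ideal and a dense simulation can be illustrated with a toy matrix-vector product. This is a sketch in plain Python, not a model of real GPU execution; the function names `dense_matvec` and `event_driven_matvec` and the multiply counter are hypothetical and exist only to show where the work goes.

```python
def dense_matvec(weights, spikes):
    """Dense simulation: every weight is multiplied, zeros included."""
    multiplies = 0
    out = [0.0] * len(weights)
    for i, row in enumerate(weights):
        for w, s in zip(row, spikes):
            out[i] += w * s
            multiplies += 1
    return out, multiplies


def event_driven_matvec(weights, spikes):
    """Event-driven ideal: work is done only for inputs that spiked."""
    multiplies = 0
    out = [0.0] * len(weights)
    for j, s in enumerate(spikes):
        if s == 0:
            continue  # no event on this input, so no computation
        for i, row in enumerate(weights):
            out[i] += row[j] * s
            multiplies += 1
    return out, multiplies


weights = [[1.0, 2.0, 3.0, 4.0],
           [0.5, 0.5, 0.5, 0.5]]
spikes = [0, 2, 0, -1]  # half of the inputs are silent

dense_out, dense_ops = dense_matvec(weights, spikes)
event_out, event_ops = event_driven_matvec(weights, spikes)
print(dense_out, dense_ops)  # [0.0, 0.5] 8
print(event_out, event_ops)  # [0.0, 0.5] 4
```

Both paths produce the same result. The event-driven version skips work only because it can branch per input, which is the part that a dense, batched simulation does not do.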