Deep Equilibrium Models, DEQ
In this equation, is the fixed point where the network's output is fed back into itself as input. In other words, it's a state where repeatedly applying f no longer changes the value. Once we reach this fixed point after iterating (forward step), we can compute the gradient using IFT (Implicit Function Theorem) as follows:
Using this approach, we can calculate the gradient using only the final fixed point, without backpropagating through all recursion steps. → This enables "1-step gradient approximation".
Deep Equilibrium Model
Deep Equilibrium Models
We present a new approach to modeling sequential data: the deep equilibrium model (DEQ). Motivated by an observation that the hidden layers of many existing deep sequence models converge towards...
https://arxiv.org/abs/1909.01377


Seonglae Cho