Adaptively allocates the parameter budget among weight matrices according to their importance scores.
Enables effective pruning of unimportant updates, which reduces their parameter budget while circumventing intensive exact SVD computations.
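
The budget allocation described above can be illustrated with a minimal sketch. It assumes each weight matrix's update is already split into components (for example, singular values) and that an importance score per component is given; how the scores are computed is not specified here. The names `allocate_budget`, `prune`, `W_q`, and `W_v` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def allocate_budget(scores: Dict[str, List[float]], budget: int) -> Dict[str, List[bool]]:
    """Keep the `budget` highest-scoring components across all matrices.

    `scores` maps a (hypothetical) matrix name to the importance scores of
    its update components. Returns a keep/drop mask per matrix.
    """
    # Flatten to (score, matrix name, component index) and rank globally,
    # so matrices with more important components receive more of the budget.
    ranked = sorted(
        ((s, name, i) for name, vals in scores.items() for i, s in enumerate(vals)),
        key=lambda t: t[0],
        reverse=True,
    )
    keep = {(name, i) for _, name, i in ranked[:budget]}
    return {
        name: [(name, i) in keep for i in range(len(vals))]
        for name, vals in scores.items()
    }


def prune(singular_values: List[float], mask: List[bool]) -> List[float]:
    """Zero out dropped components; no SVD is recomputed at this step."""
    return [v if m else 0.0 for v, m in zip(singular_values, mask)]


# Assumed toy scores for two matrices with three components each.
scores = {"W_q": [0.9, 0.1, 0.05], "W_v": [0.8, 0.7, 0.02]}
masks = allocate_budget(scores, budget=3)
pruned_q = prune([1.0, 2.0, 3.0], masks["W_q"])
```

With a total budget of three components, `W_v` keeps two and `W_q` keeps one, since the ranking is global rather than per matrix. Pruning is just masking stored values, which is why no exact SVD has to be recomputed.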
Seonglae Cho