L3D

Creator
Seonglae Cho
Created
2025 Apr 21 21:0
Editor
Edited
2025 Apr 21 21:12

Local Loss Landscape Decomposition

L3D identifies a set of low-rank subnetworks: directions in parameter space, a subset of which can reconstruct the gradient of the loss between any sample's output and a reference output vector. It finds these 'subnetwork (circuit)' directions from the loss function by reconstructing the loss gradients between samples and random reference samples as linear combinations of low-dimensional directions (subnetworks). When the parameters are moved along the recovered directions, specific functionalities can be selectively turned on and off, much like a Steering Vector.
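A minimal sketch of what "moving the parameters along a direction" could look like, using a toy 2x2 linear model. The model, the rank-1 `direction`, and the helper names (`outer`, `matvec`, `move_along`) are all hypothetical and chosen by hand for illustration; they are not taken from the L3D implementation.

```python
def outer(u, v):
    """Rank-1 matrix u v^T, a stand-in for one low-rank subnetwork direction."""
    return [[ui * vj for vj in v] for ui in u]

def matvec(W, x):
    """Output of the toy linear model f(x) = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def move_along(W, direction, alpha):
    """Shift the parameters by alpha along a parameter-space direction."""
    return [[w + alpha * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, direction)]

# Toy parameters: the identity map (assumption for illustration).
W = [[1.0, 0.0], [0.0, 1.0]]

# Hand-picked direction that only touches how the first input feature
# is routed to the first output.
direction = outer([1.0, 0.0], [1.0, 0.0])

# Ablate the direction by stepping against it.
W_off = move_along(W, direction, alpha=-1.0)

x_a = [1.0, 0.0]  # sample that relies on the ablated direction
x_b = [0.0, 1.0]  # sample that does not

print(matvec(W, x_a), "->", matvec(W_off, x_a))  # behaviour on x_a changes
print(matvec(W, x_b), "->", matvec(W_off, x_b))  # behaviour on x_b is preserved
```

In this toy case the step switches off the model's response to `x_a` while leaving `x_b` untouched, which is the kind of selective on/off effect described above.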
The model's parameter space contains degrees of freedom that remain "performance-invariant under multiple directional changes", not only over the entire dataset but also over specific data subsets. In other words, distinct "circuits (directions)" are hidden in both the global and the local distributions. These parameter-space directions are what is meant by "subnetworks" here: low-dimensional parameter circuits that activate on a per-sample basis.
To find these 'local-specific circuits', we decompose the gradient, with respect to the parameters, of the loss between each sample's output and a random reference output. This gradient reveals the "parameter directions that only affect this sample". To avoid confusion with the 'loss' used in training, the quantity being differentiated is referred to as a divergence (KL or MSE).
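The sketch below illustrates the idea on a toy linear model: it computes the gradient of an MSE divergence between a sample's output and a reference output, then expresses that gradient as a linear combination of candidate directions. The directions here are hand-picked and orthonormal so that a simple projection yields the coefficients; this is an assumption for illustration only, and how L3D actually obtains its directions is not covered here. All function names are hypothetical.

```python
def outer(u, v):
    """Rank-1 matrix u v^T."""
    return [[ui * vj for vj in v] for ui in u]

def matvec(W, x):
    """Output of the toy linear model f(x) = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def divergence_grad(W, x, ref):
    """Gradient w.r.t. W of the MSE divergence mean((W x - ref)^2).

    For a linear model this is the rank-1 matrix (2/n) (W x - ref) x^T.
    """
    out = matvec(W, x)
    n = len(out)
    resid = [2.0 * (o - r) / n for o, r in zip(out, ref)]
    return [[ri * xj for xj in x] for ri in resid]

def inner(A, B):
    """Frobenius inner product between two matrices."""
    return sum(a * b for a_row, b_row in zip(A, B)
               for a, b in zip(a_row, b_row))

def reconstruct(coeffs, directions):
    """Linear combination of directions weighted by coeffs."""
    rows, cols = len(directions[0]), len(directions[0][0])
    total = [[0.0] * cols for _ in range(rows)]
    for c, D in zip(coeffs, directions):
        for i in range(rows):
            for j in range(cols):
                total[i][j] += c * D[i][j]
    return total

# Toy setup (all values are illustrative assumptions).
W = [[1.0, 0.0], [0.0, 1.0]]
x = [1.0, 0.0]    # one sample
ref = [0.0, 0.0]  # reference output vector

# Two hand-picked orthonormal rank-1 directions.
directions = [outer([1.0, 0.0], [1.0, 0.0]),
              outer([0.0, 1.0], [0.0, 1.0])]

grad = divergence_grad(W, x, ref)
coeffs = [inner(grad, D) for D in directions]  # per-sample activations
recon = reconstruct(coeffs, directions)

print("gradient:      ", grad)
print("coefficients:  ", coeffs)
print("reconstruction:", recon)
```

For this sample only the first direction receives a nonzero coefficient, so a single subnetwork is enough to reconstruct the gradient, which mirrors the "activates per sample" picture in the text.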
