Earth mover's distance, Optimal Transport Problem
The minimum cost required to transform one probability distribution into another by moving probability mass, where the cost of each move is the amount of mass moved multiplied by the distance it travels.
It does not diverge even when the supports of the two distributions do not overlap.
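A minimal sketch of the non-divergence property, assuming two histograms defined on the same evenly spaced 1D bins with unit spacing. In that special case EMD reduces to the sum of absolute differences between the two cumulative distributions. The helper names `emd_1d` and `kl_divergence` are illustrative, and KL divergence is included only as a contrast, since it becomes infinite when the supports are disjoint.

```python
import math
from itertools import accumulate


def emd_1d(p, q):
    """EMD between two histograms on the same unit-spaced 1D bins.

    Assumes both inputs sum to 1. In 1D the optimal transport cost
    equals the L1 distance between the cumulative distributions.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def kl_divergence(p, q):
    """KL(p || q); returns infinity when q is zero somewhere p is not."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # terms with p_i = 0 contribute nothing
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


# Three point masses with pairwise disjoint supports.
p = [1.0, 0.0, 0.0, 0.0]
q_near = [0.0, 1.0, 0.0, 0.0]
q_far = [0.0, 0.0, 0.0, 1.0]

print(emd_1d(p, q_near))         # 1.0 (all mass moves one bin)
print(emd_1d(p, q_far))          # 3.0 (all mass moves three bins)
print(kl_divergence(p, q_near))  # inf
print(kl_divergence(p, q_far))   # inf
```

EMD stays finite and grows with how far the mass has to travel, while KL reports the same infinite value for both cases and so cannot tell the near miss from the far one.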
Flow
Algorithm
NTIL (Numerical Token Integrity Loss)
- Token level: Preserves order/distance between numbers by training with EMD (+ digit position weighting)
- Sequence level: Measures value error (relative·scale) between predicted and target numerical values
CE (cross-entropy) limitation: treats each numerical token as an independent class, ignoring proximity between numbers.
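The contrast between CE and a token-level EMD term can be sketched as follows. This is an illustration under stated assumptions, not the NTIL implementation: the vocabulary is restricted to the ten digit tokens 0 to 9, the target is one-hot, and the digit position weighting mentioned above is omitted. With a one-hot target, EMD reduces to the expected absolute distance between the predicted digit and the target digit. The function names are hypothetical.

```python
import math


def cross_entropy(probs, target):
    """Cross-entropy against a one-hot target: -log p(target)."""
    return -math.log(probs[target])


def emd_to_target(probs, target):
    """EMD between a predicted digit distribution and a one-hot target.

    Moving mass from digit i to the target digit costs |i - target|,
    so the total cost is the expected absolute digit error.
    """
    return sum(p * abs(i - target) for i, p in enumerate(probs))


target = 4

# Both predictions put 0.6 on the correct digit; they differ only in
# where the remaining 0.4 of probability mass goes.
near = [0.0] * 10
near[4], near[5] = 0.6, 0.4  # wrong mass on the adjacent digit 5

far = [0.0] * 10
far[4], far[9] = 0.6, 0.4    # wrong mass on the distant digit 9

print(cross_entropy(near, target) == cross_entropy(far, target))  # True
print(emd_to_target(near, target))  # 0.4
print(emd_to_target(far, target))   # 2.0
```

CE assigns both predictions the same loss because it only looks at the probability of the correct class, whereas the EMD term penalizes the prediction that leaks mass to a numerically distant digit more heavily.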