CTC

Creator
Seonglae Cho
Created
2025 Dec 18 16:9
Edited
2025 Dec 18 16:10

Connectionist Temporal Classification

A training and modeling approach for learning mappings between sequences of different lengths without alignment labels, such as speech frame sequences (long) → text (short).
  • Strong alignment capability: Learns "which frame corresponds to which character" without explicit alignment labels by summing over all possible alignments.
  • Stable and parallelizable training: The output is computed frame by frame (linear + softmax on top of the encoder output).
  • Drawback: Each frame's prediction is treated as almost independent (though the encoder does see context), so linguistic context is used only weakly, making CTC less capable than encoder-decoder models at correcting awkward spelling/word errors.
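The points above can be made concrete with a minimal pure-Python sketch. It assumes the standard CTC formulation with a blank symbol (index 0 here is an assumption) and per-frame softmax probabilities given as nested lists. All names (`BLANK`, `collapse`, `ctc_likelihood`, `brute_force_likelihood`, `greedy_decode`) are illustrative, not taken from any library. `ctc_likelihood` sums over all alignments with the forward recursion, `brute_force_likelihood` does the same by explicit enumeration for comparison, and `greedy_decode` shows the frame-independent prediction that the drawback refers to. Practical implementations typically work in log space for numerical stability, which this sketch skips.

```python
import itertools
import math

BLANK = 0  # assumed index of the CTC blank symbol


def collapse(path):
    """Map a frame-level path to labels: merge repeats, then drop blanks."""
    out = []
    prev = None
    for s in path:
        if s != prev and s != BLANK:
            out.append(s)
        prev = s
    return out


def ctc_likelihood(probs, target):
    """P(target | input) via the CTC forward recursion.

    probs[t][k] is the softmax probability of symbol k at frame t.
    """
    # Interleave blanks around the labels: [a, b] -> [_, a, _, b, _]
    ext = [BLANK]
    for c in target:
        ext += [c, BLANK]
    S = len(ext)

    # At the first frame only the leading blank or the first label is allowed
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * S
        for s in range(S):
            total = alpha[s]                # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]       # advance by one position
            # Skipping a blank is allowed only between two different labels
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # Valid paths end on the last label or on the trailing blank
    return alpha[-1] + (alpha[-2] if S > 1 else 0.0)


def brute_force_likelihood(probs, target):
    """Same quantity by enumerating every frame-level path (tiny inputs only)."""
    T, K = len(probs), len(probs[0])
    total = 0.0
    for path in itertools.product(range(K), repeat=T):
        if collapse(path) == list(target):
            total += math.prod(probs[t][s] for t, s in enumerate(path))
    return total


def greedy_decode(probs):
    """Take the argmax symbol at each frame independently, then collapse."""
    path = [max(range(len(p)), key=p.__getitem__) for p in probs]
    return collapse(path)


if __name__ == "__main__":
    # 4 frames, 3 symbols (0 = blank, 1 and 2 = labels); made-up probabilities
    probs = [
        [0.1, 0.8, 0.1],
        [0.2, 0.7, 0.1],
        [0.6, 0.2, 0.2],
        [0.1, 0.1, 0.8],
    ]
    print(ctc_likelihood(probs, [1, 2]))
    print(brute_force_likelihood(probs, [1, 2]))
    print(greedy_decode(probs))
```

The forward recursion costs on the order of (frames × target length) operations, while the enumeration grows exponentially with the number of frames, which is why the dynamic-programming form is used in training. Note that `greedy_decode` never looks at neighboring frames' decisions, so any language-level correction has to come from the encoder or from an external language model.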
