Delay Pattern

Creator: Seonglae Cho
Created: 2025 Jun 15 22:51
Edited: 2025 Jun 15 22:53
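An RVQ (Residual Vector Quantization) tokenizer encodes every audio frame as N codebook stages, where each stage quantizes the residual left by the previous one. A model that generates such tokens therefore has to predict the N stages of a frame sequentially.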
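As a rough illustration of why the stages are ordered, the sketch below encodes one vector with two toy codebooks. The function names `rvq_encode` and `rvq_decode` and the codebook values are made up for this example and are not taken from any particular tokenizer.

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Quantize x stage by stage; each stage sees the residual of the previous one."""
    residual = x.astype(float)
    codes = []
    for cb in codebooks:  # cb has shape (K, D): K entries of dimension D
        idx = int(np.argmin(np.sum((cb - residual) ** 2, axis=1)))
        codes.append(idx)
        residual = residual - cb[idx]  # what is left for the next stage
    return codes

def rvq_decode(codes, codebooks):
    """Reconstruct the vector by summing the selected entry of every stage."""
    return sum(cb[i] for cb, i in zip(codebooks, codes))

# Toy codebooks (assumed values, for illustration only)
codebooks = [
    np.array([[0.0, 0.0], [1.0, 1.0]]),      # coarse stage
    np.array([[0.0, 0.0], [0.25, -0.25]]),   # refinement stage
]
x = np.array([1.2, 0.8])
codes = rvq_encode(x, codebooks)
approx = rvq_decode(codes, codebooks)
```

Each code index only makes sense given the earlier stages, which is the dependency the delay pattern is meant to handle.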
The delay pattern is a type of Positional Embedding used to distinguish individual channels. Audio is represented and generated as multiple channels of code sequences (9 by default) rather than a single stream, and each channel c is shifted in time by its own offset delay[c]. For example, if the first channel generates the code for time t at a given step, the second channel generates the code for t - delay[1] at that same step. For multi-channel audio codes, the delay pattern thus defines how each channel references temporal information when generating its next code.
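The shifting described above can be sketched with NumPy. This is a minimal sketch under assumptions: `apply_delay`, `revert_delay`, the `PAD` value, and the 3-channel delays `[0, 1, 2]` are hypothetical names and values chosen for readability, not the actual configuration of any model.

```python
import numpy as np

PAD = -1  # hypothetical filler for positions that hold no real code

def apply_delay(codes, delays, pad=PAD):
    """Shift channel c down by delays[c] steps.

    codes: (T, C) array of code indices, one column per channel.
    Returns a (T + max(delays), C) array with out[t, c] == codes[t - delays[c], c].
    """
    T, C = codes.shape
    out = np.full((T + max(delays), C), pad, dtype=codes.dtype)
    for c, d in enumerate(delays):
        out[d:d + T, c] = codes[:, c]
    return out

def revert_delay(delayed, delays):
    """Undo apply_delay so every row again holds the codes of a single frame."""
    T = delayed.shape[0] - max(delays)
    return np.stack([delayed[d:d + T, c] for c, d in enumerate(delays)], axis=1)

codes = np.arange(12).reshape(4, 3)   # 4 frames, 3 channels
delays = [0, 1, 2]                    # assumed per-channel delays
delayed = apply_delay(codes, delays)
restored = revert_delay(delayed, delays)
```

In the delayed layout, row t holds frame t for the first channel and frame t - delay[c] for channel c, so by the time a later channel is predicted, the earlier channels of the same frame have already been generated in previous steps.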

Backlinks

Dia
