dia
nari-labs • Updated 2025 May 30 21:19
(laughs), (clears throat), (sighs), (gasps), (coughs), (singing), (sings), (mumbles), (beep), (groans), (sniffs), (claps), (screams), (inhales), (exhales), (applause), (burps), (humming), (sneezes), (chuckle), (whistles)
Delay Pattern
A per-channel time offset, described as a type of positional embedding, used to distinguish individual channels. Audio is represented and generated as multiple parallel channels of code sequences (9 by default) rather than as a single stream. Each channel is shifted by its own delay: if the first channel generates the code for time t at a given step, the second channel generates the code for time t - delay[1] at that same step. The pattern therefore defines which earlier time steps each channel references when it generates its next code across the multi-channel audio codes.
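The shifting described above can be sketched in plain Python. This is a minimal illustration, not the project's actual implementation: the function names `apply_delay` and `revert_delay`, the `PAD` placeholder value, and the three-channel delays `[0, 1, 2]` are all assumptions chosen to keep the example small.

```python
from typing import List

PAD = -1  # hypothetical placeholder for slots where a channel has no code yet


def apply_delay(codes: List[List[int]], delays: List[int], pad: int = PAD) -> List[List[int]]:
    """Shift each channel by its delay.

    codes[t][c] is the code of channel c at time t. In the result,
    row s holds codes[s - delays[c]][c] for each channel c, or `pad`
    when that source index falls outside the sequence.
    """
    num_steps = len(codes)
    out = []
    # The delayed sequence is longer by the largest delay, so that the
    # most-delayed channel can still emit its final codes.
    for step in range(num_steps + max(delays)):
        row = []
        for channel, delay in enumerate(delays):
            src = step - delay
            row.append(codes[src][channel] if 0 <= src < num_steps else pad)
        out.append(row)
    return out


def revert_delay(delayed: List[List[int]], delays: List[int]) -> List[List[int]]:
    """Undo apply_delay, realigning all channels to the same time axis."""
    num_steps = len(delayed) - max(delays)
    return [
        [delayed[t + delay][channel] for channel, delay in enumerate(delays)]
        for t in range(num_steps)
    ]


# Toy input: 4 time steps, 3 channels; the code value encodes (time, channel).
codes = [[10 * t + c for c in range(3)] for t in range(4)]
delays = [0, 1, 2]  # illustrative delays only, not the model's defaults

delayed = apply_delay(codes, delays)
restored = revert_delay(delayed, delays)
```

In this toy setup, row 2 of `delayed` holds channel 0 at time 2, channel 1 at time 1, and channel 2 at time 0, which is the "t - delay" relationship from the definition. Reverting the shift recovers the original time-aligned codes, which is what a decoder would need before turning codes back into audio.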
Documentation