Web Video Text Tracks Format (WebVTT) - Web APIs | MDN
Web Video Text Tracks Format (WebVTT) is a format for displaying timed text tracks (such as subtitles or captions) using the <track> element. The primary purpose of WebVTT files is to add text overlays to a <video>. WebVTT is a text based format, which must be encoded using UTF-8. Where you can use spaces you can also use tabs. There is also a small API available to represent and manage these tracks and the data needed to perform the playback of the text at the correct times.
https://developer.mozilla.org/en-US/docs/Web/API/WebVTT_API