Multimodality and Large Multimodal Models (LMMs)
For a long time, each ML model operated in one data mode – text (translation, language modeling), image (object detection, image classification), or audio (speech recognition).
https://huyenchip.com/2023/10/10/multimodal.html