Mistral Minitron
Hybrid Mamba-Transformer language model with only about 3.2B out of 31.6B total parameters activated, for high efficiency
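
A rough back-of-the-envelope sketch of what that sparse activation means in practice. Only the 3.2B/31.6B figures come from the note above; the bf16 precision and the 2·N-per-token FLOPs rule of thumb are assumptions for illustration.

```python
# Rough arithmetic for a sparsely activated model: ~3.2B active out of ~31.6B
# total parameters (figures from the note; bf16 and the 2*N FLOPs rule of
# thumb are illustrative assumptions).
total_params = 31.6e9
active_params = 3.2e9
bytes_per_param = 2  # bf16 (assumed)

print(f"activation ratio: {active_params / total_params:.1%}")                   # ~10.1%
print(f"weight memory (bf16): {total_params * bytes_per_param / 1e9:.0f} GB")    # ~63 GB must still be resident
print(f"per-token forward FLOPs ~ 2 * active: {2 * active_params:.2e}")          # compute scales with active params
```

All weights still have to be held in memory, but per-token compute scales with the roughly 10% of parameters that are active.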
Attention mechanisms become computationally and memory-intensive as sequence length increases (especially the KV cache), whereas Mamba-based models (State Space Models) are structurally designed to scale more efficiently on long sequences. Therefore, using Mamba for most layers makes it easier to gain throughput/memory advantages than attention-side optimizations such as Grouped-Query Attention alone.
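
A minimal sketch of why this matters for long context, under assumed illustrative sizes (the KV-head count, head dimension, d_model, and state dimension below are placeholders, not this model's actual configuration): an attention layer's KV cache grows linearly with sequence length, while a Mamba/SSM layer keeps a fixed-size recurrent state regardless of length.

```python
def kv_cache_bytes(seq_len: int, n_kv_heads: int = 8, head_dim: int = 128,
                   dtype_bytes: int = 2) -> int:
    # Keys + values: two tensors of shape [seq_len, n_kv_heads, head_dim]
    return 2 * seq_len * n_kv_heads * head_dim * dtype_bytes

def ssm_state_bytes(d_model: int = 4096, state_dim: int = 16, expand: int = 2,
                    dtype_bytes: int = 2) -> int:
    # A Mamba-style layer keeps a constant-size state of shape [expand * d_model, state_dim]
    return expand * d_model * state_dim * dtype_bytes

for seq_len in (4_096, 131_072):
    mib = kv_cache_bytes(seq_len) / 2**20
    print(f"attention layer @ {seq_len:>7} tokens: {mib:7.1f} MiB KV cache")
print(f"mamba layer (any length):         {ssm_state_bytes() / 2**20:7.2f} MiB state")
```

With most layers being Mamba layers, only the few remaining attention layers pay this length-proportional cache cost.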
