Multimodal
cosmos-nemotron-34b Model by NVIDIA | NVIDIA NIM
Multimodal vision-language model that understands text, images, and video and generates informative responses
https://build.nvidia.com/nvidia/cosmos-nemotron-34b

NVIDIA Releases Open Synthetic Data Generation Pipeline for Training Large Language Models
Nemotron-4 340B, a family of models optimized for NVIDIA NeMo and NVIDIA TensorRT-LLM, includes cutting-edge instruct and reward models, and a dataset for generative AI training.
https://blogs.nvidia.com/blog/nemotron-4-synthetic-data-generation-llm-training/
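The release pairs an instruct model with a reward model so that generated samples can be scored and filtered before they are used for training. A minimal sketch of such a generate-then-filter loop, assuming hypothetical `generate_response` and `score_response` callables as stand-ins for whatever inference API (e.g. NeMo or TensorRT-LLM serving) is actually used:

```python
# Sketch of a generate-then-filter synthetic data loop in the spirit of the
# Nemotron-4 340B release: an instruct model drafts candidate responses, a
# reward model scores them, and only high-scoring pairs are kept.
# generate_response / score_response are hypothetical stand-ins, not a real API.
from typing import Callable

def build_synthetic_dataset(
    prompts: list[str],
    generate_response: Callable[[str], str],      # instruct model call
    score_response: Callable[[str, str], float],  # reward model call
    min_score: float = 0.5,
    samples_per_prompt: int = 4,
) -> list[dict]:
    """Generate several candidates per prompt and keep the best one if it clears the bar."""
    dataset = []
    for prompt in prompts:
        candidates = [generate_response(prompt) for _ in range(samples_per_prompt)]
        scored = [(score_response(prompt, c), c) for c in candidates]
        best_score, best = max(scored)
        if best_score >= min_score:
            dataset.append({"prompt": prompt, "response": best, "reward": best_score})
    return dataset
```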

Mistral Minitron
Mistral-NeMo-Minitron 8B Foundation Model Delivers Unparalleled Accuracy | NVIDIA Technical Blog
Last month, NVIDIA and Mistral AI unveiled Mistral NeMo 12B, a leading state-of-the-art large language model (LLM). Mistral NeMo 12B consistently outperforms similarly sized models on a wide range of…
https://developer.nvidia.com/blog/mistral-nemo-minitron-8b-foundation-model-delivers-unparalleled-accuracy/?ncid=ref-inor-390349/

Hybrid Mamba-Transformer language model that activates only about 3.2B of its 31.6B total parameters for high efficiency
Attention becomes computationally and memory-intensive as sequence length grows, since the KV cache scales linearly with context, whereas Mamba-based layers (State Space Models) keep a fixed-size state and are structurally suited to long sequences. Using Mamba for most layers therefore makes it easier to gain throughput and memory advantages than attention-side optimizations such as Grouped-Query Attention alone can provide (see the sketch after the link below).
research.nvidia.com
https://research.nvidia.com/labs/nemotron/files/NVIDIA-Nemotron-3-Nano-Technical-Report.pdf
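A rough back-of-the-envelope comparison of why this matters for memory: the attention KV cache grows linearly with sequence length, while a Mamba/SSM layer keeps a fixed-size recurrent state. All hyperparameters below are illustrative assumptions, not the actual Nemotron Nano configuration.

```python
# Illustrative per-sequence memory: attention KV cache vs. Mamba-style SSM state.
# Every constant here is an assumption for illustration only.
BYTES_PER_PARAM = 2      # fp16/bf16
N_LAYERS = 48            # assumed total layer count
N_KV_HEADS = 8           # assumed grouped-query KV heads
HEAD_DIM = 128           # assumed head dimension
D_MODEL = 4096           # assumed model width
SSM_STATE_DIM = 128      # assumed SSM state size per channel

def kv_cache_bytes(seq_len: int, n_attn_layers: int) -> int:
    """KV cache (keys + values) grows linearly with sequence length."""
    return 2 * n_attn_layers * N_KV_HEADS * HEAD_DIM * seq_len * BYTES_PER_PARAM

def ssm_state_bytes(n_mamba_layers: int) -> int:
    """Mamba/SSM recurrent state is fixed-size, independent of sequence length."""
    return n_mamba_layers * D_MODEL * SSM_STATE_DIM * BYTES_PER_PARAM

for seq_len in (4_096, 32_768, 131_072):
    pure_attn = kv_cache_bytes(seq_len, N_LAYERS)
    # Hybrid: suppose only 6 layers keep attention and the rest are Mamba.
    hybrid = kv_cache_bytes(seq_len, 6) + ssm_state_bytes(N_LAYERS - 6)
    print(f"{seq_len:>7} tokens | pure attention: {pure_attn / 2**20:8.1f} MiB"
          f" | hybrid: {hybrid / 2**20:8.1f} MiB")
```

The fixed-size SSM state dominates only at short contexts; as the context grows, the few remaining attention layers are the only part of the hybrid whose memory keeps increasing, which is where the long-sequence advantage comes from.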

Seonglae Cho