Qwen2.5-Turbo
Extending the Context Length to 1M Tokens!
After the release of Qwen2.5, we heard the community's demand for processing longer contexts. In recent months, we have made many optimizations to the model's capabilities and inference performance for extremely long contexts. Today, we are proud to introduce the new Qwen2.5-Turbo, which features longer context support: the model's context length is extended from 128k to 1M tokens, approximately 1 million English words or 1.5 million Chinese characters.
https://qwenlm.github.io/blog/qwen2.5-turbo/
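The upgraded model is served through Alibaba Cloud Model Studio's OpenAI-compatible API. A minimal sketch of a call, assuming the DashScope compatible-mode endpoint and a `qwen-turbo-latest` model alias (conventions at the time of the post; the exact alias and endpoint may differ by region):

```python
# Minimal sketch: query Qwen2.5-Turbo via the OpenAI-compatible
# DashScope endpoint. Assumes DASHSCOPE_API_KEY is set; the model
# alias below is an assumption, not taken from the excerpt above.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

response = client.chat.completions.create(
    model="qwen-turbo-latest",  # assumed alias for the 1M-context Qwen2.5-Turbo
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the following document: ..."},
    ],
)
print(response.choices[0].message.content)
```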
Qwen2.5-1M
Qwen2.5-1M: Deploy Your Own Qwen with Context Length up to 1M Tokens
Two months after upgrading Qwen2.5-Turbo to support context lengths of up to one million tokens, we are back with the open-source Qwen2.5-1M models and the corresponding inference framework support. Here's what you can expect from this release: two new open-source checkpoints, Qwen2.5-7B-Instruct-1M and Qwen2.5-14B-Instruct-1M, marking the first time the open-source Qwen models have been upgraded to handle 1M-token contexts.
https://qwenlm.github.io/blog/qwen2.5-1m/
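Since the two checkpoints are ordinary Hugging Face releases, they load with the standard transformers chat workflow. A minimal sketch, assuming the published model ID `Qwen/Qwen2.5-7B-Instruct-1M` and an illustrative prompt:

```python
# Minimal sketch: load the open-weight Qwen2.5-7B-Instruct-1M checkpoint
# with Hugging Face transformers. Prompt content is illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-7B-Instruct-1M"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain rotary position embeddings briefly."},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(
    output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True
))
```

Plain transformers generation like this is only practical at ordinary lengths; the "corresponding inference framework support" mentioned in the release is what targets actual million-token prompts.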
Qwen2.5-Max
Qwen2.5-Max: Exploring the Intelligence of Large-scale MoE Model
It is widely recognized that continuously scaling both data size and model size can lead to significant improvements in model intelligence. However, the research and industry community has limited experience in effectively scaling extremely large models, whether dense or Mixture-of-Experts (MoE). Many critical details of this scaling process were only disclosed with the recent release of DeepSeek V3. Concurrently, we have been developing Qwen2.5-Max, a large-scale MoE model.
https://qwenlm.github.io/blog/qwen2.5-max/
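Qwen2.5-Max is likewise exposed through the OpenAI-compatible endpoint. A minimal sketch, assuming the dated snapshot name `qwen-max-2025-01-25` from around the release; verify the current name in Model Studio:

```python
# Minimal sketch: call Qwen2.5-Max through the same OpenAI-compatible endpoint.
# Assumption: the dated snapshot name below; check Model Studio for current names.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

response = client.chat.completions.create(
    model="qwen-max-2025-01-25",
    messages=[{"role": "user", "content": "Which number is larger, 9.11 or 9.8?"}],
)
print(response.choices[0].message.content)
```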

Seonglae Cho