Cerebras Inference now 3x faster: Llama3.1-70B breaks 2,100 tokens/s
Today we’re announcing the biggest update to Cerebras […]
https://cerebras.ai/blog/cerebras-inference-3x-faster