Aug 01 2025

Qwen3 Coder 480B is Live on Cerebras

Alibaba's Qwen3 Coder 480B Instruct model is now available on Cerebras. Qwen3 Coder is one of the top coding models in the world with coding ability that rivals Claude 4 Sonnet and Gemini 2.5. Running on the Cerebras Wafer Scale Engine, Qwen3 Coder reaches an unprecedented 2,000 tokens per second. Coding problems that take 20 seconds on Sonnet 4 finish in just one second on Cerebras. To make Qwen3 Coder widely accessible, we are also launching Cerebras Code – two monthly subscription plans with generous rate limits at $50 and $200 per month.

Just two weeks after launch, Alibaba’s Qwen3 Coder 480B has soared in adoption, reaching #2 in OpenRouter’s coding model leaderboard, overtaking Gemini 2.5, DeepSeek V3, Kimi K2, and Claude 4 Opus. It’s widely praised as the first model that matches Claude 4 Sonnet – the industry’s leading coding model – in accuracy and dependability in real world software engineering tasks.

Cerebras is proud to take the world’s leading open-weight coding model and turbocharge it to 2,000 tokens per second. That means developers can generate 1,000 lines of JavaScript in just 4 seconds, versus 30 seconds on Gemini 2.5 Flash or 80 seconds on Claude 4 Sonnet. Instant code-gen returns developers to flow-state, eliminating the painful start-stop cadence of slow, GPU based code generation.

Cline – the leading coding agent for VS Code – is a great way to use Cerebras Inference. Simply select ‘Cerebras’ in the API Provider dropdown and select qwen-3-coder-480b as the model. Cline takes high level commands and is able to one-shot webapps or make sophisticated changes to existing projects.

Qwen3 480B is available today on Cerebras Inference Cloud and our partners OpenRouter and HuggingFace at $2 per million input or output token. We serve the model from our US based data-centers with 131K context, FP8 precision, and zero data retention.

To make instant AI coding widely accessible, we are launching two monthly subscription plans – Cerebras Code Pro at $50/m and Cerebras Code Max at $200/m. These plans offer equal or higher rate limits than comparable plans from Cursor and Anthropic while giving you 20x higher coding speed.

Try Qwen3 Coder