Now upgraded with GLM 4.6
THE FASTEST WAY TO CODE WITH AI
Stop waiting on your model. Cerebras runs GLM 4.6 — the best-in-class model for code generation — at 1,000+ tokens per second, so you can stay in flow.


Performance comparisons are based on third-party benchmarking or internal testing. Observed inference speed improvements versus GPU-based systems may vary depending on the workload, configuration, date, and models being tested.