The world’s fastest GLM-4.6 is now available on Cerebras at 1,000 tokens per second (TPS).

Pricing

Inference API access

Free

The easiest way to get started with Cerebras.

  • Access to all Cerebras-powered models
  • The world’s fastest inference – 20x faster than OpenAI and Anthropic
  • Community support via Discord

Developer

Generous rate limits for power users.

  • Everything in Free
  • Self-serve payment starting at just $10
  • 10x higher rate limits than the Free tier
  • Higher priority processing

Enterprise

Highest throughput, custom weights, and guaranteed uptime.

  • Everything in Free and Developer
  • Highest rate limits for production workloads
  • Lowest latency with dedicated queue priority
  • Support for custom model weights
  • Model fine-tuning and training services
  • Dedicated support team with response time guarantees

Cerebras Code

Cerebras Code Pro
$50/month

Experience instant code completions with frontier models

  • Top open-source model access with fast, high-context completions.
  • Send up to 24 million tokens/day (worth $48/day).
  • Ideal for indie devs, simple agentic workflows, and weekend projects.

Cerebras Code Max
$200/month

Built for teams running demanding workloads at scale

  • Top open-source model for heavy coding workflows.
  • Send up to 120 million tokens/day (worth $240/day).
  • Ideal for full-time development, IDE integrations, code refactoring, and multi-agent systems.
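
The daily token allowances and dollar values quoted for the two Cerebras Code plans imply a per-token rate that the page does not state directly. The following is a minimal sketch of that arithmetic, using only the figures quoted above. The dictionary keys and function names are illustrative, and the "full days of use" figure assumes the entire daily allowance is consumed.

```python
# Figures copied from the plan descriptions; key names are illustrative only.
PLANS = {
    "pro": {"monthly_price_usd": 50, "tokens_per_day_millions": 24, "daily_value_usd": 48},
    "max": {"monthly_price_usd": 200, "tokens_per_day_millions": 120, "daily_value_usd": 240},
}


def implied_rate_per_million(plan: dict) -> float:
    """Dollar value the page assigns to one million tokens."""
    return plan["daily_value_usd"] / plan["tokens_per_day_millions"]


def full_days_to_cover_price(plan: dict) -> float:
    """Days of full daily usage whose stated value equals the monthly price."""
    return plan["monthly_price_usd"] / plan["daily_value_usd"]


if __name__ == "__main__":
    for name, plan in PLANS.items():
        rate = implied_rate_per_million(plan)
        days = full_days_to_cover_price(plan)
        print(f"{name}: ${rate:.2f} per million tokens, "
              f"{days:.2f} full days of use to match the monthly price")
```

Both plans work out to the same implied value of $2 per million tokens, so the difference between them is the size of the daily allowance rather than the rate.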

Developer tier pricing

* Preview models are intended for evaluation only, not for production use, and may be discontinued at short notice.

** These models are scheduled for deprecation as part of ongoing efforts to serve the most up-to-date models.

Partners

Get access to Cerebras Inference through our partner APIs.