What a month! Meta Llama API, DARPA, EPCC, and SUPERNOVA
- Meta launched their Llama API with Cerebras to deliver 18x faster inference for real-time voice, agents, and reasoning.
- Join us for SUPERNOVA on May 15 in SF. Live demos, LLM insights, and what’s next in AI.
- DARPA selects Cerebras to power AI at the edge—bringing high-performance inference outside the datacenter.
- EPCC Joins the Fast Lane: University of Edinburgh brings CS-3 online to accelerate frontier research in the UK.
Let's dive in 🌻
Meta x Cerebras: The World's Fastest Inference Comes to Llama API
Meta has launched the new Llama API with Cerebras as its fast inference partner, unlocking groundbreaking opportunities for a global developer audience. With the world’s most popular open-source models now running up to 18x faster than on traditional GPU solutions, developers can build a new generation of applications: real-time voice assistants, instant AI agents, sub-second reasoning, and beyond. This performance isn’t just an incremental speedup; it’s transformational.
VentureBeat wrote: “Meta is in a unique position with 3 billion users, hyper-scale datacenters, and a huge developer ecosystem… the integration of Cerebras technology helps Meta leapfrog OpenAI and Google in performance by approximately 20x.”
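Want to kick the tires? The Cerebras inference endpoint is OpenAI-compatible, so a standard chat-completions client is all it takes. Here is a minimal sketch; the base URL, the CEREBRAS_API_KEY environment variable, and the llama-3.3-70b model id are assumptions for illustration, so check the current Llama API and Cerebras docs for exact names.

```python
# Minimal sketch: streaming a chat completion from a Cerebras-hosted Llama model.
# Assumptions: the endpoint is OpenAI-compatible, lives at
# https://api.cerebras.ai/v1, and "llama-3.3-70b" is an available model id.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cerebras.ai/v1",   # assumed Cerebras endpoint
    api_key=os.environ["CEREBRAS_API_KEY"],  # assumed env var for your key
)

# Stream tokens as they arrive; at wafer-scale speeds the response feels
# instant, which is what makes real-time voice and agent loops practical.
stream = client.chat.completions.create(
    model="llama-3.3-70b",  # assumed model id
    messages=[{"role": "user", "content": "Summarize wafer-scale inference in two sentences."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```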
SUPERNOVA: Build Real-Time Agents, Reasoning, and Chat
Join us on May 15 in SF for SUPERNOVA, our flagship AI conference for building real-time agents, reasoning, and chat. Be the first to access SOTA models at the fastest speeds, only on Cerebras.
This event is for applied AI leaders, startups, and researchers. Highlights include:
- Real-world examples of high-speed LLM deployments from Meta, GSK, Mayo Clinic, Perplexity, AlphaSense, and Mistral
- How to run the latest DeepSeek, Llama, and Qwen models at 20x speed
- Cutting-edge demos showcasing ultra-fast inference for real-time AI, fast reasoning, advanced agents & more
- An exclusive keynote from Cerebras CEO Andrew Feldman on the future of AI compute
See why speed is your competitive advantage in this new AI era.
Space is limited.
Cerebras x Ranovus: Wafer-Scale Compute Meets Co-Packaged Optics for DARPA
As part of the DARPA initiative, Cerebras is partnering with Ranovus to build a breakthrough HPC system that combines wafer-scale compute and co-packaged optics.
Our Wafer-Scale Engine delivers 7,000x more memory bandwidth than GPUs. With integrated optics, we’re now tackling the communication bottleneck—unlocking real-time AI at the edge, battlefield simulations, and next-gen military and commercial robotics.
EPCC Joins the Fast Lane: Largest Cerebras Cluster in Europe Goes Live
Cerebras and the University of Edinburgh have partnered to launch the largest Cerebras cluster in Europe: a four-system CS-3 supercomputing powerhouse now live at EPCC, the University’s supercomputing centre and part of the Edinburgh International Data Facility.
This milestone represents a major leap forward for AI and supercomputing in the UK.