What a month! Meta Llama API, DARPA, EPCC, and SUPERNOVA
- Meta launched their Llama API with Cerebras to deliver 18x faster inference for real-time voice, agents, and reasoning.
- Join us for SUPERNOVA on May 15 in SF. Live demos, LLM insights, and what’s next in AI.
- DARPA selects Cerebras to power AI at the edge—bringing high-performance inference outside the datacenter.
- EPCC Joins the Fast Lane: University of Edinburgh brings CS-3 online to accelerate frontier research in the UK.
Let's dive in 🌻
Meta x Cerebras: The World's Fastest Inference Comes to Llama API
Meta has launched the new Llama API with Cerebras as its fast inference partner, unlocking groundbreaking opportunities for a global developer audience. With the world’s most popular open-source models now running up to 18x faster than on traditional GPU solutions, developers can build a new generation of applications: real-time voice assistants, instant AI agents, sub-second reasoning, and beyond. This performance isn’t just an incremental speedup; it’s transformational.
VentureBeat wrote: “Meta is in a unique position with 3 billion users, hyper-scale datacenters, and a huge developer ecosystem… the integration of Cerebras technology helps Meta leapfrog OpenAI and Google in performance by approximately 20x.”
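Want to kick the tires? The Cerebras inference endpoint is OpenAI-compatible, so a standard chat-completions client is all it takes. Here is a minimal sketch; the base URL, the CEREBRAS_API_KEY environment variable, and the llama-3.3-70b model id are assumptions for illustration, so check the current Llama API and Cerebras docs for exact names.

```python
# Minimal sketch: streaming a chat completion from a Cerebras-hosted Llama model.
# Assumptions: the endpoint is OpenAI-compatible, lives at
# https://api.cerebras.ai/v1, and "llama-3.3-70b" is an available model id.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cerebras.ai/v1",   # assumed Cerebras endpoint
    api_key=os.environ["CEREBRAS_API_KEY"],  # assumed env var for your key
)

# Stream tokens as they arrive; at wafer-scale speeds the response feels
# instant, which is what makes real-time voice and agent loops practical.
stream = client.chat.completions.create(
    model="llama-3.3-70b",  # assumed model id
    messages=[{"role": "user", "content": "Summarize wafer-scale inference in two sentences."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```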
SUPERNOVA: Build Real-Time Agents, Reasoning, and Chat
Join us on May 15 in SF for SUPERNOVA, our flagship AI conference for building real-time agents, reasoning, and chat. Be the first to access SOTA models at the fastest speeds, only on Cerebras.
This event is for applied AI leaders, startups, and researchers. Highlights include:
- Real-world examples of high-speed LLM deployments from Meta, GSK, Mayo Clinic, Perplexity, AlphaSense, and Mistral
- How to run the latest DeepSeek, Llama, and Qwen models at 20x speed
- Cutting-edge demos showcasing ultra-fast inference for real-time AI, fast reasoning, advanced agents & more
- An exclusive keynote from Cerebras CEO Andrew Feldman on the future of AI compute
See why speed is your competitive advantage in this new AI era.
Space is limited.
Cerebras x Ranovus: Wafer-Scale Compute Meets Co-Packaged Optics for DARPA
As part of the DARPA initiative, Cerebras is partnering with Ranovus to build a breakthrough HPC system that combines wafer-scale compute and co-packaged optics.
Our Wafer-Scale Engine delivers 7,000x more memory bandwidth than GPUs. With integrated optics, we’re now tackling the communication bottleneck—unlocking real-time AI at the edge, battlefield simulations, and next-gen military and commercial robotics.
EPCC Joins the Fast Lane: Largest Cerebras Cluster in Europe Goes Live
Cerebras and the University of Edinburgh have partnered to launch the largest Cerebras cluster in Europe: a four-system CS-3 supercomputing powerhouse now live at EPCC, the University’s supercomputing centre and part of the Edinburgh International Data Facility.
This milestone represents a major leap forward for AI and supercomputing in the UK.