
May 08 2026

Generating Beautiful UIs

With contributions from Sherif Cherfa and Halley Chang

There’s an intuitive skepticism we have toward AI-generated work. We see it clearly in writing, where the patterns have gotten familiar and punctuation (the em dash — ) has become a universal signal that AI has been used. Design has lagged behind writing, but it’s catching up. Recent models can produce better UIs, yet they still require heavy hand-holding and prompt “band-aids.” Overall, AI-generated designs often lack that feeling of deep satisfaction, joy, or whimsy that human designers create.

Basic prompts produce boring outputs

Media theorist Marshall McLuhan is often credited with the observation on the co-evolution of humans and tools: “we shape our tools, and thereafter our tools shape us.” Although AI can create superficially “beautiful” designs, they’re often shallow. When you give a model a generic prompt, you get a generic output. When your intentions lack direction, so will the result.

For example, here’s a basic prompt we used: "Hello! I want you to build me a website for my computer parts recommendation business, it should look nice, use dark mode."

What we get is boring and predictable:

  • Classic dark background
  • Overuse of cards
  • Side shadows on boxes
  • Inter as the font
  • Hero section with gradient text
  • Feature section with three columns
  • Classic footer with social icons that we never asked for

We see patterns in AI outputs much like we see patterns in writing. There are even guides, like Signs of AI Writing, that flag it. While testing generic prompts, we similarly developed a checklist of common AI-design “tells”:

  • Everything becomes a dashboard. There's a chronic “let's make everything look like an AI SaaS dashboard” tendency that every single AI model seems to have, even when the thing isn't supposed to be one. Whether you're building a landing page for a bakery or a portfolio website, if you squint hard enough, you can almost always mistake the result for a dashboard.

    The reason we constantly get SaaS dashboards specifically is mostly statistical: a lot of modern web and UI training data is heavily biased toward SaaS products, dashboards, component libraries, and startup landing pages.
  • Cards inside cards. One of the most reliable, obnoxious tells. The AI wraps content in a card, then realizes it needs to organize content inside that card, so it adds more cards. The result is nested containers all the way down.

    See the example prompt: ‘Build me an email client for children’
  • Over-coding and unwanted refactors. Perhaps one of the most frustrating things is when your design is 99% of the way there and you need a small tweak. Ask for a simple button change and the AI refactors the entire component tree. Ask for a simple API call and it rewrites unrelated error handling or adds types you didn't ask for.

    This is one area where fast, scoped models like Codex-Spark (1,200 tokens/second on Cerebras) shine: they tend to do the specific thing and stop. The rest of the time, you'll need to be explicit about the scope of the change you want.
  • Instruction-to-UI leakage. One of the more embarrassing failure modes: implementation instructions end up as visible product copy. "Add a placeholder for the user's name here" shows up as literal placeholder text on the live page.
  • It builds but looks wrong. Technically, a lot of AI-generated UI works. It compiles and runs, but something is off: misaligned spacing, bad mobile behavior, or incorrect components. If you ask it to duplicate a site you point it to, the output often looks like a bad screenshot from memory rather than an actual reconstruction. There's still a long way to go before AI replaces the front-end designer, but a few really exciting advancements have made beautiful AI-generated UI possible.
  • No composition sense. The AI doesn't think about how elements relate to each other. It thinks about elements individually. The result is a pile of correct components with no compositional logic holding them together.

Despite these issues, meaningful advancements make AI-generated UI feel less like a gimmick and more like a viable starting point.

What is different now?

Faster generation means faster iteration. When the round trip from prompt to result takes 3 minutes, you commit too early and correct too late. Now, with systems like Codex-Spark running at ~1,200 tokens/sec on Cerebras, a round trip takes closer to 30 seconds and the loop tightens. You can probe, discard, and refine in quick succession.

Vision models let you show, not just describe. Now your model can actually see what it's building and can use that updated information to self-iterate and better understand its progress.

Better tooling around design systems. Tools like shadcn/ui, Figma MCP integrations, and standardized design tokens give models constraints to operate within. There's now actual infrastructure for giving AI something to work with rather than making it invent a design from scratch.

Setting our intentions right from the start

Whether you’re in a creative role or not, intention-setting matters when using AI tools to generate visuals. The most meaningful AI-assisted designs use AI to fill the gaps in your skills and shore up your weaknesses so you can develop the projects you want to create.

We learned that rather than treating AI as a replacement for a designer or for the entire design workflow (which produces regurgitated, soulless designs), AI’s full potential pops off when it’s used as an intermediary tool that elevates, reinterprets, and broadens the creative perspective. After all, design is, at the end of the day, another visual form of problem-solving.

Here are some of the ways we’ve used AI to solve common design problems:

  • Generating design mockups using software like Flora.ai: Mockups are very important for environmentally situating designs so they look and feel realistic. Historically, creating a mockup involved downloading a Photoshop file from websites that specialize in mockup files (such as Mockup World, Bendito Mockups, Mr. Mockup, etc.) and pasting your design onto the product. However, this sometimes led to the same generic mockup files being used by MANY designers (especially the high-quality mockups that don’t look tacky). There’s also the option of taking your own photos of your designed products or creating your own mockups, but that comes with the caveat of needing photography equipment, photography skills, and time to set everything up, which sometimes we just don’t have! Now, with websites like Flora.ai that generate mockups, you can be hyperspecific with prompting on how you want your design to be environmentally surrounded and get incredibly unique designs.

The 8 Methods (and 3 Categories)

A few years ago we were delighted to code a Snake game or use AI to clone a design. Now, with the increasing complexity of what AI models are able to achieve, the possibilities for AI-generated designs feel endless. Here are our 3 buckets (and 8 practical ways) to generate a beautiful UI.

1. Use shadcn/ui (and connect the MCP)

Run a shadcn preset first.

This sets up the component library with actual source code in the project. Then, the AI can read those components before writing anything, instead of inventing its own.

Most component libraries are a black box. You import them and the AI guesses at the API from training data, which might be stale. shadcn/ui is different: you own the source code, the components live in your project, and the AI can read them before writing anything.
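
As a rough illustration of why that matters (the import path and variant name below follow shadcn/ui's usual conventions, but treat them as assumptions for your own project):

```tsx
// The Button here is a file in *your* repo (components/ui/button.tsx created by
// the shadcn preset), so the model can open it and see the real variants
// instead of guessing a published package's API from stale training data.
import { Button } from "@/components/ui/button";

export function DeleteAccountButton() {
  // "destructive" works because button.tsx defines it; the model can verify
  // that by reading the file rather than hallucinating a prop.
  return <Button variant="destructive">Delete account</Button>;
}
```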

The MCP integration makes this even better. With the shadcn MCP server connected, the model queries the live component registry during your conversation. Ask "add sorting to the data table" and it pulls current docs instead of hallucinating props from six months ago.

Teams using this report components that compile on the first attempt. If you've ever watched an AI spend four iterations fighting its own import paths, you'll care about that.

2. Decide the design before touching code

The worst time to make design decisions is mid-generation. The AI will make them for you, and it'll pick whatever is most common in its training data: Inter, sky-blue buttons, rounded cards.

Before you prompt anything, decide: exact hex codes, font stack, spacing scale, general layout logic. Then encode it somewhere the AI can read: a Tailwind config, a design tokens file, or CSS custom properties. designtoken.md has a good format for this.

In terms of naming, it’s also helpful to use intent-based names, not primitive ones. --color-primary beats --color-blue because the model knows what context it belongs in. --color-error stops it from accidentally using your success green on an error state.
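
Here's a minimal sketch of what that could look like as a TypeScript tokens file; the names and values are illustrative placeholders, not a recommended palette:

```ts
// tokens.ts: design decisions made before any prompting, in a file the AI can read.
export const tokens = {
  color: {
    primary: "#4f46e5",    // intent-based name, not "blue"
    error: "#dc2626",      // keeps the success green off error states
    success: "#16a34a",
    surface: "#0f1115",
    onSurface: "#e7e9ee",
  },
  font: {
    heading: "'Fraunces', serif",
    body: "'Inter Tight', sans-serif",
  },
  spacing: { xs: "4px", sm: "8px", md: "16px", lg: "24px", xl: "40px" },
  radius: { sm: "4px", md: "8px" },
} as const;
```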

Figma's MCP goes further. If you've been designing in Figma with defined components, the AI reads those directly, resulting in less guessing.

Create multiple mockups with GPT-Image-2 and provide them as a hint to Codex for higher-quality output. This workflow is extremely effective at preventing wasted tokens and unsatisfactory slop apps; highly recommend!

3. Make Tailwind say no

Tailwind's config file is an enforcement mechanism. If you only define the colors you want, arbitrary hex values get caught by linting. The AI has to use your system, not invent a parallel one. Semantic naming matters:

When the color is called destructive, the AI won't reach for it on a success button. Context encoded in the name. Add a lint rule that forbids arbitrary Tailwind values (text-[17px], bg-[#4a9f8c]) and you've made it structurally hard to go off-script.
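
A minimal sketch of what "saying no" can look like in a tailwind.config.ts; the color values are placeholders and the lint tooling is left to whatever your project already uses:

```ts
// tailwind.config.ts: define `colors` (not `extend.colors`) so the default
// palette disappears and only these semantic tokens exist. Pair this with a
// lint rule that rejects arbitrary values like text-[17px] or bg-[#4a9f8c].
import type { Config } from "tailwindcss";

export default {
  content: ["./src/**/*.{ts,tsx}"],
  theme: {
    colors: {
      primary: "#4f46e5",
      destructive: "#dc2626",  // named by intent, so it never lands on a success button
      success: "#16a34a",
      surface: "#0f1115",
      foreground: "#e7e9ee",
    },
  },
  plugins: [],
} satisfies Config;
```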

4. Iterate with small changes

"I want it to be dark mode please... and less card/boxes" is a reasonable ask, but it's too vague. The AI removed two cards and called it done.

The better workflow: one small change, open the site in the browser, look at it, say what's specifically off.

"The button needs 4px more padding." "This heading is too light." "The gap between these two sections feels cramped." These are fixable instructions. "Make it look better" is not.

GPT-5.4's vision capabilities make this loop tighter. Take a screenshot, hand it to the model, describe what's wrong visually, get a fix. The render-screenshot-describe-refine loop gets you somewhere faster than any amount of upfront spec writing.

5. Interview me first

One of the best ways I’ve found to build a style guide is to stop trying to write the whole thing myself. I usually start by giving the model a few anchors: colors I like, screenshots that feel close, maybe a couple of components or websites that have the right texture. But instead of immediately asking it to turn that into a design system, I ask it to interview me first. Especially with fast models, this type of back-and-forth iteration and conversation can be done in seconds. It can also be helpful to leverage the grill-me skill.

This works well because the model is often better at noticing what’s missing than I am. It will ask things like:

  1. Who is your target audience?
  2. What is the goal of this page?

Thomas Wiegold’s writeup on the frontend-design plugin makes a similar point when it comes to comprehensive style guides: instead of relying on magic tooling, it’s better to force the model to declare purpose, tone, constraints, and differentiation before producing CSS. He also notes that rigid pixel-level specs can make the output technically obedient but visually dead.

6. Generate options, pick one, go deep

One of the most ‘free’ ways to get better design is to ask for several directions before committing to one. Generating five rough variants takes minutes, especially with faster models like Codex-Spark. Then cherry-pick the design you like best and continue iterating on that version.

For example, you can prompt for distinct approaches with different vibes, layouts, or color palettes. Pick the closest one. Then: "Keep the color palette and typography from that, but try an asymmetric layout." OpenAI calls this out explicitly in their frontend generation guide: generate a mood board first, then commit.

7. Run a build-vision-edit loop

GPT-5.5 can take a screenshot of your rendered UI to double-check the results. Or you can point at the problem instead of describing it in words: instead of "The padding on this button looks wrong," you can feed it a screenshot with a circle around the button.
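
One way to wire the capture half of that loop, sketched here with Playwright (an assumption; any headless browser or your framework's own screenshot tooling works just as well):

```ts
// screenshot.ts: render the local dev server and save a PNG the model can review.
import { chromium } from "playwright";

async function capture(url: string, out: string) {
  const browser = await chromium.launch();
  const page = await browser.newPage({ viewport: { width: 1280, height: 800 } });
  await page.goto(url, { waitUntil: "networkidle" });
  await page.screenshot({ path: out, fullPage: true });
  await browser.close();
}

capture("http://localhost:3000", "ui-review.png").catch(console.error);
```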

8. Use real ugly data, not pretty placeholder data

Most AI-generated UI looks better than it is because the data is fake. The names are short, the prices line up, and the descriptions are all the same length. The empty states never happen and the error states are decorative.

Before asking for polish, give the model hostile data. For example:
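
A sketch of what that might look like; the records below are invented for illustration, not taken from a real catalog:

```ts
// hostileData.ts: seed data designed to break the layout before you ask for polish.
export interface Product {
  name: string;
  price: number | null;   // null = unknown price, a state the UI has to handle
  description: string;
}

export const hostileProducts: Product[] = [
  { name: "GPU", price: 0, description: "" },   // empty description, free item
  {
    name: "Noctua NH-D15 chromax.black dual-tower CPU cooler (LGA1700 / AM5 mounting kit included)",
    price: 119.95,
    description: "A name long enough to wrap twice on mobile.",
  },
  { name: "RAM", price: 1299.999, description: "Too many decimal places." },
  {
    name: "câble-usb-c-très-long-sans-espaces-qui-ne-casse-jamais",
    price: null,
    description: "An unbreakable token plus a missing price.",
  },
];

export const emptyState: Product[] = [];        // the state demos never show
```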

This is especially important for AI-generated design because models are very good at making a static screenshot look plausible. They are much worse at designing the weird middle states where real products live.

TLDR

Left alone, the model will default to the most statistically common version of “nice UI”: dark background, rounded cards, gradient headline, three-column feature grid, Inter, footer.

The best workflows treat the model like a fast, overconfident junior designer you art-direct: choose intent, notice when it’s wrong, and push toward deliberate delight and weirdness. Models are faster, more visual, and better at working inside existing systems. That makes the ceiling much higher than it was a few years ago. But humans still choose what matters most: the feeling and direction. :)

Try Codex-Spark for yourself and share your creations with us on X.

Thank you to Jen Boscacci, Swyx, and Joyce Er for their feedback. Designs by Halley Chang.

Performance comparisons are based on third-party benchmarking or internal testing. Observed inference speed improvements versus GPU-based systems may vary depending on workload, configuration, date and models being tested.
