kammy

Benchmarking AI Design (+ the quirks of AI-generated UI right now)

Hi everyone!

The other day I was debating with my friends whether the top LLMs are “good at design.” I’d love to hear other people’s thoughts and experiences, including which models (if any) you use to improve your interfaces when you don’t have access to a designer.

I’m also curious whether anyone has noticed interesting design trends across models or within specific models. For example, I can’t help but notice that DeepSeek and Anthropic models love purple gradient titles and backgrounds, while OpenAI models have a serious habit of putting white text on white backgrounds or otherwise picking unreadable color combinations.
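For anyone who wants to catch the "white text on white background" problem automatically rather than by eye, here is a minimal sketch of a contrast check. It assumes colors arrive as six-digit hex strings and uses the WCAG 2.x relative luminance and contrast ratio formulas; the function names and the 4.5 threshold (the WCAG AA value for normal-size text) are choices made for this example, not anything the models or the site above are known to use.

```python
def _linearize(channel_8bit: int) -> float:
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color given as '#RRGGBB' or 'RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio, from 1.0 (identical) up to 21.0 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def is_readable(foreground: str, background: str, threshold: float = 4.5) -> bool:
    """True if the pair meets the chosen contrast threshold (assumed: WCAG AA)."""
    return contrast_ratio(foreground, background) >= threshold
```

Running `is_readable("#FFFFFF", "#FFFFFF")` on generated output would flag the white-on-white case, since identical colors always have a ratio of exactly 1.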

I also built a website where people can compare different models on the same design prompts, and the votes feed a crowdsourced leaderboard. Would love to hear thoughts and reactions!
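The post does not say how the leaderboard turns votes into a ranking. One common way to rank from head-to-head comparisons is an Elo-style rating, sketched below purely as an illustration. The starting rating of 1000, the K-factor of 32, and the model names are all assumptions for this example and say nothing about how the actual site works.

```python
from collections import defaultdict


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


def record_vote(ratings, winner: str, loser: str, k: float = 32.0) -> None:
    """Shift rating points from loser to winner; upsets move more points."""
    delta = k * (1 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


def leaderboard(votes, start: float = 1000.0):
    """Build a ranking from (winner, loser) pairs, highest rating first."""
    ratings = defaultdict(lambda: start)  # unseen models start at `start`
    for winner, loser in votes:
        record_vote(ratings, winner, loser)
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


# Hypothetical votes: each tuple is (preferred design, other design).
votes = [("model_a", "model_b"), ("model_a", "model_c"), ("model_b", "model_c")]
ranking = leaderboard(votes)
```

Elo depends on the order in which votes arrive, so a site with many votes might prefer fitting all comparisons at once (for example with a Bradley-Terry model), but the sketch shows the basic idea of turning pairwise preferences into a ranked list.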
