CommunityNews

CommunityNews

Imagen: An AI system that creates photorealistic images from input text

We present Imagen, a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding. Imagen builds on the power of large transformer language models in understanding text and hinges on the strength of diffusion models in high-fidelity image generation. Our key discovery is that generic large language models (e.g. T5), pretrained on text-only corpora, are surprisingly effective at encoding text for image synthesis: increasing the size of the language model in Imagen boosts both sample fidelity and image-text alignment much more than increasing the size of the image diffusion model. Imagen achieves a new state-of-the-art FID score of 7.27 on the COCO dataset, without ever training on COCO, and human raters find Imagen samples to be on par with the COCO data itself in image-text alignment. To assess text-to-image models in greater depth, we introduce DrawBench, a comprehensive and challenging benchmark for text-to-image models. With DrawBench, we compare Imagen with recent methods including VQ-GAN+CLIP, Latent Diffusion Models, and DALL-E 2, and find that human raters prefer Imagen over other models in side-by-side comparisons, both in terms of sample quality and image-text alignment.

Read in full here:

This thread was posted by one of our members via one of our news source trackers.

Most Liked

AstonJ

AstonJ

It’s amazing how far AI art has come

A photo of a raccoon wearing an astronaut helmet, looking out of the window at night.

Where Next?

Popular Ai topics Top

First poster: bot
NVIDIA Uses AI to Slash Bandwidth on Video Calls. NVIDIA Research has invented a way to use AI to dramatically reduce video call bandwid...
New
New
First poster: CommunityNews
Google has unveiled a tool that uses artificial intelligence to help spot skin, hair and nail conditions, based on images uploaded by pat...
New
First poster: CommunityNews
Chat-bots are amazing these days! About a month ago LaMDA made the news when it apparently convinced an engineer at Google that it was se...
New
First poster: bot
Not to be outdone by Meta, Google’s AI generator can output 1280x768 HD video at 24 fps.
New
First poster: bot
Ghostwriter generates, completes, or transforms code in 16 languages, similar to GitHub Copilot.
New
First poster: bot
ChatGPT aims to produce accurate and harmless talk—but it’s a work in progress.
New
First poster: CommunityNews
OpenJourney is a Text-to-Image AI model which has the goal of bringing an open source equivalent to Midjourney to the people. It is curre...
New
New
First poster: happyrat1
With a leap in the evolution of large language models, some leading thinkers are questioning whether AI might become sentient
New

Other popular topics Top

AstonJ
If it’s a mechanical keyboard, which switches do you have? Would you recommend it? Why? What will your next keyboard be? Pics always w...
New
DevotionGeo
I know that these benchmarks might not be the exact picture of real-world scenario, but still I expect a Rust web framework performing a ...
New
siddhant3030
I’m thinking of buying a monitor that I can rotate to use as a vertical monitor? Also, I want to know if someone is using it for program...
New
AstonJ
Thanks to @foxtrottwist’s and @Tomas’s posts in this thread: Poll: Which code editor do you use? I bought Onivim! :nerd_face: https://on...
New
Exadra37
Oh just spent so much time on this to discover now that RancherOS is in end of life but Rancher is refusing to mark the Github repo as su...
New
New
Margaret
Hello everyone! This thread is to tell you about what authors from The Pragmatic Bookshelf are writing on Medium.
1147 28379 760
New
AstonJ
We’ve talked about his book briefly here but it is quickly becoming obsolete - so he’s decided to create a series of 7 podcasts, the firs...
New
DevotionGeo
I have always used antique keyboards like Cherry MX 1800 or Cherry MX 8100 and almost always have modified the switches in some way, like...
New
AstonJ
If you’re getting errors like this: psql: error: connection to server on socket “/tmp/.s.PGSQL.5432” failed: No such file or directory ...
New