CommunityNews

Imagen: An AI system that creates photorealistic images from input text

We present Imagen, a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding. Imagen builds on the power of large transformer language models in understanding text and hinges on the strength of diffusion models in high-fidelity image generation. Our key discovery is that generic large language models (e.g. T5), pretrained on text-only corpora, are surprisingly effective at encoding text for image synthesis: increasing the size of the language model in Imagen boosts both sample fidelity and image-text alignment much more than increasing the size of the image diffusion model. Imagen achieves a new state-of-the-art FID score of 7.27 on the COCO dataset, without ever training on COCO, and human raters find Imagen samples to be on par with the COCO data itself in image-text alignment. To assess text-to-image models in greater depth, we introduce DrawBench, a comprehensive and challenging benchmark for text-to-image models. With DrawBench, we compare Imagen with recent methods including VQ-GAN+CLIP, Latent Diffusion Models, and DALL-E 2, and find that human raters prefer Imagen over other models in side-by-side comparisons, both in terms of sample quality and image-text alignment.

This thread was posted by one of our members via one of our news source trackers.

Most Liked

AstonJ

It’s amazing how far AI art has come

A photo of a raccoon wearing an astronaut helmet, looking out of the window at night.
