CommunityNews

CommunityNews

Diffusion Training from Scratch on a Micro-Budget

Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget.
As scaling laws in generative AI push performance, they also simultaneously concentrate the development of these models among actors with large computational resources. With a focus on text-to-image (T2I) generative models, we aim to address this bottleneck by demonstrating very low-cost training of large-scale T2I diffusion transformer models. As the computational cost of transformers increases with the number of patches in each image, we propose to randomly mask up to 75% of the image patches during training. We propose a deferred masking strategy that preprocesses all patches using a patch-mixer before masking, thus significantly reducing the performance degradation with masking, making it superior to model downscaling in reducing computational cost. We also incorporate the latest improvements in transformer architecture, such as the use of mixture-of-experts layers, to improve performance and further identify the critical benefit of using synthetic images in micro-budget training. Finally, using only 37M publicly available real and synthetic images, we train a 1.16 billion parameter sparse transformer with only $1,890 economical cost and achieve a 12.7 FID in zero-shot generation on the COCO dataset. Notably, our model achieves competitive FID and high-quality generations while incurring 118$\times$ lower cost than stable diffusion models and 14$\times$ lower cost than the current state-of-the-art approach that costs $28,400. We aim to release our end-to-end training pipeline to further democratize the training of large-scale diffusion models on micro-budgets.

Read in full here:

This thread was posted by one of our members via one of our news source trackers.

Where Next?

Popular General Dev topics Top

First poster: bot
FUZIX FUZIX is a fusion of various elements from the assorted UZI forks and branches beaten together into some kind of semi-coherent pla...
New
First poster: bot
Hush Keyboards with Hushboard. Yesterday while surfing the ASCII highways of IRC (yes, IRC) a URL linking to a MacOS application scrolle...
New
First poster: AstonJ
https://permission.site/ This thread was posted by one of our members via one of our news source trackers.
New
First poster: mafinar
F# Is The Best Coding Language Today. If you want to personally pick up a programming language in order to become a better coder in what...
New
First poster: mindriot
LG 28-inch 16:18 DualUp Monitor with Ergo Stand and USB Type-C™ (28MQ780-B) | LG USA. Shop LG 28MQ780-B on the official LG.com website ...
New
First poster: joeb
50 Shades of Go: Traps, Gotchas, and Common Mistakes for New Golang Devs. Go is a simple and fun language, but, like any other language,...
/go
New
First poster: KnowledgeIsPower
Building a Slack/Discord alternative with Tauri/Rust linen <span class="hashtag-icon-placeholder"></span>blog. Introduction My name is K...
New
CommunityNews
9 fintech engineering mistakes. Read this list unless you want to build a money dissappearing system
New
First poster: dyowee
Software engineering job openings hit five-year low?. There are 35% fewer software developer job listings on Indeed today, than five yea...
New
First poster: AstonJ
On the benefits of learning in public. Learning in public helps me grow as an engineer and seems to benefit others too. Here’s why I sho...
New

Other popular topics Top

AstonJ
There’s a whole world of custom keycaps out there that I didn’t know existed! Check out all of our Keycaps threads here: https://forum....
New
AstonJ
I ended up cancelling my Moonlander order as I think it’s just going to be a bit too bulky for me. I think the Planck and the Preonic (o...
New
DevotionGeo
The V Programming Language Simple language for building maintainable programs V is already mentioned couple of times in the forum, but I...
New
New
Margaret
Hello everyone! This thread is to tell you about what authors from The Pragmatic Bookshelf are writing on Medium.
1147 28379 760
New
AstonJ
Saw this on TikTok of all places! :lol: Anyone heard of them before? Lite:
New
foxtrottwist
A few weeks ago I started using Warp a terminal written in rust. Though in it’s current state of development there are a few caveats (tab...
New
First poster: joeb
The File System Access API with Origin Private File System. WebKit supports new API that makes it possible for web apps to create, open,...
New
PragmaticBookshelf
Programming Ruby is the most complete book on Ruby, covering both the language itself and the standard library as well as commonly used t...
New
PragmaticBookshelf
Explore the power of Ash Framework by modeling and building the domain for a real-world web application. Rebecca Le @sevenseacat and ...
New