CommunityNews

The Curse of Recursion: Training on Generated Data Makes Models Forget

Stable Diffusion revolutionised image creation from descriptive text. GPT-2,
GPT-3(.5) and GPT-4 demonstrated astonishing performance across a variety of
language tasks. ChatGPT introduced such language models to the general public.
It is now clear that large language models (LLMs) are here to stay, and will
bring about drastic change in the whole ecosystem of online text and images. In
this paper we consider what the future might hold. What will happen to GPT-{n}
once LLMs contribute much of the language found online? We find that use of
model-generated content in training causes irreversible defects in the
resulting models, where tails of the original content distribution disappear.
We refer to this effect as Model Collapse and show that it can occur in
Variational Autoencoders, Gaussian Mixture Models and LLMs. We build
theoretical intuition behind the phenomenon and portray its ubiquity amongst
all learned generative models. We demonstrate that it has to be taken seriously
if we are to sustain the benefits of training from large-scale data scraped
from the web. Indeed, data collected about genuine human
interactions with systems will be increasingly valuable in the presence of
content generated by LLMs in data crawled from the Internet.
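
To make the "tails disappear" claim concrete, here is a toy sketch of recursive training. It is not the authors' experimental setup: it assumes the simplest possible generative model, a single Gaussian fitted by maximum likelihood, and the function name `simulate_collapse` and its parameters are made up for illustration. Each generation is fitted only to a small sample drawn from the previous generation's model, so estimation error compounds and the fitted spread tends to shrink.

```python
import random
import statistics


def simulate_collapse(generations=200, sample_size=10, seed=0):
    """Repeatedly fit a Gaussian to samples drawn from the previous fit.

    Returns the fitted standard deviations, one per generation, starting
    with the true value 1.0 for the original (hypothetical) data source.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0: the original distribution
    history = [sigma]
    for _ in range(generations):
        # Draw a finite training set from the current model ...
        data = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        # ... and fit the next model to it by maximum likelihood.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)  # population std = MLE estimate
        history.append(sigma)
    return history


if __name__ == "__main__":
    stds = simulate_collapse()
    for gen in (0, 10, 50, 100, 200):
        print(f"generation {gen:3d}: fitted std = {stds[gen]:.3g}")
```

With a small `sample_size`, the fitted standard deviation drifts towards zero over the generations, which is the one-dimensional analogue of the distribution's tails vanishing. Mixing some of the original data back in at each step would be a natural variation to try, though this sketch does not model that.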

Read in full here:

This thread was posted by one of our members via one of our news source trackers.

#recursion #training

2023-06-14 14:12:18 UTC
