CommunityNews

Reinforcement Pre-Training

In this work, we introduce Reinforcement Pre-Training (RPT) as a new scaling paradigm for large language models and reinforcement learning (RL). Specifically, we reframe next-token prediction as a reasoning task trained with RL, in which the model receives a verifiable reward for correctly predicting the next token of a given context. RPT offers a scalable way to leverage vast amounts of text data for general-purpose RL, rather than relying on domain-specific annotated answers. By incentivizing next-token reasoning, RPT significantly improves next-token prediction accuracy. Moreover, RPT provides a strong pre-trained foundation for further reinforcement fine-tuning. The scaling curves show that increased training compute consistently improves next-token prediction accuracy. The results position RPT as an effective and promising scaling paradigm for advancing language model pre-training.
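To make the "verifiable reward" idea concrete, here is a minimal sketch, not the paper's implementation. It assumes the simplest possible rule: a sampled prediction earns 1.0 if it exactly matches the token that actually follows in the corpus, and 0.0 otherwise. The function names, the exact-match rule, and the mean-baseline advantage are all illustrative assumptions; the abstract does not specify them.

```python
from typing import List, Sequence


def next_token_reward(predicted: str, ground_truth: str) -> float:
    """Assumed reward rule: 1.0 for an exact match with the corpus token, else 0.0."""
    return 1.0 if predicted == ground_truth else 0.0


def rollout_rewards(predictions: Sequence[str], ground_truth: str) -> List[float]:
    """Score several sampled predictions for the same context."""
    return [next_token_reward(p, ground_truth) for p in predictions]


def mean_centered_advantages(rewards: Sequence[float]) -> List[float]:
    """Subtract the group mean so better-than-average samples get a positive signal.

    This baseline is an illustrative choice, not something stated in the source.
    """
    baseline = sum(rewards) / len(rewards)
    return [r - baseline for r in rewards]


# Hypothetical example: context "The cat sat on the", true next token "mat".
rewards = rollout_rewards(["mat", "dog", "mat", "rug"], "mat")
advantages = mean_centered_advantages(rewards)
```

Because the reward is checked directly against the text itself, any corpus can supply it without human-annotated answers, which is the scalability point the abstract makes.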


2025-06-11 02:01:27 UTC
