CommunityNews

Outcome-based Reinforcement Learning to Predict the Future

Reinforcement learning with verifiable rewards (RLVR) has boosted math and coding in large language models, yet there has been little effort to extend RLVR into messier, real-world domains like forecasting. One sticking point is that outcome-based reinforcement learning for forecasting must learn from binary, delayed, and noisy rewards, a regime where standard fine-tuning is brittle. We show that outcome-only online RL on a 14B model can match frontier-scale accuracy and surpass it in calibration and hypothetical prediction market betting by adapting two leading algorithms, Group-Relative Policy Optimisation (GRPO) and ReMax, to the forecasting setting. Our adaptations remove per-question variance scaling in GRPO, apply baseline-subtracted advantages in ReMax, hydrate training with 100k temporally consistent synthetic questions, and introduce lightweight guard-rails that penalise gibberish, non-English responses and missing rationales, enabling a single stable pass over 110k events. Scaling ReMax to 110k questions and ensembling seven predictions yields a 14B model that matches frontier baseline o1 on accuracy on our holdout set (Brier = 0.193, p = 0.23) while beating it in calibration (ECE = 0.042, p < 0.001). A simple trading rule turns this calibration edge into $127 of hypothetical profit versus $92 for o1 (p = 0.037). This demonstrates that refined RLVR methods can convert small-scale LLMs into potentially economically valuable forecasting tools, with implications for scaling this to larger models.
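The quantities named in the abstract can be made concrete with a small sketch. The code below is illustrative only and is not the authors' implementation. It assumes a forecast is a single probability for a binary event. It assumes the ReMax baseline is the reward of a greedy decode, as in the original ReMax formulation, and it assumes ten equal-width bins for ECE. All function names are hypothetical.

```python
from statistics import mean, pstdev


def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return mean((p - y) ** 2 for p, y in zip(probs, outcomes))


def expected_calibration_error(probs, outcomes, n_bins=10):
    """ECE with equal-width bins (bin count is an assumption of this sketch)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = mean(p for p, _ in b)   # mean forecast in the bin
        hit_rate = mean(y for _, y in b)   # observed frequency in the bin
        ece += len(b) / total * abs(avg_conf - hit_rate)
    return ece


def grpo_advantages(rewards, scale_by_std=False, eps=1e-8):
    """Group-relative advantages for several samples of ONE question.

    scale_by_std=True mimics the usual GRPO normalisation; False drops the
    per-question variance scaling, as the abstract describes.
    """
    mu = mean(rewards)
    centered = [r - mu for r in rewards]
    if not scale_by_std:
        return centered
    sd = pstdev(rewards)
    return [c / (sd + eps) for c in centered]


def remax_advantages(sample_rewards, greedy_reward):
    """Baseline-subtracted advantages; the baseline here is assumed to be
    the reward of a greedy decode for the same question."""
    return [r - greedy_reward for r in sample_rewards]
```

With standard-deviation scaling, a question whose sampled forecasts barely differ has its tiny reward gaps blown up to unit scale. Leaving the advantages merely mean-centred keeps their size proportional to the actual difference in outcome-based reward.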

#learning

2025-05-28 03:18:04 UTC
