ManningBooks

ManningBooks

Devtalk Sponsor

The RLHF Book (Manning)

After ChatGPT used RLHF to become production-ready, this foundational technique exploded in popularity. In The RLHF Book, AI expert Nathan Lambert gives a true industry insider’s perspective on modern RLHF training pipelines, and their trade-offs. Using hands-on experiments and mini-implementations, Nathan clearly and concisely introduces the alignment techniques that can transform a generic base model into a human-friendly tool.

Nathan Lambert

If you’ve been following RLHF over the last couple of years — from “how does this even work?” to “why is every model suddenly using it?” — this book does a great job of cutting through the noise. Nathan mixes the math and engineering with the bigger questions around alignment, and he does it in a way that doesn’t feel hand-wavy or mystical. It’s practical, grounded, and surprisingly candid about what actually happens inside modern training pipelines.

Here’s the kind of ground the book covers: how human preference data is collected (and how messy that can get), how policy-gradient methods in RLHF really work, where approaches like DPO and direct alignment fit in, and how RLHF evolved toward things like verifiable rewards. Nathan also shares a bunch of behind-the-scenes stories from building open models like Llama-Instruct, Zephyr, Olmo, and Tülu — the kind of details you don’t usually get unless you’re in the room when the training scripts are being rewritten at 2 a.m.

The book also takes time with the things people often gloss over: evaluation, alignment trade-offs, instruction tuning recipes, and all the practical tricks used in industry to make models feel more human, less brittle, and more predictable. It’s the first time I’ve seen all of this explained cleanly in one place.

If you’re working with LLMs — or planning to — and want a deeper understanding of what actually happens after base model pretraining, this one is worth a look.


Don’t forget you can get 45% off with your Devtalk discount! Just use the coupon code “devtalk.com” at checkout :+1:

Where Next?

Popular Ai topics Top

New
ManningBooks
Build an AI Agent (From Scratch) is a step-by-step guide to creating a working AI agent, starting with the bare essentials and growing yo...
New
ManningBooks
Grokking AI Algorithms, Second Edition introduces the most important AI algorithms using relatable illustrations, interesting examples, a...
New
ManningBooks
The bestselling book on Python deep learning, now covering generative AI, Keras 3, PyTorch, and JAX! François Chollet and Matthew ...
New
pragdave
Build robust LLM-powered apps, chatbots, and agents while mastering AI engineering principles that will help you outlast the tools and th...
New
ManningBooks
Erlang and OTP in Action teaches you the concepts of concurrent programming and the use of Erlang’s message-passing model. It walks you t...
New
ManningBooks
In Build a DeepSeek Model (From Scratch) you’ll build your own DeepSeek clone from the ground up. First, you’ll quickly review LLM fundam...
New
ManningBooks
AI Governance: Secure, privacy-preserving, ethical systems presents a structured playbook for safely harnessing the potential of Generati...
New
ManningBooks
AI agent technology is changing fast! This totally revised Second Edition of AI Agents in Action by Micheal Lanham guides you through the...
New
pragdave
Build a prototype in a weekend or a full product in a month or two. Untangle legacy systems, improve tests and documentation, and tackle ...
New

Other popular topics Top

AstonJ
A thread that every forum needs! Simply post a link to a track on YouTube (or SoundCloud or Vimeo amongst others!) on a separate line an...
New
New
ohm
Which, if any, games do you play? On what platform? I just bought (and completed) Minecraft Dungeons for my Nintendo Switch. Other than ...
New
dimitarvp
Small essay with thoughts on macOS vs. Linux: I know @Exadra37 is just waiting around the corner to scream at me “I TOLD YOU SO!!!” but I...
New
PragmaticBookshelf
Build highly interactive applications without ever leaving Elixir, the way the experts do. Let LiveView take care of performance, scalabi...
New
AstonJ
Continuing the discussion from Thinking about learning Crystal, let’s discuss - I was wondering which languages don’t GC - maybe we can c...
New
Margaret
Hello everyone! This thread is to tell you about what authors from The Pragmatic Bookshelf are writing on Medium.
1147 29994 760
New
PragmaticBookshelf
Build efficient applications that exploit the unique benefits of a pure functional language, learning from an engineer who uses Haskell t...
New
AstonJ
If you’re getting errors like this: psql: error: connection to server on socket “/tmp/.s.PGSQL.5432” failed: No such file or directory ...
New
PragmaticBookshelf
Explore the power of Ash Framework by modeling and building the domain for a real-world web application. Rebecca Le @sevenseacat and ...
New