ManningBooks

Devtalk Sponsor

Evaluation and Alignment, The Seminal Papers (Manning)

Evaluation and Alignment: The Seminal Papers teaches you to think of evaluation as a design constraint. You’ll employ a “working backwards” methodology that begins with what your system must get right, which directs you to the appropriate evaluation approach. As you internalize the define > evaluate > analyze > align cycle, you’ll start making more informed tradeoffs and expertly balancing helpfulness, safety, and brand voice in your models.

Hanchung Lee

Evaluation and Alignment: The Seminal Papers brings together a set of influential research papers and connects them to day-to-day engineering work. The focus isn’t just on metrics in isolation, but on how evaluation shapes the system you end up building.

The book traces how evaluation has evolved. It starts with straightforward approaches like text matching, moves through semantic similarity, and reaches more recent methods where models are used to judge other models. Seeing that progression helps explain why certain techniques break down and where newer ones fit.

One idea that runs throughout the book is treating evaluation as a design constraint. Instead of measuring quality after the fact, you begin by defining what the system must get right. That choice influences everything else—what data you collect, which metrics you use, and how you interpret results.

There’s also a strong emphasis on closing the loop. Evaluation feeds analysis, which leads to changes in prompts, data, or architecture. Those changes get tested again. Over time, this cycle becomes part of how you build and maintain AI systems, not something you bolt on at the end.

Some of the topics covered along the way:

choosing evaluation methods that match the behavior you care about
spotting failure modes that simple metrics tend to miss
working with subjective qualities like helpfulness, safety, and tone
using evaluation results to guide alignment decisions

If you’ve worked on LLM-based systems, you’ve probably run into the gap between a model that “looks good in a demo” and one that holds up in production. This book is aimed squarely at that gap.

Full details: Evaluation and Alignment, The Seminal Papers - Hanchung Lee

Don’t forget you can get 45% off with your Devtalk discount! Just use the coupon code “devtalk.com” at checkout.


#manning #published-book #mlops #llm-evaluation #responsible-ai #generative-ai #rlhf #rag-evaluation #ai-alignement #llmasjudge #ai-engineering #ai-safety


2026-03-18 13:19:35 UTC

