CommunityNews

Something weird is happening with LLMs and chess

Something weird is happening with LLMs and chess.
Are they good or bad?

Read in full here:

This thread was posted by one of our members via one of our news source trackers.

4 comments

#chess


2024-11-17 21:21:49 UTC

Most Liked

jmagnani

Been playing around with LLMs, but it feels like writing the right prompt is a trial-and-error thing.

Post #2

Eiji

Doing something with a tool that doesn’t support it doesn’t make sense, and therefore it’s not worth analysing the results.

1. e4 e6 2. d3 c5 3. Nf3 Nc6 4. g3 Nf6 5.

With such input data it’s pointless to show the first 4 moves on the graph. Saying

Wow, recent LLMs can sort of play chess! They fall apart after the early game (…)

is like saying:

While the input data was good, the results were bad

since, as we can see, the LLMs were losing around those 4 initial moves.

Since OpenAI is lame and doesn’t support full grammars, for the closed (OpenAI) models I tried generating up to 10 times and if it still couldn’t come up with a legal move, I just chose one randomly.

so …

Because the runner was crippled, his time was randomly chosen from a pool of possible times.

And? How does this add anything to the discussion if the author has generated part of, or possibly even the whole, output?
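For reference, the fallback described in the quoted passage (retry up to 10 times, then pick a random legal move) could look roughly like the sketch below. This is not the article author's code: `choose_move`, `propose`, and the `was_random` flag are hypothetical names, and the list of legal moves is assumed to come from somewhere else.

```python
import random
from typing import Callable, Optional, Sequence, Tuple


def choose_move(
    propose: Callable[[], str],
    legal_moves: Sequence[str],
    max_tries: int = 10,
    rng: Optional[random.Random] = None,
) -> Tuple[str, bool]:
    """Return (move, was_random).

    `propose` stands in for one model call that returns a move string.
    It is asked up to `max_tries` times; if no proposal is legal, a
    uniformly random legal move is returned instead.
    """
    if not legal_moves:
        raise ValueError("no legal moves available")
    rng = rng or random.Random()
    legal = set(legal_moves)
    for _ in range(max_tries):
        candidate = propose().strip()
        if candidate in legal:
            return candidate, False  # the model produced a legal move
    # Every attempt failed, so this move is not the model's at all.
    return rng.choice(list(legal_moves)), True
```

Recording the `was_random` flag for each move would at least make it possible to report what fraction of a game was actually played by the model.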

You are a chess grandmaster.
(…)
1. e4 e6 2. d3 c5 3. Nf3 Nc6 4. g3 Nf6 5.

It’s just limiting the number of possibilities … The linked article did not mention how the games were rated by the chess engine. There was no information on how many blunders and how many mistakes there were. No tool was used to estimate whether the moves looked random or whether there was actually some plan behind the game.
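As a rough idea of the kind of report that is missing, here is a minimal sketch that counts mistakes and blunders from engine evaluations. Everything in it is an assumption for illustration: the thresholds `MISTAKE_CP` and `BLUNDER_CP` are made up (real annotation tools use their own rules), and the evaluations are assumed to be in centipawns from the point of view of the side that moved.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical thresholds in centipawns, chosen only for this example.
MISTAKE_CP = 100
BLUNDER_CP = 300


def count_errors(evals: Sequence[Tuple[int, int]]) -> Dict[str, int]:
    """Classify moves by how much evaluation they gave away.

    Each item is (eval_before, eval_after) for one move, both measured
    from the mover's point of view, so a positive loss means the move
    made the mover's position worse.
    """
    counts = {"ok": 0, "mistake": 0, "blunder": 0}
    for before, after in evals:
        loss = before - after
        if loss >= BLUNDER_CP:
            counts["blunder"] += 1
        elif loss >= MISTAKE_CP:
            counts["mistake"] += 1
        else:
            counts["ok"] += 1
    return counts
```

With numbers like these per model, it would be possible to say how the games were lost and not only that they were lost.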

Yet people found that LLMs could play all the way through to the end game, with never-before-seen boards.

Yes, chess engines do the same and better, so? It’s not really hard to write a simple algorithm which filters all moves down to the legal ones, makes a move, and checks if the game is over. It’s a small surprise that there was no forced draw, but I guess even the weakest level can avoid it.

The results were not generated from nowhere. There always needs to be a source. While asking descriptive questions often helps, it may drastically decrease the number of possible results. In Google search, for example, if you do not force a specific term, the engine looks for similar ones and the results may not always be the best.

Also, LLMs prefer the mainstream narrative, for example a preference for renewable energy among possible energy sources despite its disadvantages. The most popular LLMs are made by huge companies, and they can support anything, including the worst things and ideologies, as long as it does not go against those companies. Good results were never the highest priority.

At first we may be surprised by gpt-3.5-turbo-instruct, but then we notice that gpt-4o gives a better output at the start, so it’s not “just better than others” - it’s just different. If it’s different (whatever that means), it’s not really worth comparing them. It’s like comparing 2 LLMs, each based on extremely different sources with an ideological background, and being surprised that they argue over whether the best ideology is Nazism or Stalinism.

And yeah … as always … that’s the powerful “AI” that would take our jobs and destroy humanity. I only play chess for fun and I’m still better than an LLM which possibly contains information about thousands of chess games. The only thing this article has definitely shown is that LLMs are far, far away from becoming an AI.
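The loop described here (filter to legal moves, make a move, check if the game is over) can be sketched in a few lines. To keep the example short and self-contained it uses a toy take-away game instead of chess, so the `nim_*` helpers are stand-ins; a real version would need full chess rules behind the same four callbacks. All names are hypothetical.

```python
import random
from typing import Callable, Iterable, List, Tuple, TypeVar

State = TypeVar("State")
Move = TypeVar("Move")


def play_random_game(
    state: State,
    candidates: Callable[[State], Iterable[Move]],
    is_legal: Callable[[State, Move], bool],
    apply_move: Callable[[State, Move], State],
    is_over: Callable[[State], bool],
    rng: random.Random,
    max_plies: int = 1000,
) -> Tuple[State, List[Move]]:
    """Play random legal moves until the game ends or max_plies is hit."""
    history: List[Move] = []
    for _ in range(max_plies):
        if is_over(state):
            break
        # Filter all candidate moves down to the legal ones.
        legal = [m for m in candidates(state) if is_legal(state, m)]
        if not legal:
            break
        move = rng.choice(legal)
        state = apply_move(state, move)
        history.append(move)
    return state, history


# Toy game standing in for chess: take 1 to 3 stones from a pile.
def nim_candidates(stones: int) -> List[int]:
    return [1, 2, 3]


def nim_is_legal(stones: int, take: int) -> bool:
    return 1 <= take <= stones


def nim_apply(stones: int, take: int) -> int:
    return stones - take


def nim_is_over(stones: int) -> bool:
    return stones == 0


final, moves = play_random_game(
    10, nim_candidates, nim_is_legal, nim_apply, nim_is_over, random.Random(1)
)
```

A player like this always reaches the end of the game with legal moves only, which is why getting to the end game says little by itself.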

Post #3

dani

I agree, but let’s see in a couple more years.

Post #4
