CommunityNews

CommunityNews

Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models

Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models.
The capabilities and limitations of Large Language Models have been sketched out in great detail in recent years, providing an intriguing yet conflicting picture. On the one hand, LLMs demonstrate a general ability to solve problems. On the other hand, they show surprising reasoning gaps when compared to humans, casting doubt on the robustness of their generalisation strategies. The sheer volume of data used in the design of LLMs has precluded us from applying the method traditionally used to measure generalisation: train-test set separation. To overcome this, we study what kind of generalisation strategies LLMs employ when performing reasoning tasks by investigating the pretraining data they rely on. For two models of different sizes (7B and 35B) and 2.5B of their pretraining tokens, we identify what documents influence the model outputs for three simple mathematical reasoning tasks and contrast this to the data that are influential for answering factual questions. We find that, while the models rely on mostly distinct sets of data for each factual question, a document often has a similar influence across different reasoning questions within the same task, indicating the presence of procedural knowledge. We further find that the answers to factual questions often show up in the most influential data. However, for reasoning questions the answers usually do not show up as highly influential, nor do the answers to the intermediate reasoning steps. When we characterise the top ranked documents for the reasoning questions qualitatively, we confirm that the influential documents often contain procedural knowledge, like demonstrating how to obtain a solution using formulae or code. Our findings indicate that the approach to reasoning the models use is unlike retrieval, and more like a generalisable strategy that synthesises procedural knowledge from documents doing a similar form of reasoning.

Read in full here:

This thread was posted by one of our members via one of our news source trackers.

Where Next?

Popular General Dev topics Top

First poster: bot
A field guide to help you recognize achievement, spot A field guide to help you recognize achievement, spot bottlenecks, and debug your d...
New
First poster: mafinar
F# Is The Best Coding Language Today. If you want to personally pick up a programming language in order to become a better coder in what...
New
First poster: wolf4earth
It’s official. Your private communications can (and will) be spied on - European Digital Rights (EDRi). On 6 July, the European Parliame...
New
First poster: malloryerik
GitHub - hlissner/doom-emacs: An Emacs framework for the stubborn martian hacker. An Emacs framework for the stubborn martian hacker - G...
New
First poster: bot
Developing Godot Projects with Neovim. When I started using Godot Engine, what surprised me the most is the built-in Language Server Pro...
New
First poster: gulshan212
Why Python keeps growing, explained | The GitHub Blog. A deep dive into why more people are using Python than ever, its key use cases, a...
New
First poster: bot
zig/http.zig at 7cf2cbb33ef34c1d211135f56d30fe23b6cacd42 · ziglang/zig. General-purpose programming language and toolchain for maintaini...
New
First poster: fullstackplus
Why Python is terrible… Nice language, but unsuitable for most professional purposes
New
First poster: AstonJ
Jan | Rethink the Computer. Jan turns your computer into an AI machine by running LLMs locally on your computer. It’s a privacy-focus, l...
New
CommunityNews
Once you get good at Rust all of these problems will go away Rust being great at big refactorings solves a largely self-inflicted issues ...
New

Other popular topics Top

ohm
Which, if any, games do you play? On what platform? I just bought (and completed) Minecraft Dungeons for my Nintendo Switch. Other than ...
New
DevotionGeo
I know that these benchmarks might not be the exact picture of real-world scenario, but still I expect a Rust web framework performing a ...
New
axelson
I’ve been really enjoying obsidian.md: It is very snappy (even though it is based on Electron). I love that it is all local by defaul...
New
PragmaticBookshelf
Design and develop sophisticated 2D games that are as much fun to make as they are to play. From particle effects and pathfinding to soci...
New
brentjanderson
Bought the Moonlander mechanical keyboard. Cherry Brown MX switches. Arms and wrists have been hurting enough that it’s time I did someth...
New
mafinar
Crystal recently reached version 1. I had been following it for awhile but never got to really learn it. Most languages I picked up out o...
New
foxtrottwist
A few weeks ago I started using Warp a terminal written in rust. Though in it’s current state of development there are a few caveats (tab...
New
PragmaticBookshelf
Author Spotlight Jamis Buck @jamis This month, we have the pleasure of spotlighting author Jamis Buck, who has written Mazes for Prog...
New
New
PragmaticBookshelf
Author Spotlight: VM Brasseur @vmbrasseur We have a treat for you today! We turn the spotlight onto Open Source as we sit down with V...
New