CommunityNews

CommunityNews

The Matrix: A Bayesian learning model for LLMs

The Matrix: A Bayesian learning model for LLMs.
In this paper, we introduce a Bayesian learning model to understand the behavior of Large Language Models (LLMs). We explore the optimization metric of LLMs, which is based on predicting the next token, and develop a novel model grounded in this principle. Our approach involves constructing an ideal generative text model represented by a multinomial transition probability matrix with a prior, and we examine how LLMs approximate this matrix. We discuss the continuity of the mapping between embeddings and multinomial distributions, and present the Dirichlet approximation theorem to approximate any prior. Additionally, we demonstrate how text generation by LLMs aligns with Bayesian learning principles and delve into the implications for in-context learning, specifically explaining why in-context learning emerges in larger models where prompts are considered as samples to be updated. Our findings indicate that the behavior of LLMs is consistent with Bayesian Learning, offering new insights into their functioning and potential applications.

Read in full here:

This thread was posted by one of our members via one of our news source trackers.

Where Next?

Popular General Dev topics Top

First poster: bot
FUZIX FUZIX is a fusion of various elements from the assorted UZI forks and branches beaten together into some kind of semi-coherent pla...
New
First poster: bot
A field guide to help you recognize achievement, spot A field guide to help you recognize achievement, spot bottlenecks, and debug your d...
New
First poster: dwaynebradley
Maybe it’s just my experience, but Object-Oriented Programming seems like a default, most common paradigm of software engineering. The on...
New
New
CommunityNews
ABSTRACT In lieu of a traditional , I’ve tried to distill the essence of the talk into a collection of maxims: All programmers are API ...
New
New
First poster: gulshan212
Why Python keeps growing, explained | The GitHub Blog. A deep dive into why more people are using Python than ever, its key use cases, a...
New
New
First poster: dyowee
Software engineering job openings hit five-year low?. There are 35% fewer software developer job listings on Indeed today, than five yea...
New
First poster: andrea
Most of what modern software engineers do involves APIs: public interfaces for communicating with a program, like this one from Twilio. I...
New

Other popular topics Top

New
PragmaticBookshelf
Learn from the award-winning programming series that inspired the Elixir language, and go on a step-by-step journey through the most impo...
New
Exadra37
I am thinking in building or buy a desktop computer for programing, both professionally and on my free time, and my choice of OS is Linux...
New
AstonJ
We have a thread about the keyboards we have, but what about nice keyboards we come across that we want? If you have seen any that look n...
New
AstonJ
Do the test and post your score :nerd_face: :keyboard: If possible, please add info such as the keyboard you’re using, the layout (Qw...
New
PragmaticBookshelf
Use WebRTC to build web applications that stream media and data in real time directly from one user to another, all in the browser. ...
New
PragmaticBookshelf
Author Spotlight Mike Riley @mriley This month, we turn the spotlight on Mike Riley, author of Portable Python Projects. Mike’s book ...
New
First poster: bot
zig/http.zig at 7cf2cbb33ef34c1d211135f56d30fe23b6cacd42 · ziglang/zig. General-purpose programming language and toolchain for maintaini...
New
AstonJ
This is cool! DEEPSEEK-V3 ON M4 MAC: BLAZING FAST INFERENCE ON APPLE SILICON We just witnessed something incredible: the largest open-s...
New
NewsBot
Node.js v22.14.0 has been released. Link: Release 2025-02-11, Version 22.14.0 'Jod' (LTS), @aduh95 · nodejs/node · GitHub
New