CommunityNews

CommunityNews

The Matrix: A Bayesian learning model for LLMs

The Matrix: A Bayesian learning model for LLMs.
In this paper, we introduce a Bayesian learning model to understand the behavior of Large Language Models (LLMs). We explore the optimization metric of LLMs, which is based on predicting the next token, and develop a novel model grounded in this principle. Our approach involves constructing an ideal generative text model represented by a multinomial transition probability matrix with a prior, and we examine how LLMs approximate this matrix. We discuss the continuity of the mapping between embeddings and multinomial distributions, and present the Dirichlet approximation theorem to approximate any prior. Additionally, we demonstrate how text generation by LLMs aligns with Bayesian learning principles and delve into the implications for in-context learning, specifically explaining why in-context learning emerges in larger models where prompts are considered as samples to be updated. Our findings indicate that the behavior of LLMs is consistent with Bayesian Learning, offering new insights into their functioning and potential applications.

Read in full here:

This thread was posted by one of our members via one of our news source trackers.

Where Next?

Popular General Dev topics Top

AstonJ
Apple co-founder Steve Wozniak is suing YouTube for allegedly allowing scammers to use images and videos of him to defraud people. The s...
New
First poster: dwaynebradley
Maybe it’s just my experience, but Object-Oriented Programming seems like a default, most common paradigm of software engineering. The on...
New
First poster: bot
The overengineered Solution to my Pigeon Problem. TL;DR: I built a wifi-equipped water gun to shoot the pigeons on my balcony, controlle...
New
First poster: bot
Rewrite it in Rust by ridiculousfish · Pull Request #9512 · fish-shell/fish-shell. (Sorry for the meme; also this is obligatory.) I thi...
New
First poster: dani
The pool of talented C++ developers is running dry. Highly sought after, rarely provided.
New
First poster: gulshan212
Why Python keeps growing, explained | The GitHub Blog. A deep dive into why more people are using Python than ever, its key use cases, a...
New
CommunityNews
Apple Patents Suggest Future AirPods Could Monitor Biosignals & Brain Activity - AppleMagazine. The US Patent & Trademark Office...
New
First poster: dyowee
Software engineering job openings hit five-year low?. There are 35% fewer software developer job listings on Indeed today, than five yea...
New
First poster: AstonJ
On the benefits of learning in public. Learning in public helps me grow as an engineer and seems to benefit others too. Here’s why I sho...
New
First poster: alvinkatojr
There are countless articles why developers should not focus on Frameworks too much and instead learn to understand the underlying langua...
New

Other popular topics Top

Devtalk
Hello Devtalk World! Please let us know a little about who you are and where you’re from :nerd_face:
New
malloryerik
Any thoughts on Svelte? Svelte is a radical new approach to building user interfaces. Whereas traditional frameworks like React and Vue...
New
PragmaticBookshelf
Design and develop sophisticated 2D games that are as much fun to make as they are to play. From particle effects and pathfinding to soci...
New
AstonJ
Just done a fresh install of macOS Big Sur and on installing Erlang I am getting: asdf install erlang 23.1.2 Configure failed. checking ...
New
AstonJ
In case anyone else is wondering why Ruby 3 doesn’t show when you do asdf list-all ruby :man_facepalming: do this first: asdf plugin-upd...
New
Margaret
Hello everyone! This thread is to tell you about what authors from The Pragmatic Bookshelf are writing on Medium.
1143 25883 760
New
foxtrottwist
A few weeks ago I started using Warp a terminal written in rust. Though in it’s current state of development there are a few caveats (tab...
New
AstonJ
We’ve talked about his book briefly here but it is quickly becoming obsolete - so he’s decided to create a series of 7 podcasts, the firs...
New
PragmaticBookshelf
Author Spotlight: Peter Ullrich @PJUllrich Data is at the core of every business, but it is useless if nobody can access and analyze ...
New
AstonJ
This is a very quick guide, you just need to: Download LM Studio: https://lmstudio.ai/ Click on search Type DeepSeek, then select the o...
New