CommunityNews

CommunityNews

The Matrix: A Bayesian learning model for LLMs

The Matrix: A Bayesian learning model for LLMs.
In this paper, we introduce a Bayesian learning model to understand the behavior of Large Language Models (LLMs). We explore the optimization metric of LLMs, which is based on predicting the next token, and develop a novel model grounded in this principle. Our approach involves constructing an ideal generative text model represented by a multinomial transition probability matrix with a prior, and we examine how LLMs approximate this matrix. We discuss the continuity of the mapping between embeddings and multinomial distributions, and present the Dirichlet approximation theorem to approximate any prior. Additionally, we demonstrate how text generation by LLMs aligns with Bayesian learning principles and delve into the implications for in-context learning, specifically explaining why in-context learning emerges in larger models where prompts are considered as samples to be updated. Our findings indicate that the behavior of LLMs is consistent with Bayesian Learning, offering new insights into their functioning and potential applications.

Read in full here:

This thread was posted by one of our members via one of our news source trackers.

Popular General Dev topics Top

First poster: dimitarvp
On Wednesday last week, Google’s Fiona Cicconi wrote to company employees. She announced that Google was bringing forward its timetable ...
New
First poster: AstonJ
In one sense, the Truth Mines were just another indexscape. Hundreds of thousands of specialized selections of the library’s contents wer...
New
New
First poster: Maartz
This Keyboard Lets People Type So Fast It’s Banned From Typing Competitions. A new peripheral lets you keep typing without ever lifting ...
New
First poster: OvermindDL1
You can now buy a 100W USB-C cable with a built-in power meter. They’re just $20 on Amazon, and they work!
New
First poster: bot
Developing Godot Projects with Neovim. When I started using Godot Engine, what surprised me the most is the built-in Language Server Pro...
New
First poster: bot
sqlglot/python_sql_engine.md at main · tobymao/sqlglot. Python SQL Parser and Transpiler. Contribute to tobymao/sqlglot development by c...
New
First poster: bot
Rewrite it in Rust by ridiculousfish · Pull Request #9512 · fish-shell/fish-shell. (Sorry for the meme; also this is obligatory.) I thi...
New
First poster: bot
[js/web] WebGPU backend via JSEP by fs-eire · Pull Request #14579 · microsoft/onnxruntime. Description This change introduced the follo...
New
CommunityNews
The Definitive PHP 7.2, 7.3, 7.4, 8.0, and 8.1 Benchmarks (2023). We tested the performance of 14 PHP platforms (WordPress, Drupal, Lara...
New

Other popular topics Top

AstonJ
Or looking forward to? :nerd_face:
New
Exadra37
I am thinking in building or buy a desktop computer for programing, both professionally and on my free time, and my choice of OS is Linux...
New
New
PragmaticBookshelf
Rust is an exciting new programming language combining the power of C with memory safety, fearless concurrency, and productivity boosters...
New
DevotionGeo
The V Programming Language Simple language for building maintainable programs V is already mentioned couple of times in the forum, but I...
New
AstonJ
Seems like a lot of people caught it - just wondered whether any of you did? As far as I know I didn’t, but it wouldn’t surprise me if I...
New
gagan7995
API 4 Path: /user/following/ Method: GET Description: Returns the list of all names of people whom the user follows Response [ { ...
New
PragmaticBookshelf
Author Spotlight Erin Dees @undees Welcome to our new author spotlight! We had the pleasure of chatting with Erin Dees, co-author of ...
New
First poster: bot
zig/http.zig at 7cf2cbb33ef34c1d211135f56d30fe23b6cacd42 · ziglang/zig. General-purpose programming language and toolchain for maintaini...
New
AstonJ
Curious what kind of results others are getting, I think actually prefer the 7B model to the 32B model, not only is it faster but the qua...
New