CommunityNews

CommunityNews

Defeating Nondeterminism in LLM Inference

Reproducibility is a bedrock of scientific progress. However, it’s remarkably difficult to get reproducible results out of large language models.
For example, you might observe that asking ChatGPT the same question multiple times provides different results. This by itself is not surprising, since getting a result from a language model involves “sampling”, a process that converts the language model’s output into a probability distribution and probabilistically selects a token.
What might be more surprising is that even when we adjust the temperature down to 0This means that the LLM always chooses the highest probability token, which is called greedy sampling. (thus making the sampling theoretically deterministic), LLM APIs are still not deterministic in practice (see past discussions here, here, or here). Even when running inference on your own hardware with an OSS inference library like vLLM or SGLang, sampling still isn’t deterministic (see here or here).

Read in full here:

Where Next?

Popular Ai topics Top

First poster: CommunityNews
Many recent big advances in tech have one key thing at the heart of then: artificial intelligence.
New
New
First poster: bot
Language technology powered by AI can perpetuate bias if we are not careful. We need to be sure that language AI is trained to be ethical...
New
First poster: CommunityNews
A new computer program fashioned after artificial intelligence systems like AlphaGo has solved several open problems in combinatorics and...
New
First poster: bot
Building games and apps entirely through natural language using OpenAI’s code-davinci model. TL;DR: OpenAI has a new code generating mod...
New
CommunityNews
AI supercomputer will use “tens of thousands” of Nvidia A100 and H100 GPUs.
New
AstonJ
This is cool! DEEPSEEK-V3 ON M4 MAC: BLAZING FAST INFERENCE ON APPLE SILICON We just witnessed something incredible: the largest open-s...
New
First poster: AstonJ
I presented an invited keynote at the AI Engineer World’s Fair in San Francisco this week. This is my third time speaking at the event—he...
New
CommunityNews
Reproducibility is a bedrock of scientific progress. However, it’s remarkably difficult to get reproducible results out of large language...
New
CommunityNews
Contracted AI raters describe grueling deadlines, poor pay and opacity around work to make chatbots intelligent
New

Other popular topics Top

New
AstonJ
SpaceVim seems to be gaining in features and popularity and I just wondered how it compares with SpaceMacs in 2020 - anyone have any thou...
New
AstonJ
There’s a whole world of custom keycaps out there that I didn’t know existed! Check out all of our Keycaps threads here: https://forum....
New
AstonJ
Just done a fresh install of macOS Big Sur and on installing Erlang I am getting: asdf install erlang 23.1.2 Configure failed. checking ...
New
AstonJ
I have seen the keycaps I want - they are due for a group-buy this week but won’t be delivered until October next year!!! :rofl: The Ser...
New
New
Help
I am trying to crate a game for the Nintendo switch, I wanted to use Java as I am comfortable with that programming language. Can you use...
New
hilfordjames
There appears to have been an update that has changed the terminology for what has previously been known as the Taskbar Overflow - this h...
New
AstonJ
This is a very quick guide, you just need to: Download LM Studio: https://lmstudio.ai/ Click on search Type DeepSeek, then select the o...
New
mindriot
Ok, well here are some thoughts and opinions on some of the ergonomic keyboards I have, I guess like mini review of each that I use enoug...
New