CommunityNews

Defeating Nondeterminism in LLM Inference

Reproducibility is a bedrock of scientific progress. However, it’s remarkably difficult to get reproducible results out of large language models.
For example, you might observe that asking ChatGPT the same question multiple times provides different results. This by itself is not surprising, since getting a result from a language model involves “sampling”, a process that converts the language model’s output into a probability distribution and probabilistically selects a token.
What might be more surprising is that even when we adjust the temperature down to 0 (thus making the sampling theoretically deterministic), LLM APIs are still not deterministic in practice (see past discussions here, here, or here). At temperature 0, the LLM always chooses the highest-probability token, which is called greedy sampling. Even when running inference on your own hardware with an OSS inference library like vLLM or SGLang, sampling still isn’t deterministic (see here or here).
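To make the sampling step concrete, here is a minimal sketch in plain Python. It is not how any particular inference library implements sampling; the function names (`softmax`, `sample_token`) and the toy logits are assumptions made up for illustration. It shows the two cases described above: with a positive temperature the logits are turned into a probability distribution and a token is drawn at random, while at temperature 0 the highest-scoring token is always picked (greedy sampling).

```python
import math
import random


def softmax(logits, temperature):
    """Turn raw logits into a probability distribution (hypothetical helper)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng):
    """Pick a token index from logits (illustrative, not a library API)."""
    if temperature == 0:
        # Greedy sampling: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax(logits, temperature)
    # Probabilistic selection: draw one index weighted by its probability.
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


toy_logits = [1.0, 3.0, 2.0]  # made-up scores for a three-token vocabulary
rng = random.Random(0)
greedy_picks = [sample_token(toy_logits, 0, rng) for _ in range(5)]
random_picks = [sample_token(toy_logits, 1.0, rng) for _ in range(5)]
print(greedy_picks)  # the same index every time
print(random_picks)  # may vary from draw to draw
```

Given identical logits, the greedy branch has no randomness at all, so any run-to-run variation at temperature 0 would have to come from the logits themselves differing between runs rather than from this selection step.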

Read in full here:
