CommunityNews

CommunityNews

The Mathematics of Training LLMs

The Mathematics of Training LLMs — with Quentin Anthony of Eleuther AI.
Listen now | Breaking down the viral Transformers Math 101 article and high performance distributed training for Transformers-based architectures (or “How I Learned to Stop Handwaving and Make the GPU go brrrrrr”)

Read in full here:

This thread was posted by one of our members via one of our news source trackers.

Where Next?

Popular General Dev topics Top

New
First poster: dpritchett
It’s not what programming languages do, it’s what they shepherd you to. How many of you have listened, read or taken part in a discussio...
New
First poster: bot
MEMORANDUM FOR SENIOR PENTAGON LEADERSHIP COMMANDANT OF THE COAST GUARD COMMANDERS OF THE COMBATANT COMMANDS DEFENSE AGENCY AND DOD FIEL...
New
First poster: bot
GitHub - lucidrains/PaLM-rlhf-pytorch: Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architectur...
New
First poster: bot
openai-python/chatml.md at main · openai/openai-python. The OpenAI Python library provides convenient access to the OpenAI API from appl...
New
First poster: bot
zig/http.zig at 7cf2cbb33ef34c1d211135f56d30fe23b6cacd42 · ziglang/zig. General-purpose programming language and toolchain for maintaini...
New
CommunityNews
9 fintech engineering mistakes. Read this list unless you want to build a money dissappearing system
New
First poster: jkdiaz
Dark mode isn’t as good for your eyes as you believe. The shadowy display mode has leagues of fans claiming it helps reduce eye strain, ...
New
CommunityNews
The French originated the meter in the 1790s as one/ten-millionth of the distance from the equator to the north pole along a meridian thr...
New
First poster: andrea
Most of what modern software engineers do involves APIs: public interfaces for communicating with a program, like this one from Twilio. I...
New

Other popular topics Top

Devtalk
Reading something? Working on something? Planning something? Changing jobs even!? If you’re up for sharing, please let us know what you’...
1063 23050 405
New
PragmaticBookshelf
Machine learning can be intimidating, with its reliance on math and algorithms that most programmers don't encounter in their regular wor...
New
DevotionGeo
I know that these benchmarks might not be the exact picture of real-world scenario, but still I expect a Rust web framework performing a ...
New
AstonJ
I’ve been hearing quite a lot of comments relating to the sound of a keyboard, with one of the most desirable of these called ‘thock’, he...
New
AstonJ
Thanks to @foxtrottwist’s and @Tomas’s posts in this thread: Poll: Which code editor do you use? I bought Onivim! :nerd_face: https://on...
New
rustkas
Intensively researching Erlang books and additional resources on it, I have found that the topic of using Regular Expressions is either c...
New
Help
I am trying to crate a game for the Nintendo switch, I wanted to use Java as I am comfortable with that programming language. Can you use...
New
PragmaticBookshelf
Author Spotlight Rebecca Skinner @RebeccaSkinner Welcome to our latest author spotlight, where we sit down with Rebecca Skinner, auth...
New
New
AstonJ
Curious what kind of results others are getting, I think actually prefer the 7B model to the 32B model, not only is it faster but the qua...
New