CommunityNews

Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs

In this technical report, we tackle the challenges of training large-scale Mixture-of-Experts (MoE) models, focusing on the cost inefficiency and resource limitations prevalent in such systems. To address these issues, we present two MoE large language models (LLMs) of different sizes, Ling-Lite and Ling-Plus (referred to as “Bailing” in Chinese, spelled Bǎilíng in Pinyin). Ling-Lite contains 16.8 billion parameters with 2.75 billion activated parameters, while Ling-Plus has 290 billion parameters with 28.8 billion activated parameters. Both models perform comparably to leading industry benchmarks. This report offers actionable insights to improve the efficiency and accessibility of AI development in resource-constrained settings, promoting more scalable and sustainable technologies. Specifically, to reduce training costs for large-scale MoE models, we propose methods for (1) optimizing the model architecture and training processes, (2) refining the handling of training anomalies, and (3) improving the efficiency of model evaluation. Additionally, by leveraging high-quality data generated from knowledge graphs, our models demonstrate superior tool-use capabilities compared to other models. Ultimately, our experimental findings demonstrate that a 300B MoE LLM can be effectively trained on lower-performance devices while achieving performance comparable to models of a similar scale, both dense and MoE. Compared to high-performance devices, using a lower-specification hardware system during the pre-training phase yields significant savings, reducing computing costs by approximately 20%. The models can be accessed at inclusionAI.
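To put the parameter counts above in perspective, here is a minimal sketch that computes the share of parameters activated per token for each model. The figures come straight from the abstract; the function name `activation_ratio` and the dictionary layout are illustrative assumptions, not anything defined in the report.

```python
# Parameter counts (in billions) as quoted in the abstract.
# The names below are hypothetical and used only for illustration.
MODELS = {
    "Ling-Lite": {"total_b": 16.8, "activated_b": 2.75},
    "Ling-Plus": {"total_b": 290.0, "activated_b": 28.8},
}


def activation_ratio(total_b: float, activated_b: float) -> float:
    """Return the fraction of a model's parameters that are activated."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return activated_b / total_b


if __name__ == "__main__":
    for name, counts in MODELS.items():
        ratio = activation_ratio(counts["total_b"], counts["activated_b"])
        print(f"{name}: {ratio:.1%} of parameters activated")
```

By these figures, Ling-Lite activates roughly a sixth of its parameters and Ling-Plus roughly a tenth. This is only an approximate indication of why an MoE model's compute cost tracks its activated size rather than its total size.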

#flop #llm

2025-03-28 21:38:53 UTC
