CommunityNews

CommunityNews

Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs

In this technical report, we tackle the challenges of training large-scale Mixture of Experts (MoE) models, focusing on overcoming cost inefficiency and resource limitations prevalent in such systems. To address these issues, we present two differently sized MoE large language models (LLMs), namely Ling-Lite and Ling-Plus (referred to as “Bailing” in Chinese, spelled Bǎilíng in Pinyin). Ling-Lite contains 16.8 billion parameters with 2.75 billion activated parameters, while Ling-Plus boasts 290 billion parameters with 28.8 billion activated parameters. Both models exhibit comparable performance to leading industry benchmarks. This report offers actionable insights to improve the efficiency and accessibility of AI development in resource-constrained settings, promoting more scalable and sustainable technologies. Specifically, to reduce training costs for large-scale MoE models, we propose innovative methods for (1) optimization of model architecture and training processes, (2) refinement of training anomaly handling, and (3) enhancement of model evaluation efficiency. Additionally, leveraging high-quality data generated from knowledge graphs, our models demonstrate superior capabilities in tool use compared to other models. Ultimately, our experimental findings demonstrate that a 300B MoE LLM can be effectively trained on lower-performance devices while achieving comparable performance to models of a similar scale, including dense and MoE models. Compared to high-performance devices, utilizing a lower-specification hardware system during the pre-training phase demonstrates significant cost savings, reducing computing costs by approximately 20%. The models can be accessed at inclusionAI (inclusionAI).

Read in full here:

Where Next?

Popular General Dev topics Top

First poster: dimitarvp
A career ending mistake — Bitfield Consulting. As software engineers, we’re constantly making detailed, elaborate plans for computers to...
New
OvermindDL1
Yet another rust-made text editor, though I’m really liking the looks of how this one works!
New
CommunityNews
…or, “why make programming even harder?” Learning functional programming is an opportunity to discover a new way to represent programs, t...
New
First poster: bot
sqlglot/python_sql_engine.md at main · tobymao/sqlglot. Python SQL Parser and Transpiler. Contribute to tobymao/sqlglot development by c...
New
First poster: bot
Hector Martin (@marcan@treehouse.systems). Attached: 1 image For those wondering why the hell we need all this safety system stuff for...
New
First poster: bot
openai-python/chatml.md at main · openai/openai-python. The OpenAI Python library provides convenient access to the OpenAI API from appl...
New
First poster: KnowledgeIsPower
Building a Slack/Discord alternative with Tauri/Rust linen <span class="hashtag-icon-placeholder"></span>blog. Introduction My name is K...
New
CommunityNews
9 fintech engineering mistakes. Read this list unless you want to build a money dissappearing system
New
First poster: alvinkatojr
About accelerationism, NRx, and the intersection of technology, religion, and philosophy: an analysis of the essential ideas in the new A...
New
CommunityNews
GitSyncPad is an innovative micro keypad designed for effortless Git version control. Execute commands like git add, git commit, and git ...
New

Other popular topics Top

AstonJ
If it’s a mechanical keyboard, which switches do you have? Would you recommend it? Why? What will your next keyboard be? Pics always w...
New
wolf4earth
@AstonJ prompted me to open this topic after I mentioned in the lockdown thread how I started to do a lot more for my fitness. https://f...
New
AstonJ
You might be thinking we should just ask who’s not using VSCode :joy: however there are some new additions in the space that might give V...
New
AstonJ
I have seen the keycaps I want - they are due for a group-buy this week but won’t be delivered until October next year!!! :rofl: The Ser...
New
PragmaticBookshelf
Rails 7 completely redefines what it means to produce fantastic user experiences and provides a way to achieve all the benefits of single...
New
AstonJ
If you get Can't find emacs in your PATH when trying to install Doom Emacs on your Mac you… just… need to install Emacs first! :lol: bre...
New
PragmaticBookshelf
Author Spotlight Jamis Buck @jamis This month, we have the pleasure of spotlighting author Jamis Buck, who has written Mazes for Prog...
New
DevotionGeo
I have always used antique keyboards like Cherry MX 1800 or Cherry MX 8100 and almost always have modified the switches in some way, like...
New
hilfordjames
There appears to have been an update that has changed the terminology for what has previously been known as the Taskbar Overflow - this h...
New
AnfaengerAlex
Hello, I’m a beginner in Android development and I’m facing an issue with my project setup. In my build.gradle.kts file, I have the foll...
New