CommunityNews

CommunityNews

Benchmarking Neural Network Training Algorithms

Benchmarking Neural Network Training Algorithms.
Training algorithms, broadly construed, are an essential part of every deep
learning pipeline. Training algorithm improvements that speed up training
across a wide variety of workloads (e.g., better update rules, tuning
protocols, learning rate schedules, or data selection schemes) could save time,
save computational resources, and lead to better, more accurate, models.
Unfortunately, as a community, we are currently unable to reliably identify
training algorithm improvements, or even determine the state-of-the-art
training algorithm. In this work, using concrete experiments, we argue that
real progress in speeding up training requires new benchmarks that resolve
three basic challenges faced by empirical comparisons of training algorithms:
(1) how to decide when training is complete and precisely measure training
time, (2) how to handle the sensitivity of measurements to exact workload
details, and (3) how to fairly compare algorithms that require hyperparameter
tuning. In order to address these challenges, we introduce a new, competitive,
time-to-result benchmark using multiple workloads running on fixed hardware,
the AlgoPerf: Training Algorithms benchmark. Our benchmark includes a set of
workload variants that make it possible to detect benchmark submissions that
are more robust to workload changes than current widely-used methods. Finally,
we evaluate baseline submissions constructed using various optimizers that
represent current practice, as well as other optimizers that have recently
received attention in the literature. These baseline results collectively
demonstrate the feasibility of our benchmark, show that non-trivial gaps
between methods exist, and set a provisional state-of-the-art for future
benchmark submissions to try and surpass.

Read in full here:

This thread was posted by one of our members via one of our news source trackers.

Where Next?

Popular General Dev topics Top

First poster: dimitarvp
On Wednesday last week, Google’s Fiona Cicconi wrote to company employees. She announced that Google was bringing forward its timetable ...
New
First poster: bot
SPWN is a programming language that compiles to Geometry Dash levels. What that means is that you can create levels by using not only the...
New
First poster: bot
The overengineered Solution to my Pigeon Problem. TL;DR: I built a wifi-equipped water gun to shoot the pigeons on my balcony, controlle...
New
CommunityNews
GitHub - livekit/livekit: Scalable, high-performance WebRTC SFU. SDKs in JavaScript, React, React Native, Flutter, Swift, Kotlin, Unity/C...
New
CommunityNews
Docker on MacOS is slow and how to fix it. Thanks to the DALL·E 2, we finally have a very nice graphic representation of the feelings of...
New
First poster: bot
When Zig is safer and faster than Rust. There are endless debates online about Rust vs. Zig, this post explores a side of the argument I...
New
First poster: jkdiaz
Dark mode isn’t as good for your eyes as you believe. The shadowy display mode has leagues of fans claiming it helps reduce eye strain, ...
New
First poster: AstonJ
On the benefits of learning in public. Learning in public helps me grow as an engineer and seems to benefit others too. Here’s why I sho...
New
First poster: alvinkatojr
Over the last decade, we’ve seen great advancements in distributed systems, but the way we program them has seen few fundamental improvem...
New
First poster: braycarla
In beginning the NVIDIA Blackwell Linux testing with the GeForce RTX 5090 compute performance, besides all the CUDA/OpenCL/OptiX benchmar...
New

Other popular topics Top

Devtalk
Hello Devtalk World! Please let us know a little about who you are and where you’re from :nerd_face:
New
PragmaticBookshelf
Learn from the award-winning programming series that inspired the Elixir language, and go on a step-by-step journey through the most impo...
New
wolf4earth
@AstonJ prompted me to open this topic after I mentioned in the lockdown thread how I started to do a lot more for my fitness. https://f...
New
DevotionGeo
I know that these benchmarks might not be the exact picture of real-world scenario, but still I expect a Rust web framework performing a ...
New
Exadra37
Please tell us what is your preferred monitor setup for programming(not gaming) and why you have chosen it. Does your monitor have eye p...
New
New
Exadra37
I am asking for any distro that only has the bare-bones to be able to get a shell in the server and then just install the packages as we ...
New
PragmaticBookshelf
Programming Ruby is the most complete book on Ruby, covering both the language itself and the standard library as well as commonly used t...
New
PragmaticBookshelf
Develop, deploy, and debug BEAM applications using BEAMOps: a new paradigm that focuses on scalability, fault tolerance, and owning each ...
New
PragmaticBookshelf
Fight complexity and reclaim the original spirit of agility by learning to simplify how you develop software. The result: a more humane a...
New