CommunityNews

Pushing the Limits of Large Language Model Quantization via the Linearity Theorem

Quantizing large language models has become a standard way to reduce their memory and computational costs. Typically, existing methods focus on breaking down the problem into individual layer-wise sub-problems, and minimizing per-layer error, measured via various metrics. Yet, this approach currently lacks theoretical justification and the metrics employed may be sub-optimal. In this paper, we present a “linearity theorem” establishing a direct relationship between the layer-wise $\ell_2$ reconstruction error and the model perplexity increase due to quantization. This insight enables two novel applications: (1) a simple data-free LLM quantization method using Hadamard rotations and MSE-optimal grids, dubbed HIGGS, which outperforms all prior data-free approaches such as the extremely popular NF4 quantized format, and (2) an optimal solution to the problem of finding non-uniform per-layer quantization levels which match a given compression constraint in the medium-bitwidth regime, obtained by reduction to dynamic programming. On the practical side, we demonstrate improved accuracy-compression trade-offs on Llama-3.1 and 3.2-family models, as well as on Qwen-family models. Further, we show that our method can be efficiently supported in terms of GPU kernels at various batch sizes, advancing both data-free and non-uniform quantization for LLMs.
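To make the second application more concrete, here is a minimal sketch of how a per-layer bitwidth assignment under a size budget could be found with dynamic programming. It is not the paper's implementation. It assumes that the perplexity increase can be approximated as a sum of independent per-layer penalties, which is how the abstract's link between layer-wise $\ell_2$ error and perplexity is read here. The function name `allocate_bitwidths`, the inputs `sizes`, `errors`, and `budget`, and all numbers in the usage example are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def allocate_bitwidths(
    sizes: Sequence[int],
    errors: Sequence[Dict[int, float]],
    budget: int,
) -> Tuple[List[int], float]:
    """Pick one bitwidth per layer to minimize the summed penalty.

    sizes[l]      -- size of layer l in arbitrary integer units (hypothetical)
    errors[l][b]  -- assumed penalty of layer l when quantized to b bits
    budget        -- maximum allowed sum of sizes[l] * bits[l]
    """
    # best[cost] = (lowest total penalty, chosen bitwidths) for the layers
    # processed so far, using exactly `cost` units of the budget.
    best: Dict[int, Tuple[float, List[int]]] = {0: (0.0, [])}
    for size, layer_err in zip(sizes, errors):
        nxt: Dict[int, Tuple[float, List[int]]] = {}
        for cost, (total, picks) in best.items():
            for bits, penalty in layer_err.items():
                new_cost = cost + size * bits
                if new_cost > budget:
                    continue  # this choice would exceed the budget
                cand = total + penalty
                if new_cost not in nxt or cand < nxt[new_cost][0]:
                    nxt[new_cost] = (cand, picks + [bits])
        best = nxt
    if not best:
        raise ValueError("budget too small for any assignment")
    total, picks = min(best.values(), key=lambda entry: entry[0])
    return picks, total


# Hypothetical three-layer model with an average budget of 3 bits per unit.
example_sizes = [1, 1, 2]
example_errors = [
    {2: 0.9, 3: 0.3, 4: 0.1},
    {2: 0.3, 3: 0.2, 4: 0.1},
    {2: 1.0, 3: 0.5, 4: 0.2},
]
picks, total = allocate_bitwidths(example_sizes, example_errors, budget=12)
print(picks, round(total, 3))
```

In this toy setup the search gives the first layer, which is the most sensitive to quantization, the most bits and takes them from the second layer, which tolerates low precision best. The additive objective is what makes the knapsack-style recursion valid; with interacting layers the same table would no longer guarantee an optimal assignment.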


2025-04-21 01:37:05 UTC

