CommunityNews

ChunkLLM: A Lightweight Pluggable Framework for Accelerating LLMs Inference

Transformer-based large models excel in natural language processing and computer vision, but face severe computational inefficiencies due to the self-attention’s quadratic complexity with input tokens. Recently, researchers have proposed a series of methods based on block selection and compression to alleviate this problem, but they either have issues with semantic incompleteness or poor training-inference efficiency. To comprehensively address these challenges, we propose ChunkLLM, a lightweight and pluggable training framework. Specifically, we introduce two components: QK Adapter (Q-Adapter and K-Adapter) and Chunk Adapter. The former is attached to each Transformer layer, serving dual purposes of feature compression and chunk attention acquisition. The latter operates at the bottommost layer of the model, functioning to detect chunk boundaries by leveraging contextual semantic information. During the training phase, the parameters of the backbone remain frozen, with only the QK Adapter and Chunk Adapter undergoing training. Notably, we design an attention distillation method for training the QK Adapter, which enhances the recall rate of key chunks. During the inference phase, chunk selection is triggered exclusively when the current token is detected as a chunk boundary, thereby accelerating model inference. Experimental evaluations are conducted on a diverse set of long-text and short-text benchmark datasets spanning multiple tasks. ChunkLLM not only attains comparable performance on short-text benchmarks but also maintains 98.64% of the performance on long-context benchmarks while preserving a 48.58% key-value cache retention rate. Particularly, ChunkLLM attains a maximum speedup of 4.48x in comparison to the vanilla Transformer in the processing of 120K long texts.
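To make the inference-time idea concrete, here is a minimal toy sketch of boundary-triggered chunk selection: the set of selected chunks is recomputed only on steps flagged as chunk boundaries and is reused on every other step. This is not the authors' implementation. `is_boundary` is a hypothetical stand-in for the Chunk Adapter's per-token decision, `score_chunks` is a hypothetical stand-in for the chunk attention scores the QK Adapter would provide, and the top-k rule is an assumption made for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def select_top_k_chunks(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring chunks, in ascending order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def decode_with_boundary_triggered_selection(
    is_boundary: Sequence[bool],
    score_chunks: Callable[[int], Sequence[float]],
    k: int,
) -> Tuple[List[List[int]], int]:
    """Simulate decoding where chunk selection runs only at chunk boundaries.

    is_boundary[t] is True when token t closes a chunk (hypothetical detector output).
    score_chunks(t) returns one score per chunk closed so far (hypothetical scores).
    Returns the selected chunk indices at every step and the number of selections run.
    """
    selected: List[int] = []
    num_selections = 0
    history: List[List[int]] = []
    for step, boundary in enumerate(is_boundary):
        if boundary:
            # Selection is the costly part, so it runs only at a boundary.
            selected = select_top_k_chunks(score_chunks(step), k)
            num_selections += 1
        # Non-boundary steps reuse the previous selection unchanged.
        history.append(list(selected))
    return history, num_selections


def toy_scores(step: int) -> List[float]:
    """Hypothetical scores: one chunk exists after step 2, two after step 4."""
    table = {2: [0.9], 4: [0.2, 0.8]}
    return table[step]


boundaries = [False, False, True, False, True, False]
history, num_selections = decode_with_boundary_triggered_selection(
    boundaries, toy_scores, k=1
)
```

In this toy run, selection executes twice over six decoding steps. Skipping selection on non-boundary tokens is the source of the speedup the abstract describes, though the real system selects over key-value cache chunks rather than a list of floats.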
