CommunityNews

CommunityNews

Adaptive LLM Routing under Budget Constraints

Large Language Models (LLMs) have revolutionized natural language processing, but their varying capabilities and costs pose challenges in practical applications. LLM routing addresses this by dynamically selecting the most suitable LLM for each query/task. Previous approaches treat this as a supervised learning problem, assuming complete knowledge of optimal query-LLM pairings. However, real-world scenarios lack such comprehensive mappings and face evolving user queries. We thus propose to study LLM routing as a contextual bandit problem, enabling adaptive decision-making using bandit feedback without requiring exhaustive inference across all LLMs for all queries (in contrast to supervised routing). To address this problem, we develop a shared embedding space for queries and LLMs, where query and LLM embeddings are aligned to reflect their affinity. This space is initially learned from offline human preference data and refined through online bandit feedback. We instantiate this idea through Preference-prior Informed Linucb fOr adaptive rouTing (PILOT), a novel extension of LinUCB. To handle diverse user budgets for model routing, we introduce an online cost policy modeled as a multi-choice knapsack problem, ensuring resource-efficient routing.

Read in full here:

Where Next?

Popular Ai topics Top

First poster: bot
NVIDIA Doubles Down: Announces A100 80GB GPU, Supercharging World’s Most Powerful GPU for AI Supercomputing. SC20—NVIDIA today unveiled ...
New
AstonJ
Well done DeepMind… wonder what else they’re working on… One of biology’s biggest mysteries has been solved using artificial intelligen...
New
First poster: CommunityNews
Artificial intelligence and machine learning exist on the back of a lot of hard work from humans. Alongside the scientists, there are th...
#ai
New
First poster: jss
We are in the middle of an AI boom. Machine Learning experts command extraordinary salaries, investors are happy to open their hearts and...
New
CommunityNews
Artificial intelligence is now smart enough to write tracks that earn streaming service royalties.
New
New
First poster: CommunityNews
Getting a glimpse into Nvidia’s R&D has become a regular feature of the spring GTC conference with Bill Dally, chief scientist and se...
New
First poster: DevotionGeo
Voice synthesis PR stunt calls upon the dead to help sell an AI product.
New
First poster: bot
Technique could allow high-quality calls and music on low-quality connections.
New
First poster: AstonJ
From fear to optimism: why I am convinced AI is worth embracing.
New

Other popular topics Top

PragmaticBookshelf
Brace yourself for a fun challenge: build a photorealistic 3D renderer from scratch! In just a couple of weeks, build a ray tracer that r...
New
Exadra37
I am thinking in building or buy a desktop computer for programing, both professionally and on my free time, and my choice of OS is Linux...
New
New
Exadra37
I am asking for any distro that only has the bare-bones to be able to get a shell in the server and then just install the packages as we ...
New
Exadra37
Oh just spent so much time on this to discover now that RancherOS is in end of life but Rancher is refusing to mark the Github repo as su...
New
Help
I am trying to crate a game for the Nintendo switch, I wanted to use Java as I am comfortable with that programming language. Can you use...
New
PragmaticBookshelf
Author Spotlight: Peter Ullrich @PJUllrich Data is at the core of every business, but it is useless if nobody can access and analyze ...
New
CommunityNews
A Brief Review of the Minisforum V3 AMD Tablet. Update: I have created an awesome-minisforum-v3 GitHub repository to list information fo...
New
PragmaticBookshelf
Get the comprehensive, insider information you need for Rails 8 with the new edition of this award-winning classic. Sam Ruby @rubys ...
New
AstonJ
Curious what kind of results others are getting, I think actually prefer the 7B model to the 32B model, not only is it faster but the qua...
New