CommunityNews

Scalable extraction of training data from (production) language models

Scalable Extraction of Training Data from (Production) Language Models.
This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset. We show an adversary can extract gigabytes of training data from open-source language models like Pythia or GPT-Neo, semi-open models like LLaMA or Falcon, and closed models like ChatGPT. Existing techniques from the literature suffice to attack unaligned models; in order to attack the aligned ChatGPT, we develop a new divergence attack that causes the model to diverge from its chatbot-style generations and emit training data at a rate 150x higher than when behaving properly. Our methods show practical attacks can recover far more data than previously thought, and reveal that current alignment techniques do not eliminate memorization.
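The abstract does not say how an emitted string is confirmed to be training data. As a rough illustration only, the sketch below flags verbatim overlap between a model output and a reference corpus, assuming a small in-memory corpus, whitespace tokenization, and a window of `k` tokens. The function names, the toy corpus, and the value of `k` are all hypothetical and are not taken from the paper.

```python
from typing import Iterable, List, Set, Tuple

Ngram = Tuple[str, ...]


def tokenize(text: str) -> List[str]:
    # Assumption: plain whitespace tokens stand in for a real tokenizer.
    return text.split()


def build_ngram_index(corpus: Iterable[str], k: int) -> Set[Ngram]:
    """Collect every k-token window that occurs in the reference corpus."""
    index: Set[Ngram] = set()
    for doc in corpus:
        tokens = tokenize(doc)
        for i in range(len(tokens) - k + 1):
            index.add(tuple(tokens[i:i + k]))
    return index


def memorized_spans(output: str, index: Set[Ngram], k: int) -> List[str]:
    """Return each k-token window of `output` found verbatim in the index."""
    tokens = tokenize(output)
    hits: List[str] = []
    for i in range(len(tokens) - k + 1):
        window = tuple(tokens[i:i + k])
        if window in index:
            hits.append(" ".join(window))
    return hits


# Toy data, invented for this example.
corpus = ["the quick brown fox jumps over the lazy dog"]
index = build_ngram_index(corpus, k=4)
hits = memorized_spans("poem poem poem the quick brown fox jumps away", index, k=4)
print(hits)
```

A set of k-grams is the simplest structure that works for a toy corpus. It would not scale to the gigabytes of data the abstract mentions, where an indexed search structure would be needed instead.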

This thread was posted by one of our members via one of our news source trackers.

#production #training

2023-12-03 03:11:47 UTC


