AstonJ

DeepSeek (671B) running on a cluster of 8 M4 Pro Mac Minis with 64GB RAM each

This is cool!

DEEPSEEK-V3 ON M4 MAC: BLAZING FAST INFERENCE ON APPLE SILICON

We just witnessed something incredible: the largest open-source language model flexing its muscles on Apple Silicon. We’re talking about the massive DeepSeek-V3 on M4 Mac, specifically the 671 billion parameter model running on a cluster of 8 M4 Pro Mac Minis with 64GB of RAM each – that’s a whopping 512GB of combined memory!
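A quick back-of-the-envelope check on why the 512GB figure matters. This is only a sketch: it assumes the 4-bit quantization reported in the benchmark results in this thread and counts the weights alone, so KV cache, activations, and OS overhead are ignored. The function name is made up for illustration.

```python
def weight_footprint_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate size of the model weights alone, in decimal gigabytes."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


# 8 Mac Minis with 64GB each, as described in the post
cluster_memory_gb = 8 * 64

# 671 billion parameters at 4 bits per parameter (assumed: weights only)
deepseek_v3_gb = weight_footprint_gb(671e9, 4)

print(f"Cluster memory: {cluster_memory_gb} GB")
print(f"DeepSeek V3 671B at 4-bit: ~{deepseek_v3_gb:.1f} GB of weights")
print(f"Fits in combined memory: {deepseek_v3_gb < cluster_memory_gb}")
```

The real headroom will be smaller than this suggests, since each machine also needs memory for the OS and for runtime buffers.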

This isn’t just about bragging rights. It opens up new possibilities for researchers, developers, and anyone interested in pushing the boundaries of AI. Let’s dive into the details and see why DeepSeek-V3 on M4 Mac is such a big deal.

AstonJ

We just got the biggest open-source model running on Apple Silicon.

Without further ado, here are the results of running DeepSeek V3 (671B) on an 8 x M4 Pro 64GB Mac Mini cluster (512GB total memory):

| Model | Time-To-First-Token (TTFT), seconds | Tokens-Per-Second (TPS) |
|---|---|---|
| DeepSeek V3 671B (4-bit) | 2.91 | 5.37 |
| Llama 3.1 405B (4-bit) | 29.71 | 0.88 |
| Llama 3.3 70B (4-bit) | 3.14 | 3.89 |
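To get a feel for what these numbers mean in practice, here is a rough sketch that turns TTFT and TPS into an estimated wait for a full response. It assumes TPS stays constant for the whole generation, which real runs will not exactly match. The 200-token response length and the function name are arbitrary choices for illustration.

```python
# Figures copied from the results above: (TTFT in seconds, tokens per second)
results = {
    "DeepSeek V3 671B (4-bit)": (2.91, 5.37),
    "Llama 3.1 405B (4-bit)": (29.71, 0.88),
    "Llama 3.3 70B (4-bit)": (3.14, 3.89),
}


def estimated_seconds(ttft: float, tps: float, n_tokens: int) -> float:
    """Time to first token plus generation time at a constant token rate."""
    return ttft + n_tokens / tps


# Hypothetical response length, chosen only for illustration
N_TOKENS = 200

for model, (ttft, tps) in results.items():
    total = estimated_seconds(ttft, tps, N_TOKENS)
    print(f"{model}: ~{total:.0f} s for {N_TOKENS} tokens")
```

For anything longer than a few tokens the TPS term dominates, so the throughput column is the one to watch.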

Wait, DeepSeek has 671B parameters and runs faster than Llama 70B?

Yes!

Let me explain…
