AstonJ

AstonJ

DeepSeek (671B) running on a cluster of 8 Mac Mini Pros with 64GB RAM each

This is cool!

DEEPSEEK-V3 ON M4 MAC: BLAZING FAST INFERENCE ON APPLE SILICON

We just witnessed something incredible: the largest open-source language model flexing its muscles on Apple Silicon. We’re talking about the massive DeepSeek-V3 on M4 Mac, specifically the 671 billion parameter model running on a cluster of 8 M4 Pro Mac Minis with 64GB of RAM each – that’s a whopping 512GB of combined memory!

This isn’t just about bragging rights. It opens up new possibilities for researchers, developers, and anyone interested in pushing the boundaries of AI. Let’s dive into the details and see why DeepSeek-V3 on M4 Mac is such a big deal.

TABLE OF CONTENTS

First Post!

AstonJ

AstonJ

We just got the biggest open-source model running on Apple Silicon.

Without further ado, here are the results running DeepSeek v3 (671B) on a 8 x M4 Pro 64GB Mac Mini Cluster (512GB total memory):

Model Time-To-First-Token (TTFT) in seconds Tokens-Per-Second (TPS)
DeepSeek V3 671B (4-bit) 2.91 5.37
Llama 3.1 405B (4-bit) 29.71 0.88
Llama 3.3 70B (4-bit) 3.14 3.89

Wait, Deepseek has 671B parameters and runs faster than Llama 70B?

Yes!

Let me explain…

Popular Ai topics Top

First poster: CommunityNews
Google has unveiled a tool that uses artificial intelligence to help spot skin, hair and nail conditions, based on images uploaded by pat...
New
First poster: bot
Use AI to turn simple brushstrokes into realistic landscape images. Create backgrounds quickly, or speed up your concept exploration so y...
New
First poster: bot
An ancient language has defied decryption for 100 years. Can AI crack the code?. Machine learning can translate between two known langua...
New
First poster: bot
It hopes the platform will let firms train drones and air taxis virtually without risking crashes.
New
First poster: bot
Adept’s ACT-1 has learned how to automate complex UI tasks in web apps using an AI model.
New
First poster: bot
“Whisper” open source model may become a building block in future speech-to-text apps.
New
First poster: bot
AlphaTensor discovers better algorithms for matrix math, inspiring another improvement from afar.
New
First poster: bot
Automatically add color to old photos, then refine the colors with a written caption.
New
First poster: bot
Rise of Generative AI will be comparable to the rise of CGI in the early 90s. My chat with Cristóbal Valenzuela, Cofounder & CEO of ...
New
First poster: bot
Refinement to AI language model generates rhyming compositions in various styles.
New

Other popular topics Top

AstonJ
What chair do you have while working… and why? Is there a ‘best’ type of chair or working position for developers?
New
wolf4earth
@AstonJ prompted me to open this topic after I mentioned in the lockdown thread how I started to do a lot more for my fitness. https://f...
New
AstonJ
Inspired by this post from @Carter, which languages, frameworks or other tech or tools do you think is killing it right now? :upside_down...
New
AstonJ
Continuing the discussion from Thinking about learning Crystal, let’s discuss - I was wondering which languages don’t GC - maybe we can c...
New
New
First poster: joeb
The File System Access API with Origin Private File System. WebKit supports new API that makes it possible for web apps to create, open,...
New
First poster: bot
The overengineered Solution to my Pigeon Problem. TL;DR: I built a wifi-equipped water gun to shoot the pigeons on my balcony, controlle...
New
PragmaticBookshelf
Author Spotlight Rebecca Skinner @RebeccaSkinner Welcome to our latest author spotlight, where we sit down with Rebecca Skinner, auth...
New
AstonJ
If you want a quick and easy way to block any website on your Mac using Little Snitch simply… File > New Rule: And select Deny, O...
New
First poster: bot
Large Language Models like ChatGPT say The Darnedest Things. The Errors They MakeWhy We Need to Document Them, and What We Have Decided ...
New