AstonJ

AstonJ

DeepSeek (671B) running on a cluster of 8 Mac Mini Pros with 64GB RAM each

This is cool!

DEEPSEEK-V3 ON M4 MAC: BLAZING FAST INFERENCE ON APPLE SILICON

We just witnessed something incredible: the largest open-source language model flexing its muscles on Apple Silicon. We’re talking about the massive DeepSeek-V3 on M4 Mac, specifically the 671 billion parameter model running on a cluster of 8 M4 Pro Mac Minis with 64GB of RAM each – that’s a whopping 512GB of combined memory!

This isn’t just about bragging rights. It opens up new possibilities for researchers, developers, and anyone interested in pushing the boundaries of AI. Let’s dive into the details and see why DeepSeek-V3 on M4 Mac is such a big deal.

TABLE OF CONTENTS

First Post!

AstonJ

AstonJ

We just got the biggest open-source model running on Apple Silicon.

Without further ado, here are the results running DeepSeek v3 (671B) on a 8 x M4 Pro 64GB Mac Mini Cluster (512GB total memory):

Model Time-To-First-Token (TTFT) in seconds Tokens-Per-Second (TPS)
DeepSeek V3 671B (4-bit) 2.91 5.37
Llama 3.1 405B (4-bit) 29.71 0.88
Llama 3.3 70B (4-bit) 3.14 3.89

Wait, Deepseek has 671B parameters and runs faster than Llama 70B?

Yes!

Let me explain…

Where Next?

Popular Ai topics Top

New
First poster: CommunityNews
SOME OF THE most dazzling recent advances in artificial intelligence have come thanks to resources only available at big tech companies, ...
New
First poster: CommunityNews
BROKEN PROMISES & EMPTY THREATS: THE EVOLUTION OF AI IN THE USA, 1956-1996 Artificial Intelligence (AI) is once again a promising tec...
New
CommunityNews
Artificial intelligence is now smart enough to write tracks that earn streaming service royalties.
New
First poster: OvermindDL1
Equity, the performing arts workers union, says actors need protection from computer-generated substitutes.
New
First poster: CommunityNews
Steve Blank Artificial Intelligence and Machine Learning– Explained. Artificial Intelligence is a once-in-a lifetime commercial and defe...
New
First poster: bot
AI Wrote and Performed a Jerry Seinfeld Routine!. I used GPT-3 to write a Jerry Seinfeld stand-up routine about cats - and then used Dee...
New
CommunityNews
We present Imagen, a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understandin...
New
First poster: bot
You can’t solve AI security problems with more AI. One of the most common proposed solutions to prompt injection attacks (where an AI la...
New
First poster: bot
Ghostwriter - Code faster with AI. An AI pair programmer that helps you write better code, faster.
New

Other popular topics Top

Devtalk
Hello Devtalk World! Please let us know a little about who you are and where you’re from :nerd_face:
New
New
AstonJ
You might be thinking we should just ask who’s not using VSCode :joy: however there are some new additions in the space that might give V...
New
PragmaticBookshelf
Create efficient, elegant software tests in pytest, Python's most powerful testing framework. Brian Okken @brianokken Edited by Kat...
New
PragmaticBookshelf
Author Spotlight Mike Riley @mriley This month, we turn the spotlight on Mike Riley, author of Portable Python Projects. Mike’s book ...
New
New
First poster: AstonJ
Jan | Rethink the Computer. Jan turns your computer into an AI machine by running LLMs locally on your computer. It’s a privacy-focus, l...
New
AstonJ
If you’re getting errors like this: psql: error: connection to server on socket “/tmp/.s.PGSQL.5432” failed: No such file or directory ...
New
PragmaticBookshelf
Get the comprehensive, insider information you need for Rails 8 with the new edition of this award-winning classic. Sam Ruby @rubys ...
New
AstonJ
This is cool! DEEPSEEK-V3 ON M4 MAC: BLAZING FAST INFERENCE ON APPLE SILICON We just witnessed something incredible: the largest open-s...
New