I built an AI Supercomputer... again (2TB RAM)

Added 3 days ago by admin

Hey… just try Twingate… you'll never look at VPN the same: https://ntck.co/twingate-networkchuck

I built another AI supercomputer with 4 Mac Studios... but this time it actually works. Earlier this year, I clustered 5 Mac Studios and it was 91% SLOWER. Everyone said clustering was stupid. But Apple just dropped a software update that changes everything - RDMA over Thunderbolt 5. Latency dropped from 300 microseconds to 3 microseconds. Now we're running trillion-parameter models locally at speeds that actually make sense.

Join the NetworkChuck Academy!: https://ntck.co/NCAcademy

RESOURCES / LINKS:

Docs/walkthrough: https://github.com/theNetworkChuck/mac-studio-cluster
Exo Labs: https://github.com/exo-explore/exo
MLX (Apple's ML Framework): https://github.com/ml-explore/mlx
My First Cluster Video (the failure): https://youtu.be/Ju0ndy2kwlw
RDMA Networking Explained: https://youtu.be/fb69FyW2KLk

TIMESTAMPS:

0:00 - The $50,000 AI Supercomputer
0:53 - What Apple Changed
3:05 - Connecting the Cluster
4:17 - Pipeline vs Tensor Parallelism
7:52 - RDMA: The 100x Latency Fix
10:02 - Twingate (Sponsor)
11:39 - Exo Labs is BACK
14:42 - Single Node vs Cluster Testing
17:58 - Qwen 3 Coder 480B Testing
19:03 - Kimi K2 (1 Trillion Parameters)
21:09 - Stacking Multiple Models
25:22 - Real Apps: Open WebUI + Xcode
27:57 - Final Thoughts
28:47 - How MLX Makes This Possible

**Sponsored by Twingate

THE SPECS:
• 4x Mac Studio M3 Ultra (512GB RAM each)
• 2TB unified memory / 320 GPU cores / 32TB storage
• $50,000 (vs $780,000+ for equivalent NVIDIA H100s)
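
The numbers above can be sanity-checked with a few lines of Python. This is only a rough sketch: every figure is copied from this description, the variable and function names are made up for illustration, and the per-node GPU core and storage values assume the totals are split evenly across the 4 machines (the description only gives totals).

```python
# Figures copied from the description; names are illustrative, not from any tool.
NODES = 4
RAM_PER_NODE_GB = 512
TOTAL_GPU_CORES = 320
TOTAL_STORAGE_TB = 32
CLUSTER_COST_USD = 50_000
H100_COST_USD = 780_000      # the "$780,000+" lower bound quoted above
LATENCY_BEFORE_US = 300      # before RDMA over Thunderbolt 5
LATENCY_AFTER_US = 3         # after RDMA over Thunderbolt 5


def cluster_summary():
    """Derive totals and ratios from the quoted specs."""
    return {
        "total_ram_gb": NODES * RAM_PER_NODE_GB,
        # Assumption: cores and storage are split evenly across nodes.
        "gpu_cores_per_node": TOTAL_GPU_CORES // NODES,
        "storage_per_node_tb": TOTAL_STORAGE_TB // NODES,
        "cost_ratio": H100_COST_USD / CLUSTER_COST_USD,
        "latency_improvement": LATENCY_BEFORE_US / LATENCY_AFTER_US,
    }


if __name__ == "__main__":
    for name, value in cluster_summary().items():
        print(f"{name}: {value}")
```

So 4 x 512GB is 2,048GB (the "2TB" in the title), the quoted H100 price is at least 15.6 times the cluster cost, and 300 microseconds down to 3 is the "100x" latency fix from the timestamps.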

THE RESULTS:
• Llama 3.3 70B: 16 tok/s (3x faster than before)
• Kimi K2 (1T params): 28 tok/s
• DeepSeek V3.1 671B: 27 tok/s
• Qwen 3 Coder 480B: 40 tok/s

SUPPORT NETWORKCHUCK
---------------------------------------------------
Sign up for NetworkChuck Academy: https://ntck.co/NCAcademy

☕☕ COFFEE and MERCH: https://ntck.co/coffee

Use the MOST SECURE Web Browser, NetworkChuck Cloud Browser: https://browser.networkchuck.com/

Use n8n, my favorite automation tool: https://ntck.co/n8n

NEED HELP?? Join the Discord Server: https://discord.gg/networkchuck

STUDY WITH ME on Twitch: https://bit.ly/nc_twitch

READY TO LEARN??
---------------------------------------------------
-Sign up for NetworkChuck Academy: https://ntck.co/NCAcademy
-Get your CCNA: https://bit.ly/nc-ccna

FOLLOW ME EVERYWHERE
---------------------------------------------------
Instagram: https://www.instagram.com/networkchuck/
Twitter: https://twitter.com/networkchuck
Facebook: https://www.facebook.com/NetworkChuck/
Join the Discord server: http://bit.ly/nc-discord

Do you want to know how I draw on the screen?? Go to https://ntck.co/EpicPen and use code NetworkChuck to get 20% off!!

clustering works now. thank Apple and Exo Labs.

# # #

TAGS:
mac studio cluster, ai supercomputer, local ai, rdma, exo labs, apple silicon, m3 ultra, unified memory, tensor parallelism, llm, kimi k2, deepseek, llama, mlx, thunderbolt 5, home lab ai, self hosted ai, 2tb ram, gpu cluster, apple ai

Category: Artificial Intelligence
