Apr 23 · 8 min read · I Run a 40GB AI Model on a MacBook. Three Months of MLX on M1 Max Has Changed How I Think About Apple Silicon. It's Just a Laptop. But It's Running a 40GB Model Right Now. I'm drafting this on a MacBook Pro. Qwen 3.6 35B-A3B MoE Q8 — about 40GB of we...
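The preview cuts off before any code, but a minimal sketch of what running a large quantized model through MLX looks like, using the mlx-lm Python package, is below; the checkpoint name is a placeholder, not the exact model from the article.

from mlx_lm import load, generate

# Placeholder repo name: substitute the MLX-format quantized checkpoint you actually use.
model, tokenizer = load("mlx-community/placeholder-moe-q8")

# Weights sit in unified memory, so a ~40GB model fits as long as the Mac has enough RAM.
text = generate(
    model,
    tokenizer,
    prompt="Explain mixture-of-experts routing in two sentences.",
    max_tokens=200,
)
print(text)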
Apr 23 · 5 min read · ▶ Watch the race on YouTube: https://www.youtube.com/watch?v=2KeTDDodE0A April 22, 2026. Anthropic's Claude Code Max plan jumped to $100 a month. I ran a live three-way AI race on the exact same prompt — Gemma 31B local, Llama 70B local, and Claude...
Apr 6 · 5 min read · My M4 Max was decoding Qwen3.5 at 58 tokens per second yesterday. Today it's doing 112. Same model, same hardware, same prompt. The only thing that changed was a single environment variable. Ollama 0.19 shipped on March 31, 2026 with a preview of its...
Mar 18 · 3 min read · Apple Silicon changed what is possible for local AI. The unified memory architecture means ML models can run on the GPU without copying data between CPU and GPU memory. For a desktop agent that needs to process screen content in real time, this matte...
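A minimal sketch of that point, assuming the MLX Python package (mlx.core): the same arrays are visible to CPU and GPU kernels, and only the compute target changes, so there is no explicit device-to-device copy.

import mlx.core as mx

# Arrays are allocated in unified memory; there is no .to(device) step.
a = mx.random.normal((4096, 4096))
b = mx.random.normal((4096, 4096))

# Run one op on the GPU and another on the CPU against the same buffers.
on_gpu = mx.matmul(a, b, stream=mx.gpu)
on_cpu = mx.add(a, b, stream=mx.cpu)
mx.eval(on_gpu, on_cpu)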
Mar 14 · 8 min read · It started with a tweet. Google Devs posted a demo of FunctionGemma running a game, and I watched this tiny model parse natural language into structured function calls in real time. My immediate thoug...
Oct 30, 2025 · 4 min read · ML Pipeline Tutorial: Fine-tune Models Link to the GitHub project you'll need to follow this tutorial: project Learn to build a complete ML pipeline for fine-tuning models using F1 racing data. This tutorial covers data processing, workflow management, and m...