Tag feed

#dgxspark

24 posts · 0 followers

Robert Collins · selfenrichment.hashnode.dev · Jul 1 · 8 min read

dgx-spark-inference - Keeping good habits around local inference

Local inference has a habit of becoming folklore. You launch a model from a shell history fragment you can never quite find again. A context setting lives in a note somewhere. A faster quantization ge

Shaun Liew · shaunliew.hashnode.dev · May 18 · 15 min read

Qwen3 Speculative Decoding on the DGX Spark: Two Models, Four Methods, One Surprising Lesson

The Problem: One Token at a Time When you ask a large language model a question, it does not write the whole answer in a single step. It generates one small piece of text at a time. That piece is call

Shaun Liew · shaunliew.hashnode.dev · May 16 · 14 min read

Running Qwen3-VL on DGX Spark: Transformers vs vLLM vs SGLang

I've been running Qwen3-VL locally for a while now, mostly with the standard from_pretrained() setup. It works, but it's slow. So, I kept wondering whether switching to vLLM or SGLang would actually m

강문규 · devsnack.hashnode.dev · May 6 · 8 min read

Every Optimized Model for NVIDIA DGX Spark GB10 — Benchmarked & Ranked (April 2026)

Got your hands on an NVIDIA DGX Spark but have no idea which models to run on it? I scoured every GB10-optimized model on Hugging Face so you don't have to. Table of Contents What Makes DGX Spark Spe

강문규 · devsnack.hashnode.dev · May 6 · 10 min read

Qwen3.6 on DGX Spark: vLLM + NVFP4 + DFlash vs llama.cpp — 2x Faster at 88–104 tok/s

TL;DR — I was happily running Qwen3.6 on llama.cpp. Then I saw claims of 2× speed with vLLM + NVFP4 + DFlash. So I installed it, fought through crashes, and measured it myself. Verdict: it's real. 88–

강문규 · devsnack.hashnode.dev · May 6 · 8 min read

Gemma 4 MTP Drafter on DGX Spark: 2.89x Speedup for Dense 31B — No Quality Loss

An 870 MB drafter model turned Dense 31B from 6.5 → 18.8 tok/s. No model swap, no training, no quality degradation. If you have a DGX Spark, there's no reason not to use this. Key Results Model Fra

Saiyam Pathak · kubesimplify.hashnode.dev · Apr 7 · 10 min read

SSH Into Your DGX Spark From Anywhere in the World Using Tailscale

I recently got my hands on an NVIDIA DGX Spark, and the first thing I wanted to figure out was: how do I access this thing from anywhere? Whether I'm at a coffee shop, at a conference, or on a differe

Amegilla · sparktastic.hashnode.dev · Apr 5 · 13 min read

Training a LoRA with Unsloth on DGX Spark

Having done a long detour through manually reviewing and finessing my training data, this weekend I was finally ready to train the LoRAs for my poster generation project. I have two planned: One to g

Amegilla · sparktastic.hashnode.dev · Mar 19 · 5 min read

Qwen3-VL image analysis using vLLM on DGX Spark

I'm at the stage with my poster generation project where I am analysing poster images to create training data. It turns out that vLLM is an excellent inference engine for this use case due to its supp

Saiyam Pathak · kubesimplify.hashnode.dev · Mar 14 · 15 min read

Here's What I Learned About Nemotron 3 Super - I Ran a 120B Parameter Model on Nvidia DGX Spark

There’s a moment when you’re watching a model load into memory. The progress bar is filling up to 87 gigabytes and it hits you. You’re about to talk to something that has 120 billion parameters. Not t
