Gemma 4 MTP Drafter on DGX Spark: 2.89x Speedup for Dense 31B — No Quality Loss
An 870 MB drafter model took Dense 31B from 6.5 to 18.8 tok/s. No model swap, no training, no quality degradation. If you have a DGX Spark, there's no reason not to use this.
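As a quick sanity check on the headline number, the speedup is simply the ratio of the two throughputs quoted above. The sketch below recomputes it. The function names and the 1,000-token response length are illustrative assumptions, not part of the benchmark.

```python
def speedup(baseline_tok_s: float, drafted_tok_s: float) -> float:
    """Ratio of decode throughput with the drafter to throughput without it."""
    return drafted_tok_s / baseline_tok_s


def seconds_for(tokens: int, tok_s: float) -> float:
    """Wall-clock time to generate `tokens` tokens at a steady `tok_s` rate."""
    return tokens / tok_s


# Throughputs quoted in the post for Dense 31B on DGX Spark.
baseline = 6.5       # tok/s without the drafter
with_drafter = 18.8  # tok/s with the 870 MB drafter

# Hypothetical response length, chosen only to make the difference tangible.
response_tokens = 1000

print(f"speedup: {speedup(baseline, with_drafter):.2f}x")  # speedup: 2.89x
print(
    f"{response_tokens} tokens: "
    f"{seconds_for(response_tokens, baseline):.0f} s -> "
    f"{seconds_for(response_tokens, with_drafter):.0f} s"
)
```

This assumes a steady decode rate and ignores prompt processing time, so the per-response timings are rough estimates rather than measurements.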
Key Results
Model
Fra
devsnack.hashnode.dev · 8 min read