Shaun Liew in shaunliew.hashnode.dev · May 18 · 15 min read
Qwen3 Speculative Decoding on the DGX Spark: Two Models, Four Methods, One Surprising Lesson
The Problem: One Token at a Time. When you ask a large language model a question, it does not write the whole answer in a single step. It generates one small piece of text at a time. That piece is call...
Shaun Liew in shaunliew.hashnode.dev · May 16 · 14 min read
Running Qwen3-VL on DGX Spark: Transformers vs vLLM vs SGLang
I've been running Qwen3-VL locally for a while now, mostly with the standard from_pretrained() setup. It works, but it's slow. So I kept wondering whether switching to vLLM or SGLang would actually m...
Shaun Liew in shaunliew.hashnode.dev · Dec 7, 2025 · 4 min read
Fixing CUDA PTX Error When Running Qwen3-VL with vLLM on H200
Running vision-language models like Qwen3-VL with vLLM on high-end GPUs should be straightforward. Except when it's not. The Problem: I was setting up Qwen3-VL-8B-Instruct on our H200 cluster (8x H200, 143GB VRAM each) when I hit this error: vllm ser...
Shaun Liew in shaunliew.hashnode.dev · Jul 29, 2025 · 9 min read
Understanding neural style transfer
Neural Style Transfer: Turning the Mona Lisa into a Picasso. Have you ever wondered what the Mona Lisa would look like if Picasso had painted it? Neural style transfer makes this artistic dream a reality by combining the content of one image with the ...
Shaun Liew in shaunliew.hashnode.dev · Jul 29, 2025 · 10 min read
Performing an adversarial attack on images
Exploring adversarial attacks - where tiny, imperceptible modifications can completely deceive even the smartest neural networks. From Autoencoders to Adversarial Attacks: A New Kind of Magic. After my fascinating journey with Variational Autoencoders,...