@lwon

Lewis Won

@lwon · Singapore · Joined March 2026


Lewis Won's blogs

Clio Labs · cliolabs.hashnode.dev · 4 posts


Recently published

Lewis Won · cliolabs.hashnode.dev · May 17 · 19 min read

Debugging Multi-node GPU training

Table of Contents 1. The Problem 2. Physical Hardware 3. Intra-Node Communication: NVLink 4. Inter-Node Communication: InfiniBand and RDMA 5. PCIe Topology: Why GPU-NIC Affinity Matters 6. The S

Lewis Won · cliolabs.hashnode.dev · Apr 26 · 9 min read

From Loss=36 to Convergence: Integrating Whisper+Gemma2 into Megatron's TransformerEngine

When we started debugging our AudioLLM on the Megatron trainer, our loss started at 36. This did not make sens

Lewis Won · cliolabs.hashnode.dev · Mar 27 · 12 min read

The MDS Shim — Zero-Conversion Data Loading for 800+ Datasets

We have about 800 datasets in Mosaic MDS format, with tens of millions of multimodal samples — each one an audio clip, an instruction, and a target response — spread across thousands of compressed sha

Lewis Won · cliolabs.hashnode.dev · Mar 20 · 11 min read

Why We Moved an AudioLLM to Megatron

We trained our 10B-parameter AudioLLM — a Whisper speech encoder fused with a Gemma2 9B text decoder — using Megatron with Mosaic Streaming to handle training data. The wall The architecture is a Whis
