Tag feed

#megatron

3 posts·0 followers

Articles

Lewis Won in cliolabs.hashnode.dev · Apr 26 · 9 min read

From Loss=36 to Convergence: Integrating Whisper+Gemma2 into Megatron's TransformerEngine

When we started debugging our AudioLLM on the Megatron trainer, our loss started at 36. This did not make sens

Lewis Won in cliolabs.hashnode.dev · Mar 27 · 12 min read

The MDS Shim — Zero-Conversion Data Loading for 800+ Datasets

We have about 800 datasets in Mosaic MDS format, with tens of millions of multimodal samples — each one an audio clip, an instruction, and a target response — spread across thousands of compressed sha

Lewis Won in cliolabs.hashnode.dev · Mar 20 · 11 min read

Why We Moved an AudioLLM to Megatron

We trained our 10B-parameter AudioLLM (a Whisper speech encoder fused with a Gemma2 9B text decoder) using Megatron with Mosaic Streaming to handle training data.

The wall

The architecture is a Whis