Varada Santosh

Nov 27, 2025

Understanding CUDA GEMM: Foundations for Optimization

In our previous blog, we explored GPU computing fundamentals: memory hierarchies, thread organization, warps, memory coalescing, and kernel classification (memory-bound vs. compute-bound). In this blog, we apply these concepts to optimize GEMM (Gener...

vvnasantosh.hashnode.dev · 43 min read

#cuda-gemm #cutlass #cuda #cudathread #gemm #gpu-nvidia-amd #ai #ml
