@Jaygala223

Jay Gala

@Jaygala223 · Mumbai, India · Joined December 2022

AI Software Engineer at Intel

About

Currently working as an AI Software Engineer at Intel

Available for

Reach out to me on LinkedIn or email me at jaygala260@gmail.com

Jay Gala's blogs

Gala codes · galacodes.hashnode.dev · 15 posts

Recently published

Jay Gala in galacodes.hashnode.dev · Jun 23 · 15 min read

I Wrote a GPU Matmul Kernel From Scratch in Triton. Here's Everything I Learned

I recently started learning Triton, OpenAI's Python-based language for writing GPU kernels. My project: build a matrix multiplication kernel from scratch, step by step, until it's competitive with PyT

Jay Gala in galacodes.hashnode.dev · Nov 30, 2025 · 18 min read

GPUs: The Hardware That Power AI

You've probably used ChatGPT or Claude. Maybe you've even fine-tuned a small language model on your laptop. But have you ever wondered why training or even inferencing GPT or Claude requires tens of thousands of specialized chips aka GPUs instead of ...

Jay Gala in galacodes.hashnode.dev · Nov 5, 2025 · 15 min read

Speculative Decoding: From Theory to Implementation

Let's talk about speculative decoding, one of the most elegant optimization techniques in modern LLM inference. If you've ever wondered how to squeeze 2-3x more throughput from your language models without sacrificing output quality, you're in the ri...

Jay Gala in galacodes.hashnode.dev · May 30, 2025 · 6 min read

Bits Don't Lie: Datatypes in Modern LLMs

Let’s talk about the different datatypes that are being used in modern LLMs like GPT, LLaMa and the like. The most common ones that you might have heard: FP32, BFloat16, Float16, INT8, etc. These are all standard data types and available in PyTorch a...

Jay Gala in galacodes.hashnode.dev · Oct 26, 2024 · 7 min read

Introduction to LLM inferencing

Unless you’re living under a rock, you’ve probably heard of Large Language Models (LLMs) and even used a few of the popular applications like ChatGPT, Claude, Perplexity, etc. powered by these LLMs. So without going too deep into what LLMs are, let’s...