apurvak3.hashnode.dev

FlashAttention: Making Transformers Faster and More Memory-Efficient
Large Language Models (LLMs) like GPT, BERT, and modern Transformers rely heavily on the self-attention mechanism. While powerful, self-attention is also the biggest performance bottleneck when working with long sequences. In 2022, Tri Dao and collab...
Dec 26, 2025 · 5 min read
🧠 Transformers Explained Simply: From Word2Vec to Multi-Head Attention
A deep dive into the paper "Attention Is All You Need" and how modern NLP models like BERT and GPT evolved from it. 🌟 Introduction If you've ever wondered how models like BERT, GPT, or T5 understand language, the answer lies in one architectur...
Nov 2, 2025 · 5 min read
🧠 Neural Machine Translation of Rare Words with Subword Units
By Apurva Kanth · Published on Hashnode
💭 The Problem: When Neural Networks Don't Know a Word. Traditional Neural Machine Translation (NMT) systems work with a fixed vocabulary — often limited to around 30,000–50,000 words. But languages are full of ra...
Oct 29, 2025 · 4 min read
🧬 C2S-Scale 27B: When AI Learns the Language of Cells
Working in the AI and automation space, I've seen how scaling models transforms their reasoning capabilities. But applying those same scaling principles to biological systems was something I neve...
Oct 17, 2025 · 4 min read
🎬 Building a Context-Based Movie Recommendation System using Cosine Similarity
🧠 Introduction Have you ever wondered how Netflix or Prime Video knows exactly what you might like next? In this blog, I'll walk you through how I built a content-based movie recommendation system using Python, Pandas, and Cosine Similarity — from da...
Oct 12, 2025 · 3 min read
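The idea behind that last post, ranking items by the cosine similarity of their feature vectors, can be sketched minimally as follows. This is a toy illustration, not code from the post: the movie titles and the binary genre vectors are hypothetical placeholders.

```python
import numpy as np

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (||a|| * ||b||)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical binary genre features: [action, comedy, sci-fi, drama]
movies = {
    "Inception":    np.array([1, 0, 1, 1]),
    "Interstellar": np.array([0, 0, 1, 1]),
    "The Hangover": np.array([0, 1, 0, 0]),
}

# Score every other movie against the one the user just watched,
# then rank candidates by similarity, highest first.
query = movies["Inception"]
scores = {title: cosine_similarity(query, vec)
          for title, vec in movies.items() if title != "Inception"}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Here "Interstellar" ranks above "The Hangover" because it shares the sci-fi and drama components with the query, while the comedy vector is orthogonal to it.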