🧵 LLM Inference Optimization — a short thread
Jan 14 · 2 min read

LLM inference looks deceptively simple: run a forward pass, generate a token, repeat. At scale, though, it quickly becomes a systems problem dominated by memory, scheduling, and latency rather than raw compute. Metrics like Time-to-First-Token (TTFT) and inter-token latency (ITL) capture this far better than raw throughput alone.
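
To make those two metrics concrete, here's a minimal sketch of how TTFT and ITL fall out of a token-by-token decode loop. The `generate_step` callable, `measure_latency` helper, and the dummy model are illustrative assumptions, not any particular runtime's API:

```python
import time

def measure_latency(generate_step, prompt_ids, max_new_tokens=32):
    """Time a token-by-token decode loop.

    `generate_step` is a hypothetical callable that takes the token ids
    generated so far and returns the next token id -- a stand-in for one
    forward pass of whatever model/runtime you actually use.
    """
    ids = list(prompt_ids)
    token_times = []

    start = time.perf_counter()
    for _ in range(max_new_tokens):
        next_id = generate_step(ids)          # one forward pass + sampling
        token_times.append(time.perf_counter())
        ids.append(next_id)

    ttft = token_times[0] - start             # time to first token
    gaps = [b - a for a, b in zip(token_times, token_times[1:])]
    avg_itl = sum(gaps) / len(gaps) if gaps else 0.0  # mean inter-token latency
    return ttft, avg_itl

if __name__ == "__main__":
    # Dummy "model" that always emits token id 0, just to exercise the loop.
    dummy_step = lambda ids: 0
    ttft, avg_itl = measure_latency(dummy_step, prompt_ids=[1, 2, 3])
    print(f"TTFT: {ttft*1000:.2f} ms, mean ITL: {avg_itl*1000:.4f} ms")
```

With a real model, TTFT is dominated by the prompt (prefill) pass, while ITL reflects the per-token decode cost, which is why the two are tracked separately.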

