#ai-yatra articles

Pankaj Kumar · my-ai-yatra.solveautomation.in · Aug 22, 2025 · 5 min read

Vision Yatra: Step 10 — Layer Normalization & Residual Connections: Stabilizing the Transformer

Hey everyone! 👋 I’m Pankaj, and welcome to Vision Yatra: Step 10, where we dive into two critical components that make Transformers train deep, stable, and fast: ✅ Layer Normalization ✅ Residual Connections. After mastering: Self-Attention, Multi-Head...

Pankaj Kumar · my-ai-yatra.solveautomation.in · Aug 22, 2025 · 6 min read

Vision Yatra: Step 16 — The Final Layer: Linear & Softmax — How Transformers Generate Words

Hey everyone! 👋 I'm Pankaj, and welcome to Vision Yatra: Step 16, the final step in our journey through the Transformer architecture. After 15 deep dives into: Self-Attention & Multi-Head Attention, Positional Encoding, Layer Normalization & Residu...

Pankaj Kumar · my-ai-yatra.solveautomation.in · Aug 22, 2025 · 5 min read

Vision Yatra: Step 13 — The Transformer Decoder: How Models Generate Text One Token at a Time

Hey everyone! 👋 I’m Pankaj, and welcome to Vision Yatra: Step 13, where we unlock the Transformer Decoder, the engine behind text generation in ChatGPT, GPT-4, and every major LLM. In the last post, we mastered the Encoder: how Transformers underst...

Pankaj Kumar · my-ai-yatra.solveautomation.in · Aug 22, 2025 · 6 min read

Vision Yatra: Step 12 — The Transformer Encoder: Full Architecture Breakdown

Hey everyone! 👋 I’m Pankaj, and welcome to Vision Yatra: Step 12, where we assemble all the pieces into the full Transformer Encoder, the engine behind BERT and other encoder-based models. In the last posts, we've mastered: Self-Attention & Multi-Head A...

Pankaj Kumar · my-ai-yatra.solveautomation.in · Aug 22, 2025 · 6 min read

Vision Yatra: Step 15 — Encoder-Decoder Attention: How Transformers "Translate" Between Languages

Hey everyone! 👋 I’m Pankaj, and welcome to Vision Yatra: Step 15, where we unlock the final piece of the Transformer puzzle: 🔑 Encoder-Decoder Attention. In the last posts, we mastered: Masked Multi-Head Self-Attention (how decoders generate text ...

Pankaj Kumar · my-ai-yatra.solveautomation.in · Aug 22, 2025 · 5 min read

Vision Yatra: Step 14 — Masked Multi-Head Self-Attention: How Transformers Generate Text Sequentially

Hey everyone! 👋 I’m Pankaj, and welcome to Vision Yatra: Step 14, where we unlock the most critical innovation in Transformer decoders: 🔒 Masked Multi-Head Self-Attention. In the last post, we saw how the Decoder generates text one token at a time ...

Pankaj Kumar · my-ai-yatra.solveautomation.in · Aug 22, 2025 · 4 min read

Vision Yatra: Step 11 — Layer Normalization in Action: Step-by-Step Math & Example

Hey everyone! 👋 I’m Pankaj, and welcome to Vision Yatra: Step 11, where we run the numbers on Layer Normalization, the stabilizing force behind Transformers. In the last post, we saw: what Layer Norm is, how it differs from Batch Norm, why Transfor...

Pankaj Kumar · my-ai-yatra.solveautomation.in · Aug 22, 2025 · 5 min read

Vision Yatra: Step 9 — Positional Encoding: How Transformers Know Word Order

Hey everyone! 👋 I’m Pankaj, and welcome to Vision Yatra: Step 9, where we solve one of the biggest mysteries in Transformers: ❓ "If Transformers process all words at once… how do they know the order?" We’ve seen how: Transformers use Self-Attentio...

Pankaj Kumar · my-ai-yatra.solveautomation.in · Aug 22, 2025 · 5 min read

Vision Yatra: Step 8 — Multi-Head Attention to Feed-Forward: How Outputs Are Combined

Hey everyone! 👋 I’m Pankaj, and welcome to Vision Yatra: Step 8, where we complete the Multi-Head Attention puzzle and see how Transformers combine multiple perspectives into one powerful vector. In the last post, we saw how Multi-Head Attention le...

Pankaj Kumar · my-ai-yatra.solveautomation.in · Aug 22, 2025 · 4 min read

Vision Yatra: Step 7 — Multi-Head Attention: How Transformers Learn Multiple Perspectives

Hey everyone! 👋 I’m Pankaj, and welcome to Vision Yatra: Step 7, where we dive into Multi-Head Attention, the secret behind Transformers’ deep understanding of language. In the last post, we saw how Self-Attention creates contextual vectors by letti...
