Vision Yatra: Step 14 - Masked Multi-Head Self-Attention: How Transformers Generate Text Sequentially
Hey everyone! I'm Pankaj, and welcome to Vision Yatra: Step 14, where we unlock the most critical innovation in Transformer decoders:
Masked Multi-Head Self-Attention
In the last post, we saw how the Decoder generates text one token at a time ...
my-ai-yatra.solveautomation.in (5 min read)