Multi-Head, Multi-Query, and Grouped-Query Attention: Which One Should You Use?
In this blog, we will discuss different types of attention mechanisms. First, we will introduce Multi-Head and Multi-Query Attention and their limitations. Then we will discuss Grouped-Query Attention (GQA) and why it is needed, and we wil...
sisirdhakal.hashnode.dev · 6 min read