Transformers and Attention Mechanisms: From Basics to GPTMini
Transformers have revolutionized natural language processing by using attention mechanisms to model long-range dependencies. In this post, we’ll journey from the origins of attention to building a mini GPT model (“GPTMini”) from scratch. We’ll start ...
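Before diving in, here is a minimal sketch of the core operation behind attention: scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. This is an illustrative NumPy version (the function name and toy shapes are my own, not from the post), not the full implementation we'll build later.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Sketch of scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarity scores
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted sum of values

# Toy self-attention: 3 tokens, embedding dimension 4
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (3, 4)
```

Each output row is a convex combination of the value rows, with weights determined by how similar that token's query is to every token's key; this is what lets any token attend to any other, regardless of distance.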
blog.abhinavtb.com · 21 min read