How Positional Encoding & Multi-Head Attention Power Transformers
Remember those jumbled sentences from school that you had to unscramble? You’d have a set of words in random order and your task would be to rearrange them to make a meaningful sentence. Now, imagine if you had to do that with an entire book. This is...

