From Bigram to MiniGPT: My First Transformer Language Model
Feb 16 · 3 min read

Introduction

When I started learning about LLMs, terms like self-attention and Transformers felt almost magical. So instead of only using libraries, I decided to build a language model step by step, starting from a very simple baseline and ending wi...