How I Built a BPE Tokenizer from Scratch (Part 1: Training)
Okay so before we get into the code, let me tell you why I even did this.
I could've just used a library. tiktoken exists. HuggingFace tokenizers exist. But I wanted to actually get it — like really u