Counting tokens at scale using tiktoken
Tiktoken is one of the most widely used tokenizers. This is a nice, simple cookbook that shows how to use it.
Recently I was optimizing our token-counting function, which is often used to chunk data before sending it to embedding models (need prec...
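To make the idea concrete, here is a minimal sketch of batch token counting and token-based chunking. It is not the function from the post: the names `count_tokens`, `chunk_by_tokens`, and `tiktoken_example` are hypothetical, and the `cl100k_base` encoding and the 8191-token limit are assumptions that should be swapped for whatever your embedding model actually uses. The helpers take the encoder as a parameter, so they work with any tokenizer that exposes a batch-encode call.

```python
from typing import Callable, List, Sequence, Tuple


def count_tokens(
    texts: Sequence[str],
    encode_batch: Callable[[List[str]], List[List[int]]],
) -> List[int]:
    """Return the number of tokens in each text, encoding all texts in one batch call."""
    return [len(tokens) for tokens in encode_batch(list(texts))]


def chunk_by_tokens(tokens: Sequence[int], max_tokens: int) -> List[List[int]]:
    """Split one token sequence into consecutive pieces of at most max_tokens tokens."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return [list(tokens[i:i + max_tokens]) for i in range(0, len(tokens), max_tokens)]


def tiktoken_example(
    texts: Sequence[str],
    max_tokens: int = 8191,  # assumed model limit, adjust for your model
) -> Tuple[List[int], List[List[str]]]:
    """Count and chunk texts with tiktoken (third-party: pip install tiktoken)."""
    import tiktoken

    # Assumed encoding; the first call may download the vocabulary file.
    enc = tiktoken.get_encoding("cl100k_base")

    # encode_batch tokenizes the whole list across several threads.
    counts = count_tokens(texts, enc.encode_batch)

    # Decode each token chunk back to text. A chunk boundary can fall inside
    # a multi-byte character, in which case decode() inserts a replacement
    # character at the cut.
    chunks = [
        [enc.decode(piece) for piece in chunk_by_tokens(enc.encode(text), max_tokens)]
        for text in texts
    ]
    return counts, chunks
```

Calling `tiktoken_example(["some long document", "another one"])` would return the per-text token counts along with each text split into pieces that fit the assumed limit.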
dsdev.in · 2 min read