LLM Deep Dive — Part 2 Post Training
This post is part of a series based on Andrej Karpathy’s “Deep Dive into LLMs like ChatGPT.” Part 1 of the series is here.
A base model can be thought of as a token-level simulator of the internet. At its core, it does not reason or plan — it simply predicts the ...
moderndataarchitect.hashnode.dev · 8 min read