© 2026 Hashnode
Pre-training uses massive datasets and computational resources—often thousands of GPUs running for weeks or months—making it a domain dominated by top AI companies. Post-training is much lighter in cost and time (often days instead of months) and foc...

Reinforcement Learning from Human Feedback (RLHF) is a technique that combines reinforcement learning with human feedback to better align LLMs with human preferences. This blog covers most of the concepts that I learnt. Before diving deep, let's un...
