I've been shipping RAG + prompt engineering for most of my LLM work and it's been fine. But everyone keeps saying "yeah you really need fine-tuning for production" and I genuinely don't get the tradeoff. Fine-tuning costs money, takes time, and requires good labeled data. Prompting is basically free to iterate on. When does the math actually flip? What am I missing here?