Building Your LLM Testing Suite
Your new RAG-based chatbot works perfectly on the five questions you've tested. The demo went great. You're feeling good. But what happens when a user asks about something completely out-of-domain? Or tries a subtle prompt injection to make it say so...
ivandimov.dev · 6 min read