Rahul Sehrawat

Apr 13

Golden Sets, LLM-as-Judge, Human Review: Which Grading Style to Reach For

You have an eval set. You know your rubric. You want a pass rate. Now a new question appears: who decides whether each individual output is a pass? You, reading them one by one? A rule engine that checks for specific strings? Another model that grade...

ai-zero-to-hero.hashnode.dev · 18 min read

#ai #evaluation #grading #pm #product
