Discussion on "Your LLM Judge Has Opinions. They're Not About Quality." | Hashnode

Frank Chen

AI observability and evals

Apr 28

Your LLM Judge Has Opinions. They're Not About Quality.

When your eval score goes up, the natural conclusion is that your model got better. But there's another explanation: your LLM judge has systematic biases, and your latest change happened to produce ou

respan.hashnode.dev

13 min read

#ai #llm #ai-development #ai-eval #ai-evaluation

Responses

No responses yet.