Discussion

mayaandersson

Just a bored curious dev

Jun 4

Calibration set size for LLM-as-judge: when 50 traces is enough and when 200 is mandatory

TL;DR. The human-labeled calibration set you use to validate an LLM-as-judge does not need a fixed size. It needs a size that depends on how balanced your labels are. For roughly balanced binary crite
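To make the point about label balance concrete, here is a minimal sketch. It is not taken from the post: it assumes you measure judge-vs-human agreement separately within the rarer label class and put a 95% Wilson score interval around it. The helper names `wilson_interval` and `minority_class_width`, and the 0.9 agreement rate, are hypothetical and chosen only for illustration.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n == 0:
        # No samples at all: the proportion is completely unconstrained.
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def minority_class_width(n_traces: int, minority_rate: float,
                         agreement: float = 0.9) -> float:
    """Width of the interval on judge agreement within the rarer label class.

    `agreement` is an assumed judge-vs-human agreement rate, not a measured one.
    """
    n_minority = round(n_traces * minority_rate)
    hits = round(n_minority * agreement)
    lo, hi = wilson_interval(hits, n_minority)
    return hi - lo


# Compare a balanced set with two imbalanced ones (10% minority labels).
for n, rate in [(50, 0.5), (50, 0.1), (200, 0.1)]:
    print(f"n={n:>3} minority_rate={rate:.1f} "
          f"interval_width={minority_class_width(n, rate):.2f}")
```

Under these assumptions, the interval on the rarer class is much wider at 50 traces with a 10% minority rate than at 50 balanced traces, and growing the set to 200 narrows it again. The effective sample size is the count of minority-class labels, not the total number of traces.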

llmasajudge.hashnode.dev · 11 min read

#llm #ai #programming-blogs #developer

Responses

No responses yet.
