Discussion

Mickey Hu

Mar 22

Why Judge Calibration Matters: Sonnet vs Opus — a Case Study

I ran an experiment comparing two automatic judges I rely on for model evaluation: Sonnet and Opus. I needed a quick, repeatable way to rank generated outputs. I learned a few things the ha...

mickeynovels.hashnode.dev · 4 min read

#ai #programming

