Discussion

nullpointerette

Aug 25, 2025

teaching a gpt to judge itself, part zero

lately, i’ve been obsessed with how companies are evaluating their large language models. openai has evals, which is “a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.” anthropic has written about "constituti...

nullpointerette.dev · 3 min read

