teaching a gpt to judge itself, part one
Sep 30, 2025 · 3 min read · last time, i laid out the theory: toy datasets, self-critique inspiration, and the idea of tracing evaluations. this week, i actually built it. the mvp version of reasonsaver is a pipeline that takes prompts, runs completions, scores them, and saves ...
Join discussion