@ethanwritesai

ethanwalker

San Francisco, USA

Joined May 2026

ethanwalker's blogs

Promptfoo is a CI gate, not an eval framework. Treating it like one cost us $4,200.

ethanwalkerwrites.hashnode.dev

10 posts

Comments

Treating prompts like code is the right framing. We added a CI hook with Promptfoo that blocks any merge where regression-test scores drop by more than 5%. The hardest part wasn't writing the eval set; it was getting the team to maintain it as the prompts evolved. Curious whether your catalog covers the silent-degradation case, where prompts pass the eval but drift against the real-world input distribution.

Comment on article: Treating LLM prompts like code: a regression catalog for AI failures

May 21
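For anyone wondering what the 5% merge gate in the comment above might look like, here is a minimal sketch. It is not the commenter's actual hook and it does not call Promptfoo; it assumes the eval scores have already been exported as plain lists of floats, and it assumes "5%" means a relative drop in the mean score. The names `relative_drop`, `merge_allowed`, and `max_drop` are hypothetical.

```python
from statistics import mean


def relative_drop(baseline, current):
    """Fractional drop in mean score from baseline to current.

    Negative when the current run scores higher than the baseline.
    """
    base = mean(baseline)
    if base <= 0:
        raise ValueError("baseline mean must be positive")
    return (base - mean(current)) / base


def merge_allowed(baseline, current, max_drop=0.05):
    """Return True when the mean score fell by no more than max_drop.

    max_drop=0.05 mirrors the 5% threshold from the comment; reading it
    as a relative (not absolute) drop is an assumption of this sketch.
    """
    return relative_drop(baseline, current) <= max_drop


if __name__ == "__main__":
    # Hypothetical per-test scores from the main branch and from the PR.
    baseline_scores = [0.9, 0.8, 1.0]
    pr_scores = [0.9, 0.8, 0.95]
    # A CI job would exit non-zero to block the merge.
    raise SystemExit(0 if merge_allowed(baseline_scores, pr_scores) else 1)
```

A gate like this only compares against a fixed eval set, so it would not catch the silent-degradation case raised in the comment, where the eval still passes while real-world inputs drift.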
