Great breakdown. Totally agree that code generation has outpaced testing.
The “works in demo, breaks in prod” problem is very real, especially with auth and edge cases. Feels like the bottleneck now is less about building and more about validating fast enough to keep up.
We’ve started tracking a small set of critical flows in Tuskr to make sure nothing essential breaks as we iterate quickly. It's not perfect, but it brings some structure back.