Agree. This is very close to what I’ve seen while building Origin. Once you connect AI to tools, files, and workspace state, it becomes much more of a system design problem than a model problem.
Usually the first failures I notice are bad context and broken state handling, not the model itself. That’s also why traceability matters so much: if you can’t see what changed and what caused it, debugging turns into guesswork.
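
As a rough illustration of that kind of traceability (a toy sketch, not how Origin actually does it; `TracedState`, `Change`, and the `cause` strings are all made-up names), every write to workspace state can carry a note about what triggered it:

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Change:
    key: str
    old: Any
    new: Any
    cause: str  # hypothetical label for the tool call or step behind the write


@dataclass
class TracedState:
    data: dict[str, Any] = field(default_factory=dict)
    log: list[Change] = field(default_factory=list)

    def set(self, key: str, value: Any, cause: str) -> None:
        # Log the old and new value together with the cause, then apply the write.
        self.log.append(Change(key, self.data.get(key), value, cause))
        self.data[key] = value

    def history(self, key: str) -> list[Change]:
        # Every recorded change to one key, oldest first.
        return [c for c in self.log if c.key == key]


state = TracedState()
state.set("open_file", "notes.md", cause="tool:open_file")
state.set("open_file", "todo.md", cause="tool:search")

for change in state.history("open_file"):
    print(f"{change.key}: {change.old!r} -> {change.new!r} (caused by {change.cause})")
```

With something like this in place, "why is the wrong file open?" becomes a lookup in the log instead of a guess.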