Strong argument. We capped our public MCP at 5 tools after testing — every additional tool roughly doubled the wrong-tool-selection rate in our internal eval, even when descriptions were tight and disjoint. Your "tools competing for the same slot in the model's attention" framing matches our instinct exactly. Did you find the regression worse with overlapping verbs (search/lookup/find) or with overlapping nouns (signals/data/metrics)?
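For anyone wanting to reproduce this kind of measurement: the core of our eval was nothing fancier than replaying labeled prompts against each tool-set size and counting mismatches. A minimal sketch of that loop — `toy_route` is a stand-in for whatever model call actually does tool selection in your stack, and the case data is made up for illustration:

```python
# Sketch of a wrong-tool-selection eval. Hypothetical names throughout;
# swap toy_route for a real model-backed router.

def wrong_tool_rate(cases, tools, route):
    """cases: list of (prompt, expected_tool_name) pairs.
    route: callable(prompt, tools) -> chosen tool name.
    Returns the fraction of cases where the router picked the wrong tool."""
    wrong = sum(1 for prompt, expected in cases if route(prompt, tools) != expected)
    return wrong / len(cases)

def toy_route(prompt, tools):
    # Naive keyword-match router, purely to make the sketch runnable;
    # falls back to the first tool when nothing matches.
    for tool in tools:
        if tool in prompt:
            return tool
    return tools[0]

cases = [
    ("search the docs for rate limits", "search"),
    ("lookup the customer id", "lookup"),
    ("find recent error logs", "find"),
]

# Run the same labeled cases against growing tool sets and compare rates.
for tools in (["search", "lookup"], ["search", "lookup", "find"]):
    print(len(tools), wrong_tool_rate(cases, tools, toy_route))
```

The interesting curve is the rate as a function of `len(tools)` on a fixed case set — that's where we saw the roughly-doubling effect, though obviously a real router's failure mode is semantic confusion, not keyword misses.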