The Private Language
Pedro Eugenio in theweeklyprompt.news · Apr 27 · 4 min read
Two papers dropped this week that fit together like diagnosis and experiment. One counts what's broken. The other tries to fix it in a way nobody expected. Start with the numbers. A new study analyzed token consumption across eight frontier models on...

The One-in-Three Problem
Pedro Eugenio in theweeklyprompt.news · Apr 17 · 3 min read
The demos look great. The videos are impressive. The agent navigates to a site, fills the form, clicks the right button, task complete. That is a real thing. It happens. Then a new benchmark drops, measures 153 everyday tasks across 144 live websites...

The Reasoning Ceiling
Pedro Eugenio in theweeklyprompt.news · Apr 17 · 4 min read
Two things happened in AI research this week, and they point in opposite directions. Inference got meaningfully faster. And several papers made it clearer than ever exactly where reasoning models break, no matter how fast you run them. Start with the...

The MCP Token Tax
Pedro Eugenio in theweeklyprompt.news · Apr 17 · 5 min read
You connect an agent to three MCP servers, GitHub, Slack, Sentry. Feel like you've built something solid. Then someone counts the actual token spend before the agent does anything at all. The number is 143,000. Out of 200,000. On tool schemas that ha...

Agents Teaching Agents
Pedro Eugenio in theweeklyprompt.news · Apr 17 · 4 min read
Every AI agent system you've seen has the same invisible problem. The skills are frozen. From the moment you deploy, the way your agent handles a complex workflow, the tool-call sequences it knows, the failure modes it avoids, all of it is locked in ...