The multi-tier model strategy with deterministic routing is a smart choice — using a router to direct queries before hitting the LLM avoids the common mistake of sending everything through one expensive model. The layered rate limiting section was the most useful part for me since most portfolio chatbot posts skip the production resilience side entirely.