The chatbot leaking internal emails example captures the core tension: AI systems have wider attack surfaces than traditional software because their behavior is shaped by training data, not just code.
Three threat categories worth highlighting:
Indirect prompt injection via retrieval - RAG pipelines are attack vectors if untrusted documents enter the knowledge base. Models do not distinguish between user instruction and retrieved context at inference time.
Model weight tampering - For local deployments, model files ARE code. Supply chain attacks that modify weights create backdoors that survive code review.
Training data extraction - Attackers do not need training data access if they can craft inputs that cause the model to regurgitate training examples.
The article focuses on inference-time threats, but the attack surface spans training, fine-tuning, and deployment. Threat modeling for AI needs the full pipeline, not just the API boundary.
The chatbot leaking internal emails example captures the core tension: AI systems have wider attack surfaces than traditional software because their behavior is shaped by training data, not just code.
Three threat categories worth highlighting:
Indirect prompt injection via retrieval - RAG pipelines are attack vectors if untrusted documents enter the knowledge base. Models do not distinguish between user instruction and retrieved context at inference time.
Model weight tampering - For local deployments, model files ARE code. Supply chain attacks that modify weights create backdoors that survive code review.
Training data extraction - Attackers do not need training data access if they can craft inputs that cause the model to regurgitate training examples.
The article focuses on inference-time threats, but the attack surface spans training, fine-tuning, and deployment. Threat modeling for AI needs the full pipeline, not just the API boundary.