Great question, and you're right that it's the hardest edge case.
This article covers the extraction side (structured citation data out of inconsistent formats). The detection/verification side is a separate system I've been building on top of it. Short version: I use a regex-first, LLM-second hybrid for determining whether a specific source was actually cited. Regex catches ~73% of verbatim citations with <2% false positives, and an LLM classifier handles the paraphrased references regex misses. The hybrid gets to ~96% true positive rate at ~4% false positive.
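To make the regex-first, LLM-second shape concrete, here's a rough sketch. The patterns and the `llm_classify` stub are illustrative placeholders, not the real system; in production the LLM call would go to an actual classifier rather than a string check.

```python
import re

# Toy patterns for verbatim citation forms; the real set is larger and tuned.
CITATION_PATTERNS = [
    re.compile(r"\[(\d+)\]"),                               # [12]-style numeric citations
    re.compile(r"\(([A-Z][A-Za-z-]+ et al\.?,? \d{4})\)"),  # (Smith et al., 2025)
    re.compile(r"https?://[^\s)\]]+"),                      # bare URLs
]

def llm_classify(text: str, source: str) -> bool:
    """Placeholder for the LLM call that judges paraphrased references."""
    return source.lower() in text.lower()  # stand-in logic for this sketch only

def was_cited(text: str, source: str) -> bool:
    # Pass 1: cheap regex check catches verbatim citations.
    for pat in CITATION_PATTERNS:
        for match in pat.finditer(text):
            if source.lower() in match.group(0).lower():
                return True
    # Pass 2: fall back to the LLM only when regex finds nothing,
    # which is what keeps cost down while recovering paraphrased mentions.
    return llm_classify(text, source)
```

The ordering is the point: regex handles the high-precision bulk cheaply, and the LLM only sees the residue.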
On fabricated citations specifically, the research is sobering. Maheshwari et al. (2025) at Amazon found that generative search engines only achieve about 74% citation accuracy. The Princeton ALCE benchmark found that even the best models lack complete citation support 50% of the time. So the problem isn't hypothetical, it's baseline behavior.
For monitoring, the known-domain matching in Pass 1 sidesteps hallucination by design since you're matching against entities you know are real. Where it bites is in full extraction (Pass 2), where a model can generate plausible URLs to pages that don't exist. I've seen ChatGPT do this with correct domains but fabricated paths.
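The "by design" part is easy to see in a sketch. Matching only against an allowlist of domains you already know exist means a hallucinated domain can never produce a hit (the `KNOWN_DOMAINS` set here is a made-up example):

```python
from urllib.parse import urlparse

# Example allowlist; in practice this is the set of entities being monitored.
KNOWN_DOMAINS = {"nytimes.com", "arxiv.org", "example.com"}

def known_domain_hits(urls):
    hits = []
    for url in urls:
        host = urlparse(url).netloc.lower()
        # Strip a leading "www." so www.arxiv.org matches arxiv.org.
        host = host[4:] if host.startswith("www.") else host
        if host in KNOWN_DOMAINS:
            hits.append(url)
    return hits
```

Note this is exactly why the correct-domain/fabricated-path case slips through: the host checks out, so only a verification layer can catch the dead path.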
A proper verification layer (HEAD requests to confirm URLs resolve, plus content matching to confirm the source actually supports the claim) is the logical next step, but it's a separate system I haven't shipped yet. That's the gap between "citation extraction" and "citation trust," and I think most people in this space are underestimating it.
Planning to write up the detection/verification layer in more detail soon. Appreciate you flagging the exact problem that makes it necessary.