How does restoring the original values that the PII plugin sanitized work?
From what I know, Presidio only puts the original values back if we tell it every place where it needs to do the replacement.
Do you have a wrapper that takes the response from the LLM and puts the original values back, by keeping a map of the values that were sanitized?
Also, the custom pattern match has the fields Name, Regex, Score,
but it does not have a context field. Without context, the score is irrelevant, as regex matches are binary. And for the checkbox fields of the built-in categories, do we have a score defined as some default value? Do these filters have some threshold score?
Great questions. On the round-trip / mapping: you're right that Presidio won't restore on its own. Kong is the wrapper. When it sanitizes the request, it builds a little map of original to redacted value for each entity and keeps it in per-request context (OpenResty's ngx.ctx). On the way back it just does a string replace to swap the originals back in, then throws the map away. The map lives for exactly one request; nothing is cached or persisted.
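To make the round trip concrete, here's a minimal Python sketch of the idea. It is not Kong's actual code (the plugin runs in Lua on OpenResty), and everything in it is an assumption for illustration: the `sanitize` and `restore` names, the `<EMAIL_n>` placeholder format, and the single email regex standing in for the PII service.

```python
import re

# Hypothetical detector: one email regex stands in for the PII service.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def sanitize(text):
    """Redact each match and return (redacted_text, mapping)."""
    mapping = {}  # placeholder -> original; meant to live for one request only
    seen = {}     # original -> placeholder, so repeated values share a placeholder

    def _swap(match):
        original = match.group(0)
        if original not in seen:
            placeholder = f"<EMAIL_{len(seen) + 1}>"  # assumed placeholder format
            seen[original] = placeholder
            mapping[placeholder] = original
        return seen[original]

    return EMAIL.sub(_swap, text), mapping


def restore(text, mapping):
    """Plain string replace: swap every placeholder back to its original."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


# One request/response cycle; the mapping is discarded afterwards.
redacted, mapping = sanitize("Mail alice@example.com and bob@example.org today")
reply = restore("I will write to <EMAIL_2> first.", mapping)
```

The restore step only works if the LLM echoes the placeholder back unchanged, which is why a plain string replace is enough here.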
On score being irrelevant without context: there's genuinely no context field, you're right. But score still does something. It's not a firing gate (the regex matches or it doesn't), it's a tiebreaker. When two recognizers flag the same span, the highest score wins. Built-in Presidio recognizers often sit around 0.85, and the custom default is 0.5, so a custom pattern can lose to a built-in on an overlapping span. If yours "isn't taking effect," bump the score. Just don't treat it as a sensitivity dial: lower doesn't catch more, and higher doesn't suppress weak matches.
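Here's a simplified Python sketch of the tiebreak, assuming a "highest score wins among overlapping spans" rule. The `Match` type and `resolve` function are hypothetical and are not Presidio's or Kong's actual conflict-resolution code; the 0.85 and 0.5 scores are just the values mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    entity: str
    start: int
    end: int
    score: float


def overlaps(a, b):
    """True when the two character spans share at least one position."""
    return a.start < b.end and b.start < a.end


def resolve(matches):
    """Among overlapping matches keep only the highest-scoring one."""
    kept = []
    # Visit matches from highest to lowest score; a match survives only
    # if it does not overlap one that has already been kept.
    for m in sorted(matches, key=lambda m: m.score, reverse=True):
        if not any(overlaps(m, k) for k in kept):
            kept.append(m)
    return sorted(kept, key=lambda m: m.start)


builtin = Match("PERSON", 0, 10, 0.85)
custom = Match("EMPLOYEE_ID", 0, 10, 0.5)   # custom default score
boosted = Match("EMPLOYEE_ID", 0, 10, 0.9)  # same pattern, score bumped

loses = resolve([custom, builtin])   # built-in wins the span
wins = resolve([boosted, builtin])   # custom wins after the bump
```

Note that the score never decides whether the custom pattern matched; it only decides which label keeps a contested span.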
On built-in categories having a threshold: they're a fixed list, not individually score-configurable from the plugin. Any actual score_threshold cutoff lives inside the PII service (Presidio's AnalyzerEngine), not the Kong config, and it isn't documented. The plugin applies no cutoff of its own; it just forwards your patterns and scores straight through.
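For contrast with the tiebreak, here's roughly what a service-side cutoff would do if one were configured. This is a hypothetical Python sketch: `apply_threshold` is an illustrative name, and the 0.6 value is made up, since the service's real setting isn't documented.

```python
def apply_threshold(results, score_threshold):
    """Drop every (entity, score) result that scores below the cutoff."""
    return [(entity, score) for entity, score in results if score >= score_threshold]


results = [("PERSON", 0.85), ("CUSTOM_ID", 0.5)]

# With no effective cutoff, everything passes through.
all_kept = apply_threshold(results, 0.0)

# With an illustrative cutoff of 0.6, the custom pattern at its default
# score of 0.5 would be dropped every time, never only sometimes.
filtered = apply_threshold(results, 0.6)
```

Because a regex recognizer emits one fixed score, a cutoff like this is all-or-nothing for that pattern, which is the binary behaviour you pointed out.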