Why Identity-Framing Jailbreaks Bypass Your LLM Safety Filters
If you're building anything with LLMs right now, you need to understand a class of prompt injection that your safety filters almost certainly aren't catching. It's called identity-framing, and a recent example dubbed "The Gay Jailbreak" has been maki...
alan-west.hashnode.dev · 6 min read