Identity + Governance = 100% Safety? Testing Combined Persona Approaches on Abliterated LLMs
3d ago · 6 min read · In our previous experiment, we showed that persona-level behavioral rules (Soul Spec) barely help when an LLM's safety training has been surgically removed: a +6pp refusal-rate improvement on abliterated models versus +33pp on aligned ones. The conclusion f...
