promptexploit
$ cat about.md
breaking models so defenses get boring
promptexploit is a field notebook for LLM security: prompt injection, jailbreak research, agent security, evals, and defensive patterns that move trust boundaries back into code.
This blog treats every model integration as a security system. The writing is technical, practical, and biased toward controls that still work when a model reads hostile text.
Short writeups on injection classes, jailbreak testing, and how attacks show up in real agent workflows.
Defenses covered: isolation, provenance, least privilege, schema gates, review loops, and eval harnesses for regression testing.
Assume untrusted content can contain instructions, and assume the model can be persuaded at least once.
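To make that assumption concrete, here is a minimal sketch of a schema gate: the model only proposes a tool call, and plain code decides whether it runs. The tool names, the `ALLOWED_TOOLS` table, `GateError`, and `gate_tool_call` are all hypothetical names made up for illustration, not part of any particular framework.

```python
import json

# Hypothetical allowlist: tool name -> {argument name: required type}.
# Anything not listed here cannot be called, no matter what the model says.
ALLOWED_TOOLS = {
    "search_docs": {"query": str},
    "get_ticket": {"ticket_id": int},
}


class GateError(ValueError):
    """Raised when a model-proposed tool call fails validation."""


def gate_tool_call(raw: str) -> dict:
    """Parse and validate a model-proposed tool call; return it only if it passes."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise GateError("output is not valid JSON") from exc

    # Require exactly the expected top-level shape, nothing extra.
    if not isinstance(call, dict) or set(call) != {"tool", "args"}:
        raise GateError("expected exactly the keys 'tool' and 'args'")

    tool, args = call["tool"], call["args"]
    if not isinstance(tool, str) or tool not in ALLOWED_TOOLS:
        raise GateError("tool is not on the allowlist")
    if not isinstance(args, dict):
        raise GateError("'args' must be an object")

    schema = ALLOWED_TOOLS[tool]
    if set(args) != set(schema):
        raise GateError("unexpected or missing arguments")
    for name, expected in schema.items():
        # Exact type match, so True/False cannot pass as an int.
        if type(args[name]) is not expected:
            raise GateError(f"argument {name!r} has the wrong type")

    return call
```

The design choice is that a persuaded model can still only emit a call that fits the allowlist and argument schema; a request for an unlisted tool such as `send_email` raises `GateError` before anything executes.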
Responsible disclosure and collaboration notes can go to hello@promptexploit.com.