promptexploit

i'm feeling ★ adversarial ★

$ cat about.md

breaking models so defenses get boring

promptexploit is a field notebook for LLM security: prompt injection, jailbreak research, agent security, evals, and defensive patterns that move trust boundaries back into code.

$ whoami --long

This blog treats every model integration as a security system. The writing is technical, practical, and biased toward controls that still work when a model reads hostile text.

research notes

Short writeups on injection classes, jailbreak testing, and how attacks show up in real agent workflows.

defense patterns

Isolation, provenance, least privilege, schema gates, review loops, and eval harnesses for regression testing.
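As a rough illustration of two of these patterns together, here is a minimal sketch of a schema gate combined with a least-privilege tool allowlist. The tool names, the `ALLOWED_TOOLS` table, `GateError`, and `gate_tool_call` are all hypothetical, invented for this example rather than taken from any particular framework. The idea is only that the decision about what may run sits in ordinary code, not in the prompt.

```python
import json

# Hypothetical allowlist: tool name -> exact argument names and types.
# Anything not listed here is never executed (least privilege).
ALLOWED_TOOLS = {
    "search_docs": {"query": str},
    "get_ticket": {"ticket_id": int},
}


class GateError(ValueError):
    """Raised when a model-proposed tool call fails validation."""


def gate_tool_call(raw: str) -> dict:
    """Parse and validate a model-proposed tool call before anything runs."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise GateError("not valid JSON") from exc

    if not isinstance(call, dict) or set(call) != {"tool", "args"}:
        raise GateError("expected an object with exactly 'tool' and 'args'")

    tool = call["tool"]
    if not isinstance(tool, str) or tool not in ALLOWED_TOOLS:
        raise GateError(f"tool not allowlisted: {tool!r}")

    schema = ALLOWED_TOOLS[tool]
    args = call["args"]
    if not isinstance(args, dict) or set(args) != set(schema):
        raise GateError("argument names do not match the schema")

    for name, expected in schema.items():
        value = args[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise GateError(f"wrong type for argument {name!r}")

    return {"tool": tool, "args": args}
```

A caller would run every model-proposed action through `gate_tool_call` and treat `GateError` as a hard stop, so a successful injection can at most request something the allowlist already permits.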

threat model

Assume untrusted content can contain instructions, and assume the model can be persuaded at least once.

contact

Send responsible disclosure reports and collaboration notes to hello@promptexploit.com.