promptexploit
SYS.NAME · promptexploit
SYS.AUTH · GUEST_OK
SYS.NODE · promptexploit.com
UPTIME · 142 days
TERMINAL · TTY0
STATUS · 200 OK
$ whoami
Breaking and defending LLMs. Notes on prompt injection, jailbreak research, and agent security — mostly so the defenses get better.
$ ls -la /posts/
- PI   indirect-prompt-injection-101.md    how untrusted content smuggles instructions into a model
- DEF  isolating-tool-output-in-agents.md  patterns that stop agents from trusting their inputs blindly
- RT   building-a-jailbreak-eval-harness.md  measuring model robustness with a repeatable test suite