Scan untrusted input before it reaches your model. Same engine everywhere — your code, your agent pipeline, your CI, or our API.
Enforcement engine, runtime agent protection, adversarial testing, and a learning loop that improves from real-world attacks.
61 deterministic patterns with weighted scoring across 13 attack categories. Runs in under 10ms with zero dependencies, and the same input always produces the same output.
Guard middleware wraps tool functions in your agent pipeline. Scans inputs and outputs for attacks. Warn, log, or block — fail-open by default.
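The guard middleware itself ships as an npm package, but the idea is language-neutral. Below is a conceptual sketch in Python of the same pattern: wrap a tool function, scan inputs and outputs, and warn, log, or block, failing open if the scanner errors. The names here (`guard`, `scan_fn`, `mode`, `threshold`) are illustrative assumptions, not the actual @safepaste/guard API.

```python
import functools

def guard(scan_fn, mode="warn", threshold=70):
    """Conceptual sketch of runtime tool protection (NOT the @safepaste/guard API).

    scan_fn takes a string and returns a 0-100 risk score.
    mode: "warn" prints a warning, "log" records silently, "block" raises.
    Fail-open: if the scanner itself errors, the tool call proceeds untouched.
    """
    def decorate(tool):
        @functools.wraps(tool)
        def wrapper(*args, **kwargs):
            def check(text, direction):
                try:
                    score = scan_fn(text)
                except Exception:
                    return  # fail-open: scanner errors never break the tool
                if score >= threshold:
                    msg = f"[guard] {direction} flagged (score={score})"
                    if mode == "block":
                        raise PermissionError(msg)
                    if mode == "warn":
                        print(msg)
                    # "log" mode would write to your logger here instead
            for a in args:
                if isinstance(a, str):
                    check(a, "input")   # scan inputs before the tool runs
            result = tool(*args, **kwargs)
            if isinstance(result, str):
                check(result, "output")  # scan outputs before they reach the model
            return result
        return wrapper
    return decorate
```

Decorating a tool then looks like `@guard(my_scan, mode="block")` above its definition; clean calls pass through unchanged.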
CLI simulates 78 attack variants across 13 categories against your prompts. CI/CD exit codes for automated security gating.
Telemetry captures real attacks. Human feedback improves detection. Versioned evaluation tracks progress. 655-record dataset with published metrics.
Chrome extension intercepts pastes on 8 AI chat sites. Fully local — zero data collection, zero network requests. Warns before attacks reach your AI.
Every detection includes matched pattern IDs, categories, weights, and explanations. No black boxes. Reproducible evaluation anyone can verify.
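A sketch of what consuming those explainable detections can look like. The match shape below is an assumption derived from the documented fields (pattern IDs, categories, weights, explanations); the actual key names and pattern IDs may differ.

```python
# Assumed match shape and hypothetical pattern IDs, for illustration only.
sample_matches = [
    {"id": "PI-014", "category": "instruction_override", "weight": 35,
     "explanation": "Attempts to override prior instructions"},
    {"id": "PI-041", "category": "system_prompt", "weight": 40,
     "explanation": "Probes for the hidden system prompt"},
]

def explain(matches):
    """Render each detection as a one-line, human-readable audit entry."""
    return [f'{m["id"]} [{m["category"]}, w={m["weight"]}]: {m["explanation"]}'
            for m in matches]

for line in explain(sample_matches):
    print(line)
```

Because every score decomposes into named patterns and weights, entries like these can go straight into audit logs or incident reports.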
Choose the path that fits what you're building. Same detection engine everywhere.
npm install @safepaste/core
Embed the detection engine in Node.js. One function call, <10ms, zero dependencies.
pip install safepaste
Same 61 patterns, identical results. Zero dependencies, Python 3.9+.
npm install @safepaste/guard
Wrap tool functions with runtime scanning. Warn, log, or block attacks on inputs and outputs.
npx @safepaste/test
Simulate 78 attack variants against your system prompts. CI/CD exit codes.
POST /v1/scan
Same 61-pattern engine as a hosted API. Go, Ruby, or anything with HTTP. Free tier included.
Base URL: https://api.safe-paste.com
# Scan text for prompt injection
curl -X POST https://api.safe-paste.com/v1/scan \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{"text": "Ignore previous instructions"}'
const res = await fetch("https://api.safe-paste.com/v1/scan", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${API_KEY}`
  },
  body: JSON.stringify({ text: "Ignore previous instructions" })
});
const { score, risk, matches } = await res.json();
// score: 82, risk: "high"
import requests

response = requests.post(
    "https://api.safe-paste.com/v1/scan",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}"
    },
    json={"text": "Ignore previous instructions"}
)
data = response.json()
# data["score"]: 82, data["risk"]: "high"
from safepaste import scan_prompt

result = scan_prompt("Ignore previous instructions")
# result.score: 82, result.risk: "high"
# result.flagged: True
# result.matches: (ScanMatch(...), ...)
{
  "score": 82,
  "risk": "high",
  "categories": {
    "instruction_override": 35,
    "system_prompt": 40
  },
  "matches": [...]
}
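The response above is all you need to gate untrusted text before it reaches your model. A minimal sketch of that gating logic follows, using the sample response as a literal; the `should_block` helper and the threshold of 70 are illustrative choices, not part of the API.

```python
# Sample /v1/scan response; in practice this comes from response.json().
scan_result = {
    "score": 82,
    "risk": "high",
    "categories": {"instruction_override": 35, "system_prompt": 40},
    "matches": [],
}

def should_block(result, threshold=70):
    """Gate untrusted text before forwarding it to the model.

    Blocks on a high risk label or a score at/above the threshold.
    The threshold is illustrative; tune it to your risk tolerance.
    """
    return result["risk"] == "high" or result["score"] >= threshold

if should_block(scan_result):
    print("blocked: refusing to forward text to the model")
```

A lower threshold trades more false positives for earlier interception; pairing the score check with the `risk` label catches cases where many low-weight patterns stack up.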
Works with ChatGPT, Claude, Gemini, Copilot, Groq, and Grok.
Add SafePaste from the Chrome Web Store. It activates automatically on supported AI chat sites.
SafePaste silently scans every paste. If the text is clean, nothing happens. You won't even notice it's there.
If prompt injection is detected, a warning modal appears with a risk score. You choose whether to proceed or cancel.
Start free. Scale when you're ready.
Get started with SafePaste in under 5 minutes. No credit card required.