👋 Welcome to Repello Guard Docs
Repello Guard is a real-time firewall for LLMs and LLM applications. Drop it in front of your LLM call, or behind it, and get instant checks that prevent prompt injections, jailbreaks, data leaks, and more.
Why we built Repello Guard
- LLMs hallucinate & leak
Modern LLMs can be jailbroken, reveal hidden prompts, or emit hateful text. Shipping without a guardrail is a brand risk waiting to happen.
- General safety APIs are black boxes
Most “moderation” services return a single boolean flag. Repello Guard exposes fine-grained violation types & calibrated risk levels so you can auto-block, soft-block, rewrite, or just log.
- Latency matters
End users bounce when responses crawl. Our P99 latency is < 100 ms from us-east-1, even under heavy load.
- Vendor freedom
Works with OpenAI®, Anthropic®, Google Gemini®, Mistral®, local GGUF models, or anything you spin up tomorrow.
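The idea of mapping fine-grained violation types and risk levels to different responses can be sketched as a small policy table. This is an illustration only: the violation names are the sample enums from the capabilities table in this document, while the risk-level labels (`low`, `medium`, `high`), the `POLICY` table, and the `decide` helper are hypothetical and not part of the Repello Guard API.

```python
from enum import Enum


class Action(Enum):
    LOG = "log"
    REWRITE = "rewrite"
    SOFT_BLOCK = "soft_block"
    BLOCK = "block"


# Hypothetical risk-level labels; the real service may use other names or a score.
RISK_ORDER = {"low": 0, "medium": 1, "high": 2}

# Hypothetical policy: for each violation type, a list of (minimum risk, action)
# rules ordered from the strictest threshold to the most lenient.
POLICY = {
    "PROMPT_INJECTION": [("medium", Action.BLOCK), ("low", Action.SOFT_BLOCK)],
    "SYSTEM_PROMPT_LEAK": [("low", Action.BLOCK)],
    "COMPETITOR_MENTION": [("low", Action.REWRITE)],
}


def decide(violation: str, risk: str) -> Action:
    """Return the first action whose risk threshold is met, else just log."""
    for min_risk, action in POLICY.get(violation, []):
        if RISK_ORDER[risk] >= RISK_ORDER[min_risk]:
            return action
    return Action.LOG
```

Keeping the policy in data rather than in branching code makes it easy to tighten or relax a single violation type without touching the call site.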
Key capabilities at a glance
| Capability | What it catches | Sample violation enums |
| --- | --- | --- |
| Jailbreak / prompt-injection detection | “Ignore previous instructions …” attempts | PROMPT_INJECTION, UNSAFE_PROMPT |
| Toxicity & hate speech, bias & stereotyping | Harassment, slurs, threats, demographic prejudice, political bias | TOXIC_PROMPT, UNSAFE_PROMPT |
| Competitor veto | Mentions or defamation of specified brands | COMPETITOR_MENTION |
| Banned topics | Anything you blacklist (e.g. self-harm, medical) | BANNED_TOPICS |
| System-prompt leakage | System-prompt content leaking into the response, scored by semantic overlap (0-1) | SYSTEM_PROMPT_LEAK |
| Policy check | Violations of your configured policies and guidelines | POLICY_VIOLATION |
How it fits in your stack
Use the API as a proxy around your LLM calls. Pass input prompts through Repello Guard to detect any of the violations above, and pass LLM outputs through it to prevent leaks, moderate content, and avoid damage to your brand reputation.
Contact us for an on-prem deployment in your Virtual Private Cloud.
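The input-and-output pattern described in this section can be sketched as a thin wrapper around any LLM call. This is a minimal sketch under stated assumptions: `GuardResult`, `guarded_completion`, the `check(text, stage)` callable, and the refusal message are all hypothetical names chosen for illustration, and the stubs at the bottom stand in for a real Repello Guard request and a real model so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class GuardResult:
    # Violation names found in the text; an empty list means the text passed.
    violations: List[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.violations


# Hypothetical signatures: check(text, stage) -> GuardResult, llm(prompt) -> str.
GuardFn = Callable[[str, str], GuardResult]
LLMFn = Callable[[str], str]

REFUSAL = "Sorry, I can't help with that request."


def guarded_completion(prompt: str, check: GuardFn, llm: LLMFn) -> str:
    """Check the prompt, call the model, then check the model's answer."""
    if not check(prompt, "input").ok:
        return REFUSAL  # blocked before the model is ever called
    answer = llm(prompt)
    if not check(answer, "output").ok:
        return REFUSAL  # blocked before the answer reaches the user
    return answer


# Stubs for illustration only; replace with real guard and model clients.
def fake_check(text: str, stage: str) -> GuardResult:
    if stage == "input" and "ignore previous instructions" in text.lower():
        return GuardResult(["PROMPT_INJECTION"])
    if stage == "output" and "SECRET SYSTEM PROMPT" in text:
        return GuardResult(["SYSTEM_PROMPT_LEAK"])
    return GuardResult()


def fake_llm(prompt: str) -> str:
    return f"Echo: {prompt}"
```

Passing the guard and the model in as callables keeps the wrapper independent of any particular LLM vendor, which matches the vendor-freedom goal stated earlier.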
Supported languages & scripts
Repello Guard accepts any UTF-8 string and identifies 100+ natural languages across Latin, Cyrillic, CJK, right-to-left, and other scripts.
| Metric | SLA / Typical |
| --- | --- |
| P99 latency | < 100 ms (us-east-1) |
| Throughput | 3,000 req/s per tenant (burstable) |
| Uptime | 99.9% (rolling 30 days) |