Agent-Skills.md

Agent Skills: Prompt Injection Detector Skill

Prompt injection detection and prevention for secure LLM applications

UncategorizedID: a5c-ai/babysitter/prompt-injection-detector

Author

a5c-ai

https://github.com/a5c-ai View all skills

Repository

a5c-ai/babysitter

a5c-aiLicense: MIT

244

Install this agent skill to your local

pnpm dlx add-skill https://github.com/a5c-ai/babysitter/tree/HEAD/library/specializations/ai-agents-conversational/skills/prompt-injection-detector

Skill Files

Browse the full folder contents for prompt-injection-detector.

Loading file tree…

library/specializations/ai-agents-conversational/skills/prompt-injection-detector/SKILL.md

Skill Metadata

Name: prompt-injection-detector
Description: Prompt injection detection and prevention for secure LLM applications

Prompt Injection Detector Skill

Capabilities

Detect prompt injection attempts
Implement input sanitization
Configure detection classifiers
Design defense layers
Implement canary token detection
Create injection logging and alerting

Target Processes

prompt-injection-defense
tool-safety-validation

Implementation Details

Detection Methods

Pattern Matching: Known injection patterns
ML Classifiers: Trained injection detectors
Canary Tokens: Detect instruction override
LLM-Based: Use LLM to detect manipulation
Perplexity Analysis: Unusual input patterns

Defense Strategies

Input preprocessing
Prompt structure design
Output validation
Sandboxed execution
Multi-layer defense

Configuration Options

Detection threshold
Pattern rules
Classifier model
Action policies
Alerting settings

Best Practices

Defense in depth
Regular pattern updates
Monitor false positives
Test with red-team inputs

Dependencies

rebuff (optional)
transformers
Custom classifiers