Agent Skills: CM - CASS Memory System

CASS Memory System - procedural memory for AI coding agents. Three-layer cognitive architecture with confidence decay, anti-pattern learning, cross-agent knowledge transfer, and a trauma guard safety system. Bun/TypeScript CLI.

CM - CASS Memory System

Procedural memory for AI coding agents. Transforms scattered sessions into persistent, cross-agent memory. Uses a three-layer cognitive architecture that mirrors human expertise development.

Why This Exists

AI coding agents accumulate valuable knowledge, but that knowledge is:

  • Trapped in sessions - Context lost when session ends
  • Agent-specific - Claude doesn't know what Cursor learned
  • Unstructured - Raw logs aren't actionable guidance
  • Subject to collapse - Naive summarization loses critical details

You've solved auth bugs three times this month across different agents. Each time you started from scratch.

CM solves this with cross-agent learning: a pattern discovered in Cursor is immediately available to Claude Code.


Three-Layer Cognitive Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                    EPISODIC MEMORY (cass)                           │
│   Raw session logs from all agents — the "ground truth"             │
│   Claude Code │ Codex │ Cursor │ Aider │ PI │ Gemini │ ChatGPT │ ...│
└───────────────────────────┬─────────────────────────────────────────┘
                            │ cass search
                            ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    WORKING MEMORY (Diary)                           │
│   Structured session summaries: accomplishments, decisions, etc.    │
└───────────────────────────┬─────────────────────────────────────────┘
                            │ reflect + curate (automated)
                            ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    PROCEDURAL MEMORY (Playbook)                     │
│   Distilled rules with confidence tracking and decay                │
└─────────────────────────────────────────────────────────────────────┘

Every agent's sessions feed the shared memory. A pattern discovered in Cursor automatically helps Claude Code in its next session.


The One Command You Need

cm context "<your task>" --json

Run this before starting any non-trivial task. Returns:

  • relevantBullets - Rules from playbook scored by task relevance
  • antiPatterns - Things that have caused problems
  • historySnippets - Past sessions (yours and other agents')
  • suggestedCassQueries - Deeper investigation searches

Filtering History by Source

historySnippets[].origin.kind is "local" or "remote". Remote hits include origin.host:

{
  "historySnippets": [
    {
      "source_path": "~/.claude/sessions/session-001.jsonl",
      "origin": { "kind": "local" }
    },
    {
      "source_path": "/home/user/.codex/sessions/session.jsonl",
      "origin": { "kind": "remote", "host": "workstation" }
    }
  ]
}
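
To act on the origin field, a caller can filter snippets after parsing the JSON. The sketch below is illustrative only: the interfaces cover just the fields shown in the sample above, and `filterByOrigin` is a hypothetical helper, not part of cm.

```typescript
// Shapes inferred from the sample JSON above; no other fields are assumed.
interface SnippetOrigin {
  kind: "local" | "remote";
  host?: string; // present on remote hits
}

interface HistorySnippet {
  source_path: string;
  origin: SnippetOrigin;
}

// Hypothetical helper: keep snippets of one origin kind,
// optionally narrowing remote hits to a single host.
function filterByOrigin(
  snippets: HistorySnippet[],
  kind: SnippetOrigin["kind"],
  host?: string,
): HistorySnippet[] {
  return snippets.filter(
    (s) => s.origin.kind === kind && (host === undefined || s.origin.host === host),
  );
}

const sampleSnippets: HistorySnippet[] = [
  { source_path: "~/.claude/sessions/session-001.jsonl", origin: { kind: "local" } },
  {
    source_path: "/home/user/.codex/sessions/session.jsonl",
    origin: { kind: "remote", host: "workstation" },
  },
];

const remoteOnly = filterByOrigin(sampleSnippets, "remote", "workstation");
console.log(remoteOnly.map((s) => s.source_path));
```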

Confidence Decay System

Rules aren't immortal. Confidence decays without revalidation:

| Mechanism | Effect |
|-----------|--------|
| 90-day half-life | Confidence halves every 90 days without feedback |
| 4x harmful multiplier | One mistake counts 4× as much as one success |
| Maturity progression | candidate → established → proven |

Score Decay Visualization

Initial score: 10.0 (10 helpful marks today)

After 90 days (half-life):   5.0
After 180 days:              2.5
After 270 days:              1.25
After 365 days:              0.60

Effective Score Formula

effectiveScore = decayedHelpful - (4 × decayedHarmful)

// Where decay factor = 0.5 ^ (daysSinceFeedback / 90)
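
The formula can be sketched in TypeScript as follows. This is a minimal sketch, not the cm implementation: it assumes each feedback mark decays independently based on its own age, and `FeedbackEvent`, `decayFactor`, and `effectiveScore` are hypothetical names.

```typescript
// Assumption: each feedback mark decays independently from the day it was recorded.
interface FeedbackEvent {
  type: "helpful" | "harmful";
  ageDays: number; // days since this mark was recorded
}

const HALF_LIFE_DAYS = 90; // documented half-life
const HARMFUL_MULTIPLIER = 4; // documented harmful multiplier

// decay factor = 0.5 ^ (daysSinceFeedback / 90)
function decayFactor(ageDays: number, halfLifeDays: number = HALF_LIFE_DAYS): number {
  return Math.pow(0.5, ageDays / halfLifeDays);
}

// effectiveScore = decayedHelpful - (4 × decayedHarmful)
function effectiveScore(events: FeedbackEvent[]): number {
  let decayedHelpful = 0;
  let decayedHarmful = 0;
  for (const e of events) {
    const weight = decayFactor(e.ageDays);
    if (e.type === "helpful") decayedHelpful += weight;
    else decayedHarmful += weight;
  }
  return decayedHelpful - HARMFUL_MULTIPLIER * decayedHarmful;
}

// Ten helpful marks, all 90 days old: 10 × 0.5 = 5
const tenOldMarks: FeedbackEvent[] = Array.from({ length: 10 }, () => ({
  type: "helpful" as const,
  ageDays: 90,
}));
console.log(effectiveScore(tenOldMarks)); // 5
```

A single fresh harmful mark cancels four fresh helpful marks under this formula, which is why one regression can quickly pull a rule's score down.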

Maturity State Machine

  ┌──────────┐       ┌─────────────┐    ┌────────┐
  │ candidate│──────▶│ established │───▶│ proven │
  └──────────┘       └─────────────┘    └────────┘
       │                   │                  │
       │                   │ (harmful >25%)   │
       │                   ▼                  │
       │             ┌─────────────┐          │
       └────────────▶│ deprecated  │◀─────────┘
                     └─────────────┘

Transition Rules:

| Transition | Criteria |
|------------|----------|
| candidate → established | 3+ helpful, harmful ratio <25% |
| established → proven | 10+ helpful, harmful ratio <10% |
| any → deprecated | Harmful ratio >25% OR explicit deprecation |
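
The transition rules can be expressed as a small pure function. This is a sketch under stated assumptions: the harmful ratio is taken to be harmful / (helpful + harmful), maturity is derived from the counts alone rather than stepped from the current state, and all names are hypothetical. A ratio of exactly 25% is left as candidate because the rules above do not cover that boundary.

```typescript
type Maturity = "candidate" | "established" | "proven" | "deprecated";

interface FeedbackCounts {
  helpful: number;
  harmful: number;
  explicitlyDeprecated?: boolean;
}

// Assumption: ratio is harmful / total marks, and 0 when there is no feedback yet.
function harmfulRatio(counts: FeedbackCounts): number {
  const total = counts.helpful + counts.harmful;
  return total === 0 ? 0 : counts.harmful / total;
}

function maturityFor(counts: FeedbackCounts): Maturity {
  const ratio = harmfulRatio(counts);
  // any → deprecated: harmful ratio >25% OR explicit deprecation
  if (counts.explicitlyDeprecated || ratio > 0.25) return "deprecated";
  // established → proven: 10+ helpful, harmful ratio <10%
  if (counts.helpful >= 10 && ratio < 0.1) return "proven";
  // candidate → established: 3+ helpful, harmful ratio <25%
  if (counts.helpful >= 3 && ratio < 0.25) return "established";
  return "candidate";
}

console.log(maturityFor({ helpful: 3, harmful: 0 })); // "established"
console.log(maturityFor({ helpful: 3, harmful: 2 })); // "deprecated"
```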


Anti-Pattern Learning

Bad rules don't just get deleted. They become warnings:

"Cache auth tokens for performance"
    ↓ (3 harmful marks)
"PITFALL: Don't cache auth tokens without expiry validation"

When a rule is marked harmful multiple times (>50% harmful ratio with 3+ marks), it's automatically inverted into an anti-pattern.
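
The inversion threshold can be sketched as a predicate. This assumes "3+ marks" means total feedback marks (helpful plus harmful); the text could also mean harmful marks only, so treat `shouldInvertToAntiPattern` as a hypothetical illustration rather than cm's actual check.

```typescript
// Assumption: "marks" counts all feedback on the rule, helpful and harmful.
function shouldInvertToAntiPattern(helpful: number, harmful: number): boolean {
  const marks = helpful + harmful;
  if (marks < 3) return false; // not enough feedback yet
  return harmful / marks > 0.5; // strictly more than half harmful
}

console.log(shouldInvertToAntiPattern(0, 3)); // true
console.log(shouldInvertToAntiPattern(2, 2)); // false (exactly 50%)
```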


ACE Pipeline (How Rules Are Created)

Generator → Reflector → Validator → Curator

| Stage | Role | LLM? |
|-------|------|------|
| Generator | Pre-task context hydration (cm context) | No |
| Reflector | Extract patterns from sessions (cm reflect) | Yes |
| Validator | Evidence gate against cass history | Yes |
| Curator | Deterministic delta merge | No |

Critical: the Curator deliberately uses NO LLM. This prevents context collapse from iterative drift: LLMs propose patterns; deterministic logic manages them.

Scientific Validation

Before a rule joins your playbook, it's validated against cass history:

Proposed rule: "Always check token expiry before auth debugging"
    ↓
Evidence gate: Search cass for sessions where this applied
    ↓
Result: 5 sessions found, 4 successful outcomes → ACCEPT

Rules without historical evidence are flagged as candidates until proven.


Commands Reference

Context Retrieval (Primary Workflow)

# THE MAIN COMMAND - run before non-trivial tasks
cm context "implement user authentication" --json

# Limit results for token budget
cm context "fix bug" --json --limit 5 --no-history

# With workspace filter
cm context "refactor" --json --workspace /path/to/project

# Self-documenting explanation
cm quickstart --json

# System health
cm doctor --json
cm doctor --fix  # Auto-fix issues

# Find similar rules
cm similar "error handling best practices"

Playbook Management

cm playbook list                              # All rules
cm playbook get b-8f3a2c                      # Rule details
cm playbook add "Always run tests first"      # Add rule
cm playbook add --file rules.json             # Batch add from file
cm playbook add --file rules.json --session /path/session.jsonl  # Track source
cm playbook remove b-xyz --reason "Outdated"  # Remove
cm playbook export > backup.yaml              # Export
cm playbook import shared.yaml                # Import
cm playbook bootstrap react                   # Apply starter to existing playbook

cm top 10                                     # Top effective rules
cm stale --days 60                            # Rules without recent feedback
cm why b-8f3a2c                               # Rule provenance
cm stats --json                               # Playbook health metrics

Learning & Feedback

# Manual feedback
cm mark b-8f3a2c --helpful
cm mark b-xyz789 --harmful --reason "Caused regression"
cm undo b-xyz789                              # Revert feedback

# Session outcomes (positional: status, rules)
cm outcome success b-8f3a2c,b-def456
cm outcome failure b-x7k9p1 --summary "Auth approach failed"
cm outcome-apply                              # Apply to playbook

# Reflection (usually automated)
cm reflect --days 7 --json
cm reflect --session /path/to/session.jsonl   # Single session
cm reflect --workspace /path/to/project       # Project-specific

# Validation
cm validate "Always check null before dereferencing"

# Audit sessions against rules
cm audit --days 30

# Deprecate permanently
cm forget b-xyz789 --reason "Superseded by better pattern"

Onboarding (Agent-Native)

Zero-cost playbook building using your existing agent:

cm onboard status                             # Check progress
cm onboard gaps                               # Category gaps
cm onboard sample --fill-gaps                 # Prioritized sessions
cm onboard sample --agent claude --days 14    # Filter by agent/time
cm onboard sample --workspace /path/project   # Filter by workspace
cm onboard sample --include-processed         # Re-analyze sessions
cm onboard read /path/session.jsonl --template  # Rich context
cm onboard mark-done /path/session.jsonl      # Mark processed
cm onboard reset                              # Start fresh

Trauma Guard (Safety System)

cm trauma list                                # Active patterns
cm trauma add "DROP TABLE" --description "Mass deletion" --severity critical
cm trauma heal t-abc --reason "Intentional migration"
cm trauma remove t-abc
cm trauma scan --days 30                      # Scan for traumas
cm trauma import shared-traumas.yaml

cm guard --install                            # Claude Code hook
cm guard --git                                # Git pre-commit hook
cm guard --install --git                      # Both
cm guard --status                             # Check installation

System Commands

cm init                                       # Initialize
cm init --starter typescript                  # With template
cm init --force                               # Reinitialize (creates backup)
cm starters                                   # List templates
cm serve --port 3001                          # MCP server
cm usage                                      # LLM cost stats
cm privacy status                             # Privacy settings
cm privacy enable                             # Enable cross-agent enrichment
cm privacy disable                            # Disable enrichment
cm project --format agents.md                 # Export for AGENTS.md

Starter Playbooks

Starting with an empty playbook is daunting. Starters provide curated best practices:

cm starters                    # List available
cm init --starter typescript   # Initialize with starter
cm playbook bootstrap react    # Apply to existing playbook

Built-in Starters

| Starter | Focus | Rules |
|---------|-------|-------|
| general | Universal best practices | 5 |
| typescript | TypeScript/Node.js patterns | 4 |
| react | React/Next.js development | 4 |
| python | Python/FastAPI/Django | 4 |
| node | Node.js/Express services | 4 |
| rust | Rust service patterns | 4 |

Custom Starters

Create YAML files in ~/.cass-memory/starters/:

# ~/.cass-memory/starters/django.yaml
name: django
description: Django web framework best practices
bullets:
  - content: "Always use Django's ORM for database operations"
    category: database
    maturity: established
    tags: [django, orm]

Inline Feedback (During Work)

Leave feedback in code comments; it is parsed during reflection:

// [cass: helpful b-8f3a2c] - this rule saved me from a rabbit hole

// [cass: harmful b-x7k9p1] - this advice was wrong for our use case
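
A parser for these comments could look like the sketch below. The regular expression is inferred from the two examples above and is an assumption; the real parser may accept other forms, and `parseInlineFeedback` is a hypothetical name.

```typescript
interface InlineFeedback {
  type: "helpful" | "harmful";
  bulletId: string;
  reason?: string;
}

// Hypothetical pattern based on the documented examples:
//   [cass: helpful b-8f3a2c] - optional free-text reason
const INLINE_FEEDBACK = /\[cass:\s*(helpful|harmful)\s+(b-[A-Za-z0-9]+)\]\s*(?:-\s*(.*))?/;

function parseInlineFeedback(line: string): InlineFeedback | null {
  const match = INLINE_FEEDBACK.exec(line);
  if (!match) return null;
  const reason = match[3]?.trim();
  return {
    type: match[1] as InlineFeedback["type"],
    bulletId: match[2],
    ...(reason ? { reason } : {}),
  };
}

console.log(
  parseInlineFeedback("// [cass: helpful b-8f3a2c] - this rule saved me from a rabbit hole"),
);
```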

Agent Protocol

1. START:    cm context "<task>" --json
2. WORK:     Reference rule IDs when following them (e.g., "Following b-8f3a2c...")
3. FEEDBACK: Leave inline comments when rules help/hurt
4. END:      Just finish. Learning happens automatically.

You do NOT need to:

  • Run cm reflect (automation handles this)
  • Run cm mark manually (use inline comments)
  • Manually add rules to the playbook

Gap Analysis Categories

| Category | Keywords |
|----------|----------|
| debugging | error, fix, bug, trace, stack |
| testing | test, mock, assert, expect, jest |
| architecture | design, pattern, module, abstraction |
| workflow | task, CI/CD, deployment |
| documentation | comment, README, API doc |
| integration | API, HTTP, JSON, endpoint |
| collaboration | review, PR, team |
| git | branch, merge, commit |
| security | auth, token, encrypt, permission |
| performance | optimize, cache, profile |

Category Status Thresholds:

| Status | Rule Count | Priority |
|--------|------------|----------|
| critical | 0 rules | High |
| underrepresented | 1-2 rules | Medium |
| adequate | 3-10 rules | Low |
| well-covered | 11+ rules | None |
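
The thresholds map directly onto a lookup function. The sketch below is illustrative; `categoryStatus` and `findGaps` are hypothetical names, and ordering gaps by ascending rule count is an assumption about how priority might be applied.

```typescript
type CategoryStatus = "critical" | "underrepresented" | "adequate" | "well-covered";

// Thresholds taken from the table above.
function categoryStatus(ruleCount: number): CategoryStatus {
  if (ruleCount <= 0) return "critical";
  if (ruleCount <= 2) return "underrepresented";
  if (ruleCount <= 10) return "adequate";
  return "well-covered";
}

// Categories that still need rules (critical or underrepresented), emptiest first.
function findGaps(ruleCounts: Record<string, number>): string[] {
  return Object.entries(ruleCounts)
    .filter(([, count]) => count <= 2)
    .sort(([, a], [, b]) => a - b)
    .map(([category]) => category);
}

console.log(findGaps({ security: 0, testing: 5, git: 2 })); // ["security", "git"]
```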


Trauma Guard: Safety System

The "hot stove" principle—learn from past incidents and prevent recurrence.

How It Works

Session History              Trauma Registry              Runtime Guard
┌─────────────────┐         ┌─────────────────┐         ┌─────────────────┐
│ rm -rf /* (oops)│ ──────▶ │ Pattern: rm -rf │ ──────▶ │ BLOCKED: This   │
│ "sorry, I made  │  scan   │ Severity: FATAL │  hook   │ command matches │
│  a mistake..."  │         │ Session: abc123 │         │ a trauma pattern│
└─────────────────┘         └─────────────────┘         └─────────────────┘

Built-in Doom Patterns (20+)

| Category | Examples |
|----------|----------|
| Filesystem | rm -rf /, rm -rf ~, recursive deletes |
| Database | DROP DATABASE, TRUNCATE, DELETE FROM without WHERE |
| Git | git push --force to main/master, git reset --hard |
| Infrastructure | terraform destroy -auto-approve, kubectl delete namespace |
| Cloud | aws s3 rm --recursive, destructive CloudFormation |

Pattern Storage

| Scope | Location | Purpose |
|-------|----------|---------|
| Global | ~/.cass-memory/traumas.jsonl | Personal patterns |
| Project | .cass/traumas.jsonl | Commit to repo for team |

Pattern Lifecycle

  • Active: Blocks matching commands
  • Healed: Temporarily bypassed (with reason and timestamp)
  • Deleted: Removed (can be re-added)
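
The lifecycle can be illustrated with a minimal matching routine. Everything here is an assumption made for illustration: the `TraumaPattern` shape, treating patterns as case-insensitive regular expressions, and the `checkCommand` name. The on-disk format of traumas.jsonl is not documented in this file.

```typescript
type TraumaStatus = "active" | "healed";

// Hypothetical record shape; the real traumas.jsonl schema may differ.
interface TraumaPattern {
  id: string;
  pattern: string; // assumed to be a regular expression source
  severity: string;
  status: TraumaStatus;
}

// Returns the first active pattern matching the command, or null if it may run.
function checkCommand(command: string, registry: TraumaPattern[]): TraumaPattern | null {
  for (const entry of registry) {
    if (entry.status !== "active") continue; // healed patterns are bypassed
    if (new RegExp(entry.pattern, "i").test(command)) return entry;
  }
  return null;
}

const registry: TraumaPattern[] = [
  { id: "t-abc", pattern: "DROP\\s+TABLE", severity: "critical", status: "active" },
];

const hit = checkCommand("psql -c 'drop table users'", registry);
console.log(hit ? `BLOCKED by ${hit.id}` : "allowed");
```

Deleted patterns simply do not appear in the registry, so only the active and healed states need to be modeled.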

MCP Server

Run cm as an MCP server for agent integration:

# Local-only (recommended)
cm serve --port 3001

# With auth token (for non-loopback)
MCP_HTTP_TOKEN="<random>" cm serve --host 0.0.0.0 --port 3001

Tools Exposed

| Tool | Purpose | Parameters |
|------|---------|------------|
| cm_context | Get rules + history | task, limit?, history?, days?, workspace? |
| cm_feedback | Record feedback | bulletId, helpful?, harmful?, reason? |
| cm_outcome | Record session outcome | sessionId, outcome, rulesUsed? |
| memory_search | Search playbook/cass | query, scope?, limit?, days? |
| memory_reflect | Trigger reflection | days?, maxSessions?, dryRun? |
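
MCP clients invoke tools through JSON-RPC `tools/call` requests. The sketch below only builds the request body for cm_context; the parameter values are examples, `buildToolCall` is a hypothetical helper, and how the request is transported to the server (endpoint path, auth header) is not covered here.

```typescript
// Standard MCP tool invocation envelope (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wrap a tool name and its arguments in the envelope.
function buildToolCall(
  id: number,
  toolName: string,
  toolArgs: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: toolArgs },
  };
}

// Parameter names (task, limit) come from the cm_context row above.
const contextCall = buildToolCall(1, "cm_context", {
  task: "fix the auth timeout bug",
  limit: 5,
});
console.log(JSON.stringify(contextCall, null, 2));
```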

Resources Exposed

| URI | Purpose |
|-----|---------|
| cm://playbook | Current playbook state |
| cm://diary | Recent diary entries |
| cm://outcomes | Session outcomes |
| cm://stats | Playbook health metrics |

Client Configuration

Claude Code (~/.config/claude/mcp.json):

{
  "mcpServers": {
    "cm": {
      "command": "cm",
      "args": ["serve"]
    }
  }
}

Graceful Degradation

| Condition | Behavior |
|-----------|----------|
| No cass | Playbook-only scoring, no history snippets |
| No playbook | Empty playbook, commands still work |
| No LLM | Deterministic reflection, no semantic enhancement |
| Offline | Cached playbook + local diary |


Output Format

All commands support --json for machine-readable output.

Design principle: stdout = JSON only; diagnostics go to stderr.

Success Response

{
  "success": true,
  "task": "fix the auth timeout bug",
  "relevantBullets": [
    {
      "id": "b-8f3a2c",
      "content": "Always check token expiry before auth debugging",
      "effectiveScore": 8.5,
      "maturity": "proven",
      "relevanceScore": 0.92,
      "reasoning": "Extracted from 5 successful sessions"
    }
  ],
  "antiPatterns": [...],
  "historySnippets": [...],
  "suggestedCassQueries": [...],
  "degraded": null
}

Error Response

{
  "success": false,
  "code": "PLAYBOOK_NOT_FOUND",
  "error": "Playbook file not found",
  "hint": "Run 'cm init' to create a new playbook",
  "retryable": false,
  "recovery": ["cm init", "cm doctor --fix"],
  "docs": "README.md#-troubleshooting"
}
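
Because both responses carry a `success` flag, a caller can branch on it after parsing stdout. This is a sketch: the `CmError` fields mirror the error example above, `CmSuccess` is deliberately left minimal, and `summarize` is a hypothetical helper.

```typescript
// Fields mirror the documented error response; optionality is an assumption.
interface CmError {
  success: false;
  code: string;
  error: string;
  hint?: string;
  retryable?: boolean;
  recovery?: string[];
  docs?: string;
}

// Success payloads vary by command, so only the discriminant is modeled here.
interface CmSuccess {
  success: true;
}

type CmResponse = CmSuccess | CmError;

// Turn raw stdout into a one-line status, surfacing recovery commands on failure.
function summarize(stdout: string): string {
  const res = JSON.parse(stdout) as CmResponse;
  if (res.success) return "ok";
  const steps =
    res.recovery && res.recovery.length > 0 ? ` Try: ${res.recovery.join(", ")}` : "";
  return `${res.code}: ${res.error}.${steps}`;
}

const errorOutput = JSON.stringify({
  success: false,
  code: "PLAYBOOK_NOT_FOUND",
  error: "Playbook file not found",
  recovery: ["cm init", "cm doctor --fix"],
});
console.log(summarize(errorOutput));
// PLAYBOOK_NOT_FOUND: Playbook file not found. Try: cm init, cm doctor --fix
```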

Exit Codes

| Code | Meaning |
|------|---------|
| 1 | Internal error |
| 2 | User input/usage |
| 3 | Configuration |
| 4 | Filesystem |
| 5 | Network |
| 6 | cass error |
| 7 | LLM/provider error |


Token Budget Management

| Flag | Effect |
|------|--------|
| --limit N | Cap number of rules |
| --min-score N | Only rules above threshold |
| --no-history | Skip historical snippets (faster) |
| --json | Structured output |
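
When calling cm from a script, these flags can be assembled programmatically. The sketch below builds the argument list for `cm context` using only flags that appear in this document; `ContextOptions` and `buildContextArgs` are hypothetical names, and the resulting array is meant to be handed to whatever process-spawning API the caller uses.

```typescript
interface ContextOptions {
  limit?: number; // --limit N
  minScore?: number; // --min-score N
  noHistory?: boolean; // --no-history
  workspace?: string; // --workspace <path>
}

// Build the argv for `cm context`, always requesting JSON output.
function buildContextArgs(task: string, opts: ContextOptions = {}): string[] {
  const args = ["context", task, "--json"];
  if (opts.limit !== undefined) args.push("--limit", String(opts.limit));
  if (opts.minScore !== undefined) args.push("--min-score", String(opts.minScore));
  if (opts.noHistory) args.push("--no-history");
  if (opts.workspace !== undefined) args.push("--workspace", opts.workspace);
  return args;
}

console.log(buildContextArgs("fix bug", { limit: 5, noHistory: true }));
// ["context", "fix bug", "--json", "--limit", "5", "--no-history"]
```

Passing the task as its own array element avoids shell quoting problems when the task text contains spaces or quotes.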


Configuration

Config lives at ~/.cass-memory/config.json (global) and .cass/config.json (repo).

Precedence: CLI flags > Repo config > Global config > Defaults

Security: Repo config cannot override sensitive paths or user-level consent settings.
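
The precedence order amounts to a layered merge where later layers win. The sketch below is an assumption about how such a merge could work, not cm's implementation: `mergeLayers` is hypothetical, nested objects are merged key by key, and the security restriction on repo config is not modeled.

```typescript
type ConfigLayer = { [key: string]: unknown };

function isPlainObject(value: unknown): value is ConfigLayer {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Later layers win. Nested objects merge key by key; arrays and scalars are replaced.
function mergeLayers(...layers: ConfigLayer[]): ConfigLayer {
  const result: ConfigLayer = {};
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer)) {
      const existing = result[key];
      result[key] =
        isPlainObject(existing) && isPlainObject(value)
          ? mergeLayers(existing, value)
          : value;
    }
  }
  return result;
}

// Example values only; key names are taken from the Key Options listing.
const defaultLayer = {
  scoring: { decayHalfLifeDays: 90, harmfulMultiplier: 4 },
  maxBulletsInContext: 50,
};
const globalLayer = { scoring: { decayHalfLifeDays: 60 } };
const repoLayer = { maxBulletsInContext: 20 };
const cliLayer = { maxBulletsInContext: 5 };

// Lowest precedence first, highest last: Defaults < Global < Repo < CLI flags.
const effectiveConfig = mergeLayers(defaultLayer, globalLayer, repoLayer, cliLayer);
console.log(JSON.stringify(effectiveConfig));
```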

Key Options

{
  "provider": "anthropic",
  "model": "claude-sonnet-4-20250514",
  "budget": {
    "dailyLimit": 0.10,
    "monthlyLimit": 2.00
  },
  "scoring": {
    "decayHalfLifeDays": 90,
    "harmfulMultiplier": 4
  },
  "maxBulletsInContext": 50,
  "maxHistoryInContext": 10,
  "sessionLookbackDays": 7,
  "crossAgent": {
    "enabled": false,
    "consentGiven": false,
    "auditLog": true
  },
  "remoteCass": {
    "enabled": false,
    "hosts": [{"host": "workstation", "label": "work"}]
  },
  "semanticSearchEnabled": false,
  "embeddingModel": "Xenova/all-MiniLM-L6-v2",
  "dedupSimilarityThreshold": 0.85
}

Environment Variables

| Variable | Purpose |
|----------|---------|
| ANTHROPIC_API_KEY | API key for Anthropic (Claude) |
| OPENAI_API_KEY | API key for OpenAI |
| GOOGLE_GENERATIVE_AI_API_KEY | API key for Google Gemini |
| CASS_PATH | Path to cass binary |
| CASS_MEMORY_LLM | Set to none for LLM-free mode |
| MCP_HTTP_TOKEN | Auth token for non-loopback MCP server |


Data Locations

~/.cass-memory/                  # Global (user-level)
├── config.json                  # Configuration
├── playbook.yaml                # Personal playbook
├── diary/                       # Session summaries
├── outcomes/                    # Session outcomes
├── traumas.jsonl                # Trauma patterns
├── starters/                    # Custom starter playbooks
├── onboarding-state.json        # Onboarding progress
├── privacy-audit.jsonl          # Cross-agent audit trail
├── processed-sessions.jsonl     # Reflection progress
└── usage.jsonl                  # LLM cost tracking

.cass/                           # Project-level (in repo)
├── config.json                  # Project-specific overrides
├── playbook.yaml                # Project-specific rules
├── traumas.jsonl                # Project-specific patterns
└── blocked.yaml                 # Anti-patterns to block

Automating Reflection

Cron Job

# Daily at 2am
0 2 * * * /usr/local/bin/cm reflect --days 7 >> ~/.cass-memory/reflect.log 2>&1

Claude Code Hook

.claude/hooks.json:

{
  "post-session": ["cm reflect --days 1"]
}

Privacy & Security

Local-First Design

  • All data stays on your machine
  • No cloud sync, no telemetry
  • Cross-agent enrichment is opt-in with explicit consent
  • Audit log for enrichment events

Secret Sanitization

Before processing, content is sanitized:

  • OpenAI/Anthropic/AWS/Google API keys
  • GitHub tokens
  • JWTs
  • Passwords and secrets in config patterns
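
Sanitization of this kind is typically pattern-based. The sketch below is illustrative only: the regular expressions match widely known public token shapes and are not cm's actual rules, and `redactSecrets` is a hypothetical name.

```typescript
// Illustrative patterns for well-known token shapes; NOT cm's real rule set.
const SECRET_PATTERNS: Array<[string, RegExp]> = [
  ["api-key", /\bsk-[A-Za-z0-9_-]{20,}/g], // OpenAI/Anthropic-style "sk-" keys
  ["aws-access-key-id", /\bAKIA[0-9A-Z]{16}\b/g],
  ["github-token", /\bgh[pousr]_[A-Za-z0-9]{36,}\b/g],
  ["jwt", /\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g],
];

// Replace every match with a labeled placeholder before further processing.
function redactSecrets(text: string): string {
  let sanitized = text;
  for (const [label, pattern] of SECRET_PATTERNS) {
    sanitized = sanitized.replace(pattern, `[REDACTED:${label}]`);
  }
  return sanitized;
}

console.log(redactSecrets("export AWS_KEY=AKIAABCDEFGHIJKLMNOP"));
// export AWS_KEY=[REDACTED:aws-access-key-id]
```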

Privacy Controls

cm privacy status    # Check settings
cm privacy enable    # Enable cross-agent enrichment
cm privacy disable   # Disable enrichment

Performance Characteristics

| Operation | Typical Latency |
|-----------|-----------------|
| cm context (cached) | 50-150ms |
| cm context (cold) | 200-500ms |
| cm context (no cass) | 30-80ms |
| cm reflect (1 session) | 5-15s |
| cm reflect (5 sessions) | 20-60s |
| cm playbook list | <50ms |
| cm similar (keyword) | 20-50ms |
| cm similar (semantic) | 100-300ms |

LLM Cost Estimates

| Operation | Typical Cost |
|-----------|--------------|
| Reflect (1 session) | $0.01-0.05 |
| Reflect (7 days) | $0.05-0.20 |
| Validate (1 rule) | $0.005-0.01 |

With the default budget ($0.10/day, $2.00/month), the daily limit covers roughly 2-10 single-session reflections at the per-session cost above.


Batch Rule Addition

After analyzing a session, add multiple rules at once:

# Create JSON file
cat > rules.json << 'EOF'
[
  {"content": "Always run tests before committing", "category": "testing"},
  {"content": "Check token expiry before auth debugging", "category": "debugging"},
  {"content": "AVOID: Mocking entire modules in tests", "category": "testing"}
]
EOF

# Add all rules
cm playbook add --file rules.json

# Track which session they came from
cm playbook add --file rules.json --session /path/to/session.jsonl

# Or pipe from stdin
echo '[{"content": "Rule", "category": "workflow"}]' | cm playbook add --file -

Template Output for Onboarding

--template provides rich context for rule extraction:

cm onboard read /path/to/session.jsonl --template --json

Returns:

  • metadata: path, workspace, message count, topic hints
  • context: related rules, playbook gaps, suggested focus
  • extractionFormat: schema, categories, examples
  • sessionContent: actual session data

Integration with CASS

CASS provides episodic memory (raw sessions). CM extracts procedural memory (rules and playbooks).

# CASS: Search raw sessions
cass search "authentication timeout" --robot

# CM: Get distilled rules for a task
cm context "authentication timeout" --json

Troubleshooting

| Error | Solution |
|-------|----------|
| cass not found | Install from cass repo |
| cass search failed | Run cass index --full |
| API key missing | Set ANTHROPIC_API_KEY, OPENAI_API_KEY, or GOOGLE_GENERATIVE_AI_API_KEY |
| Playbook corrupt | Run cm doctor --fix |
| Budget exceeded | Check cm usage, adjust limits |

Diagnostic Commands

cm doctor --json           # System health
cm doctor --fix            # Auto-fix issues
cm usage                   # LLM budget status
cm stats --json            # Playbook health
cm why <bullet-id>         # Rule provenance

LLM-Free Mode

CASS_MEMORY_LLM=none cm context "task" --json

Installation

# One-liner (recommended)
curl -fsSL https://raw.githubusercontent.com/Dicklesworthstone/cass_memory_system/main/install.sh \
  | bash -s -- --easy-mode --verify

# Specific version
install.sh --version v0.2.2 --verify

# System-wide
install.sh --system --verify

# From source
git clone https://github.com/Dicklesworthstone/cass_memory_system.git
cd cass_memory_system
bun install && bun run build
sudo mv ./dist/cass-memory /usr/local/bin/cm

Integration with Flywheel

| Tool | Integration |
|------|-------------|
| CASS | CM reads from cass episodic memory, writes procedural memory |
| NTM | Robot mode integrates with cm for context before agent work |
| Agent Mail | Rules can reference mail threads as provenance |
| BV | Task context enriched with relevant playbook rules |