Auto-Recall Skill | Agent Skills

Auto-Recall

Overview

Semantic retrieval from the perpetual memory vector store at query time. Parses query intent, searches the perpetual_memory LanceDB table, ranks results by relevance and recency, and returns structured context for injection into agent prompts.

Core principle: Every agent should be able to recall any past interaction instantly by meaning, not by filename or keyword. Auto-recall is the read side of the perpetual memory architecture.

When to Use

At the start of any task to recall related past decisions and learnings
When debugging to find previously encountered similar issues
When an agent needs context about how a similar problem was solved before
When building prompts that benefit from historical context
When checking if a pattern or approach was tried previously

Do NOT use for:

Searching code (use pnpm search:code instead)
Searching markdown memory files (use memory-search.cjs for that)
Real-time interaction monitoring (use perpetual-memory skill for writes)

Relationship to Existing Tools

| Tool | Searches | Use For | | ------------------- | -------------------------- | ------------------------------------ | | pnpm search:code | Code files (BM25+semantic) | Finding code patterns | | memory-search.cjs | Markdown memory files | Searching learnings/decisions/issues | | auto-recall | perpetual_memory table | Recalling past interactions |

Auto-recall is complementary -- it searches a different index (the perpetual memory vector store) that contains auto-captured interaction summaries rather than manually-written markdown entries.

Workflow

Step 1: Parse Query Intent

Before searching, classify the query intent:

| Intent Type | Query Pattern | Search Strategy | | ----------- | ----------------------------------------- | ------------------------- | | Decision | "Why did we choose X", "What was decided" | Filter: category=decision | | Issue | "Have we seen this error before" | Filter: category=issue | | Pattern | "How do we usually handle X" | Filter: category=pattern | | Learning | "What did we learn about X" | Filter: category=learning | | General | Any other query | No category filter |

Step 2: Search Vector Store

# Basic semantic search
node .claude/tools/cli/auto-embed.cjs --query "how does the routing guard handle Write operations" --limit 10

# Or via the skill script
node .claude/skills/auto-recall/scripts/main.cjs --query "JWT refresh token pattern" --limit 5

Step 3: Rank by Relevance + Recency

Results are ranked by cosine similarity from LanceDB. For time-sensitive queries, apply a recency boost:

final_score = similarity * 0.7 + recency_score * 0.3

where recency_score = max(0, 1 - (days_since_creation / 30))

This ensures recent interactions are slightly preferred when similarity is close.

Step 4: Inject Context

Format retrieved memories for agent prompt injection:

## Recalled Context (from perpetual memory)

1. [decision] (sim=0.87, 2d ago, agent=architect)
   Chose JWT RS256 over HS256 for key rotation support. ADR-045.

2. [learning] (sim=0.82, 5d ago, agent=developer)
   Token refresh requires httpOnly cookies to prevent XSS.

3. [issue] (sim=0.74, 1d ago, agent=qa)
   JWT expiry not propagated to frontend. Workaround in auth.middleware.ts:47.

CLI Reference

# Semantic query
node .claude/skills/auto-recall/scripts/main.cjs --query "routing guard behavior"

# Query with category filter
node .claude/skills/auto-recall/scripts/main.cjs --query "auth decision" --category decision

# Query with limit
node .claude/skills/auto-recall/scripts/main.cjs --query "memory system" --limit 5

# Query with recency boost
node .claude/skills/auto-recall/scripts/main.cjs --query "recent changes" --recency-boost

# Output as JSON
node .claude/skills/auto-recall/scripts/main.cjs --query "search" --json

Agent Integration Pattern

Agents should invoke auto-recall at the start of significant tasks:

// At task start, recall relevant context
Skill({ skill: 'auto-recall' });

// Then query for task-relevant history
// node .claude/skills/auto-recall/scripts/main.cjs --query "<task description>" --limit 5

This provides agents with historical context about similar past work, preventing repeated mistakes and leveraging prior decisions.

Iron Laws

NEVER use auto-recall as a replacement for memory-search.cjs -- they search different indexes and are complementary.
ALWAYS limit results to avoid context bloat -- default to 5-10 results, never more than 20.
NEVER inject recalled context without relevance filtering -- results below 0.5 similarity are noise.
ALWAYS include metadata (category, agent, timestamp) in recalled context for traceability.
NEVER block on auto-recall failure -- if the perpetual memory table is unavailable, proceed without it.

Anti-Patterns

| Anti-Pattern | Why It Fails | Correct Approach | | ---------------------------------------- | ------------------------------------------------ | ---------------------------------------------- | | Using auto-recall for code search | Wrong index; code is in BM25/semantic code index | Use pnpm search:code for code discovery | | Injecting all results into context | Low-similarity results pollute the prompt | Filter to similarity > 0.5 before injecting | | Blocking task on recall failure | Perpetual memory is a bonus, not a dependency | Gracefully degrade: proceed without recall | | Recalling without specifying limit | Unbounded results consume too many tokens | Always set --limit (default: 10) | | Trusting recall over fresh code analysis | Past context may be outdated | Use recall as starting context, verify current |

Memory Protocol (MANDATORY)

Before starting: Read .claude/context/memory/learnings.md

After completing:

New pattern -> .claude/context/memory/learnings.md
Issue found -> .claude/context/memory/issues.md
Decision made -> .claude/context/memory/decisions.md

ASSUME INTERRUPTION: If it's not in memory, it didn't happen.

Agent Skills: Auto-Recall

Install this agent skill to your local

Skill Files