RLM CLI Skill | Agent Skills

RLM CLI

Recursive Language Models (RLM) CLI - enables LLMs to handle near-infinite context by recursively decomposing inputs and calling themselves over parts. Supports files, directories, URLs, and stdin.

Installation

pip install rlm-cli    # or: pipx install rlm-cli
uvx rlm-cli ask ...    # run without installing

Set an API key for your backend (openrouter is default):

export OPENROUTER_API_KEY=...  # default backend
export OPENAI_API_KEY=...      # for --backend openai
export ANTHROPIC_API_KEY=...   # for --backend anthropic

Commands

ask - Query with context

rlm ask <inputs> -q "question"

Inputs (combinable): | Type | Example | Notes | |------|---------|-------| | Directory | rlm ask . -q "..." | Recursive, respects .gitignore | | File | rlm ask main.py -q "..." | Single file | | URL | rlm ask https://x.com -q "..." | Auto-converts to markdown | | stdin | git diff \| rlm ask - -q "..." | - reads from pipe | | Literal | rlm ask "text" -q "..." --literal | Treat as raw text | | Multiple | rlm ask a.py b.py -q "..." | Combine any types |

Options: | Flag | Description | |------|-------------| | -q "..." | Question/prompt (required) | | --backend | Provider: openrouter (default), openai, anthropic | | --model NAME | Model override (format: provider/model or just model) | | --json | Machine-readable output | | --output-format | Output format: text, json, or json-tree | | --summary | Show execution summary with depth statistics | | --extensions .py .ts | Filter by extension | | --include/--exclude | Glob patterns | | --max-iterations N | Limit REPL iterations (default: 30) | | --max-depth N | Recursive RLM depth (default: 1 = no recursion) | | --max-budget N.NN | Spending limit in USD (requires OpenRouter) | | --max-timeout N | Time limit in seconds | | --max-tokens N | Total token limit (input + output) | | --max-errors N | Consecutive error limit before stopping | | --no-index | Skip auto-indexing | | --exa | Enable Exa web search (requires EXA_API_KEY) | | --inject-file FILE | Execute Python code between iterations |

JSON output structure:

{"ok": true, "exit_code": 0, "result": {"response": "..."}, "stats": {...}}

JSON-tree output (--output-format=json-tree): Adds execution tree showing nested RLM calls:

{
  "result": {
    "response": "...",
    "tree": {
      "depth": 0,
      "model": "openai/gpt-4",
      "duration": 2.3,
      "cost": 0.05,
      "iterations": [...],
      "children": [...]
    }
  }
}

Summary output (--summary): Shows depth-wise statistics after completion:

JSON mode: adds summary field to stats
Text mode: prints summary to stderr

=== RLM Execution Summary ===
Total depth: 2 | Nodes: 3 | Cost: $0.0054 | Duration: 17.38s
Depth 0: 1 call(s) ($0.0047, 13.94s)
Depth 1: 2 call(s) ($0.0007, 3.44s)

complete - Query without context

rlm complete "prompt text"
rlm complete "Generate SQL" --json --backend openai

search - Search indexed files

rlm search "query" [options]

| Flag | Description | |------|-------------| | --limit N | Max results (default: 20) | | --language python | Filter by language | | --paths-only | Output file paths only | | --json | JSON output |

Auto-indexes on first use. Manual index: rlm index .

index - Build search index

rlm index .              # Index current dir
rlm index ./src --force  # Force full reindex

doctor - Check setup

rlm doctor       # Check config, API keys, deps
rlm doctor --json

Workflows

Git diff review:

git diff | rlm ask - -q "Review for bugs"
git diff --cached | rlm ask - -q "Ready to commit?"
git diff HEAD~3 | rlm ask - -q "Summarize changes"

Codebase analysis:

rlm ask . -q "Explain architecture"
rlm ask src/ -q "How does auth work?" --extensions .py

Search + analyze:

rlm search "database" --paths-only
rlm ask src/db.py -q "How is connection pooling done?"

Compare files:

rlm ask old.py new.py -q "What changed?"

Configuration

Precedence: CLI flags > env vars > config file > defaults

Config locations: ./rlm.yaml, ./.rlm.yaml, ~/.config/rlm/config.yaml

backend: openrouter
model: google/gemini-3-flash-preview
max_iterations: 30

Environment variables:

RLM_BACKEND - Default backend
RLM_MODEL - Default model
RLM_CONFIG - Config file path
RLM_JSON=1 - Always output JSON

Recursion and Budget Limits

Recursive RLM (`--max-depth`)

Enable recursive llm_query() calls where child RLMs process sub-tasks:

# 2 levels of recursion
rlm ask . -q "Research thoroughly" --max-depth 2

# With budget cap
rlm ask . -q "Analyze codebase" --max-depth 3 --max-budget 0.50

Budget Control (`--max-budget`)

Limit spending per completion. Raises BudgetExceededError when exceeded:

# Cap at $1.00
rlm complete "Complex task" --max-budget 1.00

# Very low budget (will likely exceed)
rlm ask . -q "Analyze everything" --max-budget 0.001

Requirements: OpenRouter backend (returns cost data in responses).

Other Limits

Timeout (--max-timeout) - Stop after N seconds:

rlm complete "Complex task" --max-timeout 30

Token limit (--max-tokens) - Stop after N total tokens:

rlm ask . -q "Analyze" --max-tokens 10000

Error threshold (--max-errors) - Stop after N consecutive code errors:

rlm complete "Write code" --max-errors 3

Stop Conditions

RLM execution stops when any of these occur:

Final answer - LLM calls FINAL_VAR("variable_name") with the NAME of a variable (as a string)
Max iterations - Exceeds --max-iterations (exit code 0, graceful - forces final answer)

FINAL_VAR usage (common mistake - pass variable NAME, not value):

# CORRECT:
result = {"answer": "hello", "score": 42}
FINAL_VAR("result")  # pass the variable NAME as a string

# WRONG:
FINAL_VAR(result)  # passing the dict directly causes AttributeError

Max budget exceeded - Spending > --max-budget (exit code 20, error)
Max timeout exceeded - Time > --max-timeout (exit code 20, error with partial answer)
Max tokens exceeded - Tokens > --max-tokens (exit code 20, error with partial answer)
Max errors exceeded - Consecutive errors > --max-errors (exit code 20, error with partial answer)
User cancellation - Ctrl+C or SIGUSR1 (exit code 0, returns partial answer as success)
Max depth reached - Child RLM at depth 0 cannot recurse further

Note on max iterations: This is a soft limit. When exceeded, RLM prompts the LLM one more time to provide a final answer. Modern LLMs typically complete in 1-2 iterations.

Partial answers: When timeout, tokens, or errors stop execution, the error includes partial_answer if any response was generated before stopping.

Early exit (Ctrl+C): Pressing Ctrl+C (or sending SIGUSR1) returns the partial answer as success (exit code 0) with early_exit: true in the result.

Inject File (--inject-file)

Update REPL variables mid-run by modifying an inject file:

# Create inject file
echo 'focus = "authentication"' > inject.py

# Run with inject file
rlm ask . -q "Analyze based on 'focus'" --inject-file inject.py

# In another terminal, update mid-run
echo 'focus = "authorization"' > inject.py

The file is checked before each iteration and executed if modified.

Exit Codes

| Code | Meaning | |------|---------| | 0 | Success | | 2 | CLI usage error | | 10 | Input error (file not found) | | 11 | Config error (missing API key) | | 20 | Backend/API error (includes budget exceeded) | | 30 | Runtime error | | 40 | Index/search error |

LLM Search Tools

When rlm ask runs on a directory, the LLM gets search tools:

| Tool | Cost | Privacy | Use For | |------|------|---------|---------| | rg.search() | Free | Local | Exact patterns, function names, imports | | tv.search() | Free | Local | Topics, concepts, related files | | exa.search() | $ | API | Web search (requires --exa flag) | | pi.* | $$$ | API | Hierarchical PDF/document navigation |

Free Local Tools (auto-loaded)

rg.search(pattern, paths, globs) - ripgrep for exact patterns
tv.search(query, limit) - Tantivy BM25 for concepts

Exa Web Search (--exa flag, Costs Money)

⚠️ Opt-in: Requires --exa flag and EXA_API_KEY environment variable.

Setup:

export EXA_API_KEY=...  # Get from https://exa.ai

Usage in REPL:

from rlm_cli.tools_search import exa, web

# Basic search
results = exa.search(query="Python async patterns", limit=5)
for r in results:
    print(f"{r['title']}: {r['url']}")

# With highlights (relevant excerpts)
results = exa.search(
    query="error handling best practices",
    limit=3,
    include_highlights=True
)

# Semantic alias
results = web(query="machine learning tutorial", limit=5)

# Find similar pages
results = exa.find_similar(url="https://example.com/article", limit=5)

exa.search() parameters: | Param | Default | Description | |-------|---------|-------------| | query | required | Search query | | limit | 10 | Max results | | search_type | "auto" | "auto", "neural", or "keyword" | | include_domains | None | Only these domains | | exclude_domains | None | Exclude these domains | | include_text | False | Include full page text | | include_highlights | True | Include relevant excerpts | | category | None | "company", "research paper", "news", etc. |

When to use exa.search() / web():

Finding external documentation, tutorials, articles
Researching topics beyond the local codebase
Finding similar pages to a reference URL

PageIndex (pi.* - Opt-in, Costs Money)

⚠️ WARNING: PageIndex sends document content to LLM APIs and costs money.

Only use when:

User explicitly requests document/PDF analysis
Document has hierarchical structure (reports, manuals)
User accepts cost/privacy tradeoffs

Prerequisites:

OPENROUTER_API_KEY (or other backend key) must be set in environment
PageIndex submodule must be initialized
Run within rlm-cli's virtual environment (has required dependencies)

Setup (REQUIRED before any pi. operation):*

import sys
sys.path.insert(0, "/path/to/rlm-cli/rlm")        # rlm submodule
sys.path.insert(0, "/path/to/rlm-cli/pageindex")  # pageindex submodule

from rlm.clients import get_client
from rlm_cli.tools_pageindex import pi

# Configure with existing rlm backend
client = get_client(backend="openrouter", backend_kwargs={"model_name": "google/gemini-2.0-flash-001"})
pi.configure(client)

Indexing (costs $$$):

# Build tree index - THIS COSTS MONEY (no caching, re-indexes each call)
tree = pi.index(path="report.pdf")
# Returns: PITree object with doc_name, nodes, doc_description, raw

Viewing structure (free after indexing):

# Display table of contents
print(pi.toc(tree))

# Get section by node_id (IDs are "0000", "0001", "0002", etc.)
section = pi.get_section(tree, "0003")
# Returns: PINode with title, node_id, start_index, end_index, summary, children
# Returns: None if not found

if section:
    print(f"{section.title}: pages {section.start_index}-{section.end_index}")

Finding node IDs: Node IDs are assigned sequentially ("0000", "0001", ...) in tree traversal order. To see all node IDs, access the raw tree structure:

import json
print(json.dumps(tree.raw["structure"], indent=2))
# Each node has: title, node_id, start_index, end_index

pi. API Reference:* | Method | Cost | Returns | Description | |--------|------|---------|-------------| | pi.configure(client) | Free | None | Set rlm backend (REQUIRED first) | | pi.status() | Free | dict | Check availability, config, warning | | pi.index(path=str) | $$$ | PITree | Build tree from PDF | | pi.toc(tree, max_depth=3) | Free | str | Formatted table of contents | | pi.get_section(tree, node_id) | Free | PINode or None | Get section by ID | | pi.available() | Free | bool | Check if PageIndex installed | | pi.configured() | Free | bool | Check if client configured |

PITree attributes: doc_name, nodes (list of PINode), doc_description, raw (dict) PINode attributes: title, node_id, start_index, end_index, summary (may be None), children (may be None)

Notes:

summary is only populated if add_summaries=True in pi.index()
children is None for leaf nodes (sections with no subsections)
tree.raw["structure"] is a flat list; hierarchy is in PINode.children
PageIndex extracts document structure (TOC), not content. Use page numbers to locate sections in the original PDF.

Example output from pi.toc():

📄 annual_report.pdf

• Executive Summary (p.1-5)
• Financial Overview (p.6-20)
  • Revenue (p.6-10)
  • Expenses (p.11-15)
  • Projections (p.16-20)
• Risk Factors (p.21-35)

Agent Skills: RLM CLI

Install this agent skill to your local

Skill Files

RLM CLI

Installation

Commands

ask - Query with context

complete - Query without context

search - Search indexed files

index - Build search index

doctor - Check setup

Workflows

Configuration

Recursion and Budget Limits

Recursive RLM (`--max-depth`)

Budget Control (`--max-budget`)

Other Limits

Stop Conditions

Inject File (--inject-file)

Exit Codes

LLM Search Tools

Free Local Tools (auto-loaded)

Exa Web Search (--exa flag, Costs Money)

PageIndex (pi.* - Opt-in, Costs Money)

Agent Skills: RLM CLI

Install this agent skill to your local

Skill Files

RLM CLI

Installation

Commands

ask - Query with context

complete - Query without context

search - Search indexed files

index - Build search index

doctor - Check setup

Workflows

Configuration

Recursion and Budget Limits

Recursive RLM (--max-depth)

Budget Control (--max-budget)

Other Limits

Stop Conditions

Inject File (--inject-file)

Exit Codes

LLM Search Tools

Free Local Tools (auto-loaded)

Exa Web Search (--exa flag, Costs Money)

PageIndex (pi.* - Opt-in, Costs Money)

Recursive RLM (`--max-depth`)

Budget Control (`--max-budget`)