MCP Server Research Skill

Guide for discovering, profiling, and evaluating MCP servers using the local SQLite+FTS5 registry cache and three specialized agents.

When to Use This Skill

Finding MCP servers for a specific domain (e.g., "code analysis", "database management")
Profiling an MCP server to understand its tools, install method, and quality
Comparing multiple servers to recommend the best fit
Seeding or enriching the local registry cache
Running the /find-mcp-servers slash command

Architecture

┌───────────────────────┐
│   /find-mcp-servers   │  ← Slash command (entry point)
└───────────┬───────────┘
            │
            ▼
┌───────────────────────┐     ┌──────────────────────┐
│ plugin-mcp-researcher │────▶│ SQLite+FTS5 Cache    │
│ (orchestrator)        │     │ .data/mcp/knowledge- │
└───────────┬───────────┘     │ graph.db             │
            │                 └──────────────────────┘
     ┌──────┴──────┐
     ▼             ▼
┌──────────┐ ┌──────────┐
│ Scanner  │ │ Profiler │
│ (haiku)  │ │ (sonnet) │
└──────────┘ └──────────┘

Components

| Component | Type | Model | Purpose |
|-----------|------|-------|---------|
| plugin-mcp-researcher | agent | haiku | Cache-first orchestrator: queries FTS, dispatches scanner/profiler |
| mcp-registry-scanner | agent | haiku | Lightweight discovery: finds NEW servers across remote registries |
| mcp-server-profiler | agent | sonnet | Deep enrichment: fetches README, extracts tools, updates cache |
| /find-mcp-servers | command | - | User-facing slash command for server discovery |

Storage Layer

MCP server data lives in the unified knowledge graph:

.data/mcp/knowledge-graph.db   ← SQLite + sqlite-vec (gitignored)
.data/mcp/knowledge-graph.sql  ← SQL dump (version controlled)

Tables:

| Table | Purpose |
|-------|---------|
| entities | Core records with entity_type = 'mcp_server' |
| mcp_servers_ext | MCP-specific fields (install, repo, transport, etc.) |
| mcp_server_tools | Tools exposed by each server |
| mcp_server_deps | Dependencies required by each server |
| mcp_server_assessments | Quality/relevance assessments per server |
| v_mcp_servers | Unified view joining entities + mcp_servers_ext |

Management commands:

just mcp-stats          # Show server/registry counts
just mcp-search "query" # Search servers by name/description
just mcp-list           # List top servers by stars
just mcp-show <slug>    # Show server details
just mcp-tools <slug>   # Show server's tools
just kg-dump            # Dump entire knowledge graph

Workflow: Discovering Servers

Step 1: Query Local Cache

Always check the cache first. Use FTS5 or LIKE queries on the knowledge graph:

sqlite3 -json .data/mcp/knowledge-graph.db "
  SELECT e.id, e.name, e.slug, e.content as description,
         ext.install_method, ext.install_command, ext.repository, ext.stars,
         json_extract(e.metadata, '$.features') as features
  FROM entities e
  JOIN entities_fts f ON e.id = f.rowid
  LEFT JOIN mcp_servers_ext ext ON e.id = ext.entity_id
  WHERE e.entity_type = 'mcp_server'
    AND entities_fts MATCH '<keyword1> OR <keyword2>'
  ORDER BY rank
  LIMIT 20;
"

Or use the convenience view:

sqlite3 -json .data/mcp/knowledge-graph.db "
  SELECT * FROM v_mcp_servers
  WHERE name LIKE '%<keyword>%' OR content LIKE '%<keyword>%'
  ORDER BY stars DESC NULLS LAST
  LIMIT 20;
"
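
When the keywords in the FTS5 query come from free-form user input, characters such as quotes, hyphens, or colons can be read as FTS5 query syntax. The sketch below shows one way to build the MATCH expression safely. The helper name build_fts_match is hypothetical and not part of the skill; it only relies on the FTS5 rule that a double-quoted string is matched literally, with embedded quotes doubled.

```python
def build_fts_match(keywords):
    """Join keywords into an FTS5 MATCH expression such as '"code" OR "lint"'.

    Hypothetical helper: each keyword is wrapped in double quotes so FTS5
    treats it as a literal string (a multi-word keyword becomes a phrase).
    """
    terms = []
    for keyword in keywords:
        keyword = keyword.strip()
        if not keyword:
            continue  # skip blank keywords instead of emitting an empty term
        # FTS5 escapes an embedded double quote by doubling it.
        terms.append('"' + keyword.replace('"', '""') + '"')
    return " OR ".join(terms)


if __name__ == "__main__":
    print(build_fts_match(["code analysis", "linter"]))
```

The resulting string can then be bound as a parameter (for example `... WHERE entities_fts MATCH ?` with Python's sqlite3 module) rather than pasted into the SQL text.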

Step 2: Evaluate Coverage

Count enriched matches (those with description AND features populated):

>= 3 enriched: Sufficient — skip to ranking
< 3 enriched: Insufficient — proceed to remote discovery
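
The coverage rule above can be sketched as a small check over the rows returned by Step 1. This is a minimal illustration, assuming each row is a dict as parsed from `sqlite3 -json` output with `description` and `features` keys (the column aliases used in the Step 1 query); the function names are hypothetical.

```python
ENRICHED_THRESHOLD = 3  # ">= 3 enriched" is sufficient, per the rule above


def is_enriched(row):
    """A match counts as enriched when description AND features are populated."""
    description = (row.get("description") or "").strip()
    features = (row.get("features") or "").strip()
    return bool(description) and bool(features)


def count_enriched(rows):
    return sum(1 for row in rows if is_enriched(row))


def needs_remote_discovery(rows, threshold=ENRICHED_THRESHOLD):
    """True when the cache has fewer than `threshold` enriched matches."""
    return count_enriched(rows) < threshold
```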

Step 3: Remote Discovery (if needed)

Spawn mcp-registry-scanner (haiku) via Task tool:

Domain: <keywords>
Plugin: standalone-search

The scanner searches 24+ registries in tiered priority order, deduplicates against the cache, and inserts minimal records for new finds.

Step 4: Deep Profiling (if needed)

For each new discovery (or shallow cache hit missing description/features), spawn mcp-server-profiler (sonnet) via Task tool:

Server: <slug>
Plugin: standalone-search
Need: <original purpose string>

Run up to 5 profilers in parallel. Each enriches the cache with:

Full description and feature tags
Install method and command
Repository URL and stars
Language and transport protocol
Tools exposed (inserted into mcp_server_tools)
Dependencies (inserted into mcp_server_deps)
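
The profilers themselves are agents spawned through the Task tool, not Python functions. Purely to illustrate the "up to 5 in parallel" cap, here is a sketch of bounded fan-out using a thread pool. The `profile_one` callable stands in for whatever actually launches a profiler and is an assumption of this example.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL_PROFILERS = 5  # cap stated in Step 4


def profile_all(slugs, profile_one, max_workers=MAX_PARALLEL_PROFILERS):
    """Run profile_one(slug) for every slug, at most max_workers at a time.

    Returns a dict mapping each slug to its result, or to the exception it
    raised, so one failed profile does not discard the others.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {slug: pool.submit(profile_one, slug) for slug in slugs}
        for slug, future in futures.items():
            try:
                results[slug] = future.result()
            except Exception as exc:  # keep going; record the failure
                results[slug] = exc
    return results
```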

Step 5: Rank and Present

Score matches using weighted criteria:

| Criterion | Weight | Description |
|-----------|--------|-------------|
| Feature relevance | 40% | How well the features match the stated purpose |
| Maintenance | 25% | Stars, last_updated recency, active development |
| Install ease | 20% | brew/npx > pip > docker > manual |
| Tool coverage | 15% | Number and relevance of MCP tools exposed |
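
As a worked illustration of the weighting, the sketch below combines four sub-scores into one ranking score. The weights come from the table above. Everything else is an assumption of this example: sub-scores are taken to be normalized to the range 0 to 1, and the numeric install-ease values are arbitrary placeholders that only preserve the brew/npx > pip > docker > manual ordering.

```python
WEIGHTS = {
    "feature_relevance": 0.40,
    "maintenance": 0.25,
    "install_ease": 0.20,
    "tool_coverage": 0.15,
}

# Assumed values; only the ordering is taken from the criteria table.
INSTALL_EASE = {"brew": 1.0, "npx": 1.0, "pip": 0.75, "docker": 0.5, "manual": 0.25}


def weighted_score(subscores):
    """Weighted sum of sub-scores, each expected in the range 0..1."""
    return sum(WEIGHTS[name] * subscores[name] for name in WEIGHTS)


def rank(candidates):
    """Sort (slug, subscores) pairs from best to worst score."""
    return sorted(candidates, key=lambda item: weighted_score(item[1]), reverse=True)


if __name__ == "__main__":
    candidates = [
        ("server-a", {"feature_relevance": 0.9, "maintenance": 0.5,
                      "install_ease": INSTALL_EASE["npx"], "tool_coverage": 0.6}),
        ("server-b", {"feature_relevance": 0.6, "maintenance": 0.9,
                      "install_ease": INSTALL_EASE["docker"], "tool_coverage": 0.8}),
    ]
    for slug, subscores in rank(candidates):
        print(slug, round(weighted_score(subscores), 3))
```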

Workflow: Profiling a Single Server

When you need to deeply research one specific server:

1. Check if it exists in cache: sqlite3 .data/mcp/knowledge-graph.db "SELECT * FROM v_mcp_servers WHERE slug='<slug>';"
2. If not cached, insert a minimal record first (see "Inserting a new server" under Common Patterns)
3. Spawn mcp-server-profiler with the slug
4. The profiler will:
   - Fetch the repository README (via gh api or WebSearch)
   - Extract metadata: description, features, install method, language, transport
   - Identify tools from README documentation or package manifests
   - Check quality signals: stars, forks, last commit date, open issues
   - UPDATE the cache record and INSERT tool/dep records

Workflow: Seeding from YAML Config

When bulk-loading servers from settings/mcp/*.yaml:

# Read category entries from YAML
# For each entry, INSERT OR IGNORE into entities (entity_type = 'mcp_server')
# and add the matching mcp_servers_ext row with:
#   - slug (normalized from name)
#   - source_registry (from YAML source field)
#   - source_url (from YAML url field)
# Then dump the knowledge graph
just kg-dump
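
The seeding steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's loader: the ASSUMED_SCHEMA below is a guess at the minimum columns involved (in particular, INSERT OR IGNORE only deduplicates if slug carries a uniqueness constraint), the slug normalization rule is one plausible choice, and `entries` is assumed to be a list of dicts with `name`, `source`, and `url` keys, such as a YAML parser would return.

```python
import re
import sqlite3

# Assumed minimal schema for illustration; the real knowledge graph has more columns.
ASSUMED_SCHEMA = """
CREATE TABLE IF NOT EXISTS entities (
    id INTEGER PRIMARY KEY,
    entity_type TEXT NOT NULL,
    slug TEXT NOT NULL,
    name TEXT NOT NULL,
    UNIQUE (entity_type, slug)
);
CREATE TABLE IF NOT EXISTS mcp_servers_ext (
    entity_id INTEGER PRIMARY KEY REFERENCES entities(id),
    source_registry TEXT,
    source_url TEXT,
    discovered_at TEXT
);
"""


def slugify(name):
    """Lowercase the name and collapse non-alphanumeric runs into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def seed(conn, entries):
    """Insert entries that are not cached yet; return how many were new."""
    inserted = 0
    for entry in entries:
        cur = conn.execute(
            "INSERT OR IGNORE INTO entities (entity_type, slug, name) "
            "VALUES ('mcp_server', ?, ?)",
            (slugify(entry["name"]), entry["name"]),
        )
        if cur.rowcount == 0:
            continue  # slug already present, nothing was inserted
        conn.execute(
            "INSERT INTO mcp_servers_ext "
            "(entity_id, source_registry, source_url, discovered_at) "
            "VALUES (?, ?, ?, datetime('now'))",
            (cur.lastrowid, entry.get("source"), entry.get("url")),
        )
        inserted += 1
    conn.commit()
    return inserted
```

Running `just kg-dump` afterwards is still needed so the version-controlled SQL dump reflects the new rows.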

Registry Reference

See reference/registries.yaml for the full list of 24+ MCP server registries organized by tier.

Tier 1 (always search)

smithery.ai — Curated registry with install commands
registry.modelcontextprotocol.io — Official MCP registry
glama.ai — Detailed server profiles
pulsemcp.com — Community registry
mcp.so — Search-focused directory
GitHub topic search (gh search repos --topic mcp-server)

Tier 2 (search on cache miss)

mcpservers.org, mcpdb.org, mcp-get.com, opentools.com, cursor.directory, lobehub.com

Tier 3 (search if Tier 2 insufficient)

himcp.ai, mcpmarket.com, portkey.ai, cline.bot, apitracker.io, and others

Web Scraping for Profiling

The profiler agent needs to fetch web content (READMEs, registry pages) and convert it to markdown. Use this 9-tier fallback chain, in priority order:

1. gh api (preferred for GitHub repos)

gh api repos/<owner>/<repo>/readme --jq '.content' | base64 -d

2. crawl4ai-mcp

If the crawl4ai MCP server is connected, use it for JS-rendered pages.

3. trafilatura

trafilatura -u <url>

Clean text extraction CLI. Works well for static pages and documentation sites.

4. WebSearch

Use site:<domain> <server-name> queries to find registry pages. Results include summaries with key metadata.

5. WebFetch

Fetches URL content and converts HTML to markdown. Works for static pages. May be auto-denied in background subagents.

6. Jina Reader

curl -sL "https://r.jina.ai/<url>"

Free tier API for converting web pages to markdown.

7. firecrawl

firecrawl_scrape with formats: ["markdown"]. Handles JS-rendered pages. Use when credits are available.

8. markdownify

curl -sL <url> | python3 -c "import sys; from markdownify import markdownify; print(markdownify(sys.stdin.read()))"

9. html2text

curl -sL <url> | html2text

Last resort — basic HTML-to-text conversion.
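
The control flow of the fallback chain can be sketched as "try each method in order and stop at the first one that returns usable text". The function below is a generic illustration; the fetcher callables are placeholders supplied by the caller and do not correspond to real wrappers for gh, trafilatura, or the other tools.

```python
def fetch_with_fallback(url, fetchers):
    """Try (name, fetch) pairs in priority order.

    Returns (name, text) from the first fetcher that yields non-empty text.
    Raises RuntimeError listing every failure if none succeed.
    """
    errors = []
    for name, fetch in fetchers:
        try:
            text = fetch(url)
        except Exception as exc:  # denied tool, exhausted credits, network error...
            errors.append(f"{name}: {exc}")
            continue
        if text and text.strip():
            return name, text
        errors.append(f"{name}: empty result")
    raise RuntimeError("all fetchers failed: " + "; ".join(errors))
```

Recording which tier succeeded (the returned name) makes it easier to spot when a preferred method, such as WebFetch in a background subagent, is being denied consistently.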

Common Patterns

Inserting a new server

-- First insert into entities
INSERT INTO entities (entity_type, slug, name, content, metadata)
VALUES ('mcp_server', '<slug>', '<name>', '<description>',
        json_object('features', '<comma,separated,tags>'));

-- Then insert into mcp_servers_ext
INSERT INTO mcp_servers_ext (entity_id, source_registry, source_url, discovered_at)
SELECT id, '<registry>', '<url>', datetime('now')
FROM entities WHERE slug = '<slug>' AND entity_type = 'mcp_server';

Updating after profiling

-- Update entity content
UPDATE entities SET
  content = '<description>',
  metadata = json_set(metadata, '$.features', '<comma,separated,tags>'),
  updated_at = datetime('now')
WHERE slug = '<slug>' AND entity_type = 'mcp_server';

-- Update extension fields
UPDATE mcp_servers_ext SET
  install_method = '<brew|npx|pip|docker|manual>',
  install_command = '<command>',
  repository = '<url>',
  language = '<lang>',
  stars = <N>,
  last_updated = '<ISO date>',
  refreshed_at = datetime('now')
WHERE entity_id = (SELECT id FROM entities WHERE slug = '<slug>' AND entity_type = 'mcp_server');

Inserting tools

INSERT INTO mcp_server_tools (server_id, name, description)
SELECT id, '<tool_name>', '<tool_description>'
FROM entities WHERE slug = '<slug>' AND entity_type = 'mcp_server';
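
The SQL templates above splice values directly into the statement, which breaks as soon as a description contains an apostrophe. When the inserts are issued from a script, bound parameters avoid that. A minimal sketch with Python's sqlite3 module, assuming the entities and mcp_server_tools columns shown above; the helper name insert_tools is hypothetical.

```python
import sqlite3


def insert_tools(conn, slug, tools):
    """Insert (name, description) pairs for the cached server with this slug.

    Values are passed as bound parameters, so quotes in tool descriptions
    need no manual escaping. Returns the number of tools inserted.
    """
    row = conn.execute(
        "SELECT id FROM entities WHERE slug = ? AND entity_type = 'mcp_server'",
        (slug,),
    ).fetchone()
    if row is None:
        raise LookupError(f"no cached server with slug {slug!r}")
    rows = [(row[0], name, description) for name, description in tools]
    conn.executemany(
        "INSERT INTO mcp_server_tools (server_id, name, description) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)
```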

Troubleshooting

| Problem | Cause | Fix |
|---------|-------|-----|
| FTS returns no results | Keywords too specific or DB empty | Use broader terms, check just mcp-stats |
| Profiler can't fetch README | WebFetch/firecrawl denied in subagent | Fall back to gh api or WebSearch |
| Firecrawl credits exhausted | API quota hit | Use gh api, WebSearch, or CLI fallbacks |
| Duplicate slugs on insert | Server already exists | Use INSERT OR IGNORE or check before inserting |
| DB locked errors | Concurrent writes from parallel agents | Run profilers sequentially or use WAL mode |
| Changes not persisted | Forgot to dump after changes | Run just kg-dump |
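
For the "DB locked" row, this is a minimal sketch of what "use WAL mode" can look like when a script opens the database with Python's sqlite3 module. The function name and the 5-second timeout are assumptions of this example. WAL lets readers proceed while one writer is active; writers still take turns, which is why a busy timeout is set as well.

```python
import sqlite3


def open_for_parallel_writers(path, busy_timeout_s=5.0):
    """Open the database with WAL journaling and a busy timeout.

    Returns (connection, journal_mode); journal_mode is the mode SQLite
    reports after the PRAGMA, which is "wal" for an ordinary file database.
    """
    # `timeout` makes a blocked writer wait instead of failing immediately.
    conn = sqlite3.connect(path, timeout=busy_timeout_s)
    # The journal mode is stored in the database file, so it persists
    # across connections once set.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode
```

WAL creates companion `-wal` and `-shm` files next to the database file while connections are open.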

Checklist

[ ] Knowledge graph initialized (just kg-init)
[ ] FTS/LIKE query built from purpose keywords
[ ] Cache checked before any remote calls
[ ] Scanner spawned only on cache miss
[ ] Profilers run in parallel (max 5)
[ ] Knowledge graph dumped after modifications (just kg-dump)
[ ] Results ranked by weighted criteria
[ ] Tools fetched for top results

References

MCP Specification
Awesome MCP Servers
Registry list: reference/registries.yaml
Agent definitions: content/agents/mcp-registry-scanner.md, content/agents/mcp-server-profiler.md, content/agents/plugin-mcp-researcher.md
Command: content/commands/find-mcp-servers.md
