Agent Skills: Scholar - Research Paper Sidecar Companion

Category: Uncategorized
ID: infinite-loop-factory/app-factory/oma-scholar

Install this agent skill locally:

pnpm dlx add-skill https://github.com/infinite-loop-factory/app-factory/tree/HEAD/.agents/skills/oma-scholar

.agents/skills/oma-scholar/SKILL.md

Skill Metadata

Name
oma-scholar

Scholar - Research Paper Sidecar Companion

Scheduling

Goal

Search, fetch, generate, validate, analyze, review, and compare scholarly paper sidecars using the Knows .knows.yaml spec for token-efficient research workflows.

Intent signature

  • User asks for academic literature search, sidecar generation, sidecar validation, paper claims/evidence summary, structural paper comparison, or peer review as sidecar.
  • User references Knows, .knows.yaml, knows.academy, OpenAlex, claims, evidence, relations, or paper sidecars.

When to use

  • Reading research papers token-efficiently via Knows sidecars (~700 tokens for claims-only vs ~10K for full PDF)
  • Generating .knows.yaml sidecars from your own paper drafts, LaTeX, or research notes
  • Validating sidecar structure (rule-based) before sharing
  • Producing peer reviews as sidecars
  • Querying or summarizing existing sidecars
  • Structurally comparing two papers (claims, methods, evidence)
  • Searching/fetching sidecars from knows.academy (~50K papers indexed)
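The token figures quoted in the first bullet imply a reduction of roughly 93%. A minimal sketch of that arithmetic, using only the approximate numbers from this list (actual counts will vary by paper and tokenizer; the function name is illustrative):

```python
# Approximate figures quoted in the list above; real counts vary by paper and tokenizer.
CLAIMS_ONLY_TOKENS = 700
FULL_PDF_TOKENS = 10_000


def reduction(partial: int, full: int) -> float:
    """Fraction of tokens saved by reading `partial` tokens instead of `full`."""
    return 1 - partial / full


saving = reduction(CLAIMS_ONLY_TOKENS, FULL_PDF_TOKENS)
print(f"{saving:.0%}")  # prints 93%
```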

When NOT to use

  • General web search or non-academic content -> use oma-search
  • Translating papers -> use oma-translator
  • PDF parsing only (no sidecar) -> use oma-pdf
  • Submitting sidecars back to knows.academy -> out of scope (host LLM only consumes/produces locally)
  • Full peer-review workflow with editor system -> out of scope

Expected inputs

  • Paper, abstract, draft, LaTeX, research notes, sidecar file, DOI, OpenAlex ID, Knows record ID, or search query
  • Desired mode: generate, validate, review, analyze, compare, or remote fetch
  • Optional strictness, section filter, or CI behavior

Expected outputs

  • .knows.yaml sidecar, review sidecar, lint report, search/fetch result, natural-language analysis, or structural comparison
  • Sidecars conforming to v0.9.0 / paper@1 profile
  • Validation status and warnings before sharing generated sidecars

Dependencies

  • oma scholar CLI subcommands
  • knows.academy public API and OpenAlex fallback
  • resources/sidecar-spec.md, API endpoints, OpenAlex setup, upstream cache, checklist, and execution protocol

Control-flow features

  • Branches by mode, source availability, Knows/OpenAlex coverage, strict vs lenient validation, and fetched section
  • Reads/writes YAML sidecars and may call public APIs
  • Avoids fabrication when source evidence is missing

Structural Flow

Entry

  1. Identify mode and source artifact/query.
  2. Resolve paper identity through Knows or OpenAlex when needed.
  3. Load sidecar spec and mode-specific protocol.

Scenes

  1. PREPARE: Select mode and gather source or remote identifiers.
  2. ACQUIRE: Fetch paper metadata, sidecar sections, or local source text.
  3. REASON: Extract claims, evidence, relations, provenance, or comparison structure.
  4. ACT: Generate, lint, review, analyze, compare, or fetch sidecar data.
  5. VERIFY: Validate schema, enums, IDs, relations, and provenance.
  6. FINALIZE: Return sidecar, report, summary, or comparison with caveats.

Transitions

  • If knows.academy lacks the paper, fall back to OpenAlex metadata/abstract.
  • If generating a sidecar, run lint before sharing.
  • If consuming third-party sidecars with dangling references, use lenient mode when appropriate.
  • If source evidence is absent, omit fields instead of guessing.

Failure and recovery

  • If remote API times out, retry or use OpenAlex fallback.
  • If YAML fails parsing, fix indentation and scalar types.
  • If relation density or orphan statements warn, add supported-by relations when source evidence supports them.
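The relation-density and orphan-statement warnings can be reasoned about with a small calculation. The sketch below is not the lint implementation: it assumes a parsed sidecar held as a plain dict, assumes that the statement is the subject of a supported_by relation, and assumes that "orphan" means a statement with no supported_by relation. The id stmt:accuracy-gain is hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical parsed sidecar. Only the field names subject_ref, object_ref,
# and predicate are taken from this document; the layout is an assumption.
sidecar: Dict[str, Any] = {
    "statements": [
        {"id": "stmt:privacy-budget-tradeoff"},
        {"id": "stmt:accuracy-gain"},  # hypothetical id, has no relation below
    ],
    "relations": [
        {
            "subject_ref": "stmt:privacy-budget-tradeoff",
            "predicate": "supported_by",
            "object_ref": "ev:cifar10-accuracy-table",
        },
    ],
}


def relation_density(doc: Dict[str, Any]) -> float:
    """Average number of relations per statement (0.0 when there are no statements)."""
    statements = doc.get("statements", [])
    if not statements:
        return 0.0
    return len(doc.get("relations", [])) / len(statements)


def orphan_statements(doc: Dict[str, Any]) -> List[str]:
    """Ids of statements that are not the subject of any supported_by relation."""
    supported = {
        rel["subject_ref"]
        for rel in doc.get("relations", [])
        if rel.get("predicate") == "supported_by"
    }
    return [s["id"] for s in doc.get("statements", []) if s["id"] not in supported]


density = relation_density(sidecar)
orphans = orphan_statements(sidecar)
print(density, orphans)
```

Here the density is 0.5, well under the 1.5 target, and one statement is an orphan. Relations should still only be added when the source text supports them.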

Exit

  • Success: requested sidecar operation completes with validation status.
  • Partial success: missing metadata, fallback source, or validation warnings are explicit.

Logical Operations

Actions

| Action | SSL primitive | Evidence |
|--------|---------------|----------|
| Select mode | SELECT | Generate/Validate/Review/Analyze/Compare/Remote |
| Read paper or sidecar | READ | Source files or YAML |
| Request remote data | REQUEST | Knows/OpenAlex APIs |
| Infer claims/evidence/relations | INFER | Sidecar generation/analysis |
| Write sidecar | WRITE | .knows.yaml outputs |
| Validate sidecar | VALIDATE | oma scholar lint |
| Report result | NOTIFY | Summary or lint report |

Tools and instruments

  • oma scholar search|resolve|get|lint
  • Knows public API, OpenAlex fallback, sidecar spec, checklist

Canonical command path

oma scholar search "<query>"
oma scholar resolve "<title-or-doi>"
oma scholar get "<record-id-or-doi>"
oma scholar lint "<paper.knows.yaml>"

Resource scope

| Scope | Resource target |
|-------|-----------------|
| LOCAL_FS | Paper drafts, sidecar YAML, review sidecars |
| NETWORK | knows.academy and OpenAlex APIs |
| PROCESS | oma scholar CLI and lint |
| USER_DATA | User-provided paper content and research notes |

Preconditions

  • Mode and source are identifiable.
  • Spec rules are available for generation or validation.

Effects and side effects

  • May create local sidecar or review sidecar files.
  • May query public scholarly APIs.
  • Does not submit sidecars back to knows.academy.

Guardrails

  1. Target spec is v0.9.0 / paper@1 profile: verified against production sidecars from knows.academy; see resources/sidecar-spec.md
  2. Host LLM generates sidecars: never shell out to anthropic SDK or external LLM CLI; this skill runs inside an agent
  3. Anti-fabrication: if DOI/venue/year is not visible in source, omit the key entirely; never write doi: TODO or guess
  4. Top-level metadata: title, authors, venue, year live at the top level (no metadata wrapper)
  5. Field names are exact: statement_type, evidence_type, predicate, artifact_type (not type/claim)
  6. Provenance has SINGLE actor: provenance.actor is one object, NOT a provenance.actors array
  7. Confidence is an object: {claim_strength: ..., extraction_fidelity: ...}, both from high|medium|low
  8. Coverage is an object: coverage.statements (4-value enum) + coverage.evidence (3-value enum)
  9. Closed enums: actor tool|person|org (never ai/llm/model); artifact role subject|supporting|cited; predicates in present tense
  10. Numbers unquoted: value: 22, never value: '22'
  11. Relation density: average ≥1.5 relations per statement, and every claim needs supported_by evidence (lint warns when the average falls below 1.5 and flags each orphan statement by id)
  12. ID format: descriptive kebab-case with prefix: stmt:privacy-budget-tradeoff, ev:cifar10-accuracy-table, art:paper
  13. Validate before sharing: run oma scholar lint after Generate
  14. Remote API has no auth: https://knows.academy/api/proxy/* is public; do not invent auth headers
  15. Partial fetch param is section (singular): fixed enum statements|evidence|relations|artifacts|citation
  16. OpenAlex key is optional: metadata enrichment only; gracefully degrade when missing
  17. Sidecar content stays English: schema fields, IDs, statement text follow upstream convention; user-facing responses follow oma-config.yaml language
  18. Spec drift awareness: our local rules track v0.9.0 production behavior, which differs from the upstream knows.md natural-language description; refresh resources/upstream-spec-cache.md periodically
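Several of the guardrails above are mechanical enough to check in a few lines. The sketch below is an illustration, not the oma scholar lint implementation: the dict layout (where confidence and value live), the ID regular expression, the message wording, and the sample title and author are all assumptions. The authoritative rules are in resources/sidecar-spec.md.

```python
import re
from typing import Any, Dict, List

ACTOR_TYPES = {"tool", "person", "org"}   # closed enum from guardrail 9
LEVELS = {"high", "medium", "low"}        # confidence levels from guardrail 7
# Assumed reading of "descriptive kebab-case with prefix"; the real rule may differ.
ID_PATTERN = re.compile(r"^(stmt|ev|art):[a-z0-9]+(-[a-z0-9]+)*$")
NUMERIC_STRING = re.compile(r"^-?\d+(\.\d+)?$")


def check_guardrails(doc: Dict[str, Any]) -> List[str]:
    """Return human-readable problems for a subset of the guardrails."""
    problems: List[str] = []
    if "metadata" in doc:
        problems.append("metadata: title/authors/venue/year belong at the top level")
    prov = doc.get("provenance", {})
    if "actors" in prov:
        problems.append("provenance.actors: use a single provenance.actor object")
    actor = prov.get("actor")
    if isinstance(actor, dict) and actor.get("type") not in ACTOR_TYPES:
        problems.append(f"provenance.actor.type: {actor.get('type')!r} is not allowed")
    for section in ("statements", "evidence", "artifacts"):
        for item in doc.get(section, []):
            item_id = item.get("id", "")
            if not ID_PATTERN.match(item_id):
                problems.append(f"{section}: id {item_id!r} is not prefixed kebab-case")
            value = item.get("value")
            if isinstance(value, str) and NUMERIC_STRING.match(value):
                problems.append(f"{section}: numeric value {value!r} is quoted")
            conf = item.get("confidence")
            if conf is not None:
                keys = ("claim_strength", "extraction_fidelity")
                if not (isinstance(conf, dict) and all(conf.get(k) in LEVELS for k in keys)):
                    problems.append(f"{section}: confidence must be an object with both levels")
    return problems


# Hypothetical sidecar that follows the guardrails. doi and venue are omitted
# entirely because they are "not visible in the source" (guardrail 3).
good = {
    "title": "Example Paper",
    "authors": ["A. Author"],
    "year": 2026,
    "provenance": {"actor": {"type": "tool"}},
    "artifacts": [{"id": "art:paper", "role": "subject"}],
    "statements": [{
        "id": "stmt:privacy-budget-tradeoff",
        "confidence": {"claim_strength": "medium", "extraction_fidelity": "high"},
    }],
    "evidence": [{"id": "ev:cifar10-accuracy-table", "value": 22}],
}

# Hypothetical sidecar that breaks guardrails 4, 9, 10, and 12.
bad = {
    "metadata": {"title": "Example Paper"},
    "provenance": {"actor": {"type": "ai"}},
    "evidence": [{"id": "ev:CIFAR10", "value": "22"}],
}

print(check_guardrails(good))
for problem in check_guardrails(bad):
    print(problem)
```

A check like this is only a quick pre-flight; running oma scholar lint remains the validation step before sharing.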

Modes

| Mode | Trigger | Output |
|------|---------|--------|
| Generate | "create sidecar from this paper / abstract / draft", "generate .knows.yaml" | {paper}.knows.yaml (host LLM emits, then oma scholar lint validates) |
| Validate | "lint this sidecar", "validate .knows.yaml" | Pass/fail report with file:line issues |
| Review | "peer review this paper as sidecar" | {paper}.review.knows.yaml |
| Analyze | "summarize this sidecar", "what claims does it make?" | Natural-language answer |
| Compare | "compare paper A and paper B structurally" | Diff table (claims/methods/evidence) |
| Remote | "find papers on X", "fetch sidecar :id", "get claims only for :id" | Search results / sidecar payload |

Provider Fallback (knows.academy → OpenAlex)

knows.academy currently indexes only 2026 papers (~50K, mostly arXiv). For papers from other years (Transformer 2017, BERT 2018, other classics) and for journal articles, the skill automatically falls back to OpenAlex for metadata and abstract.

Use the oma scholar CLI subcommands:

# Hybrid search: knows first, OpenAlex fallback
oma scholar search "vision language action"

# Cross-source resolve: figures out which source has the right paper
oma scholar resolve "Attention Is All You Need"

# Get by id (knows record_id, OpenAlex W-id, or DOI)
oma scholar get "10.48550/arXiv.1706.03762"

When OpenAlex returns the answer (because knows.academy lacks the paper), use the returned abstract as input to Generate mode to produce a local sidecar.
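The fallback order can be sketched as a small function. This is not the CLI's code: the two fetchers are hypothetical callables passed in by the caller (stubbed here so the example runs offline), and treating OSError as "unreachable" is an assumption. The fallback: "openalex" marker is the one this document says the CLI adds to fallback results.

```python
from typing import Any, Callable, Dict, Optional

Record = Dict[str, Any]
Fetcher = Callable[[str], Optional[Record]]  # hypothetical fetcher signature


def get_with_fallback(identifier: str, fetch_knows: Fetcher,
                      fetch_openalex: Fetcher) -> Optional[Record]:
    """Try knows.academy first; on a miss or network failure, try OpenAlex."""
    try:
        record = fetch_knows(identifier)
    except OSError:  # covers TimeoutError and ConnectionError
        record = None
    if record is not None:
        return record
    work = fetch_openalex(identifier)
    if work is None:
        return None
    # Mark the result so callers know it is metadata + abstract, not a sidecar.
    return {**work, "fallback": "openalex"}


# Offline stubs standing in for real network calls.
def knows_unreachable(identifier: str) -> Optional[Record]:
    raise TimeoutError("knows.academy unreachable")


def openalex_stub(identifier: str) -> Optional[Record]:
    return {"title": "Attention Is All You Need", "year": 2017}


result = get_with_fallback("10.48550/arXiv.1706.03762", knows_unreachable, openalex_stub)
print(result)
```

A record carrying the fallback marker holds no claims or evidence yet, so the next step would be Generate followed by oma scholar lint.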

How to Execute

Follow resources/execution-protocol.md step by step for the selected mode.

Quick Reference

Search (knows + auto OpenAlex fallback)

oma scholar search "diffusion super resolution"
oma scholar search --year-min 2024 "vision language action"

Find one specific paper

oma scholar resolve "Attention Is All You Need"
# returns top hit from each source + recommendation

Fetch a sidecar or work

# knows.academy full sidecar
oma scholar get "knows:generated/reconvla/1.0.0"

# Partial fetch (claims only, ~700 tokens, 93% reduction vs PDF)
oma scholar get --section statements "knows:generated/reconvla/1.0.0"

# By DOI or OpenAlex W-id (works regardless of knows.academy availability)
oma scholar get "10.48550/arXiv.1706.03762"

When knows.academy is unreachable, oma scholar get knows:... automatically falls back to OpenAlex by extracting the slug from the record_id. The result is marked with fallback: "openalex" and contains metadata and an abstract, which is enough to run Generate mode locally.

Validate

# Strict mode for own Generate output (default)
oma scholar lint paper.knows.yaml

# Lenient mode for third-party / fetched sidecars
oma scholar lint --lenient remote.knows.yaml

# Treat warnings as failures (CI mode)
oma scholar lint --fail-on-warning paper.knows.yaml

About 47% of knows.academy-served sidecars contain at least one dangling cross-reference (typo in subject_ref/object_ref, measured across 15 production samples). Use --lenient when consuming third-party records so these surface as warnings rather than blocking errors.
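To make the strict/lenient distinction concrete, the sketch below detects dangling subject_ref/object_ref values and downgrades them to warnings in lenient mode. It is an illustration only: the dict layout, the message wording, the exit-status convention, and the idea that --lenient and --fail-on-warning combine this way are assumptions, not documented CLI behavior. The misspelled object_ref is deliberate.

```python
from typing import Any, Dict, List, Tuple

Finding = Tuple[str, str]  # (severity, message)


def lint_references(doc: Dict[str, Any], lenient: bool = False) -> List[Finding]:
    """Report relations whose subject_ref/object_ref match no defined id."""
    defined = {
        item["id"]
        for section in ("statements", "evidence", "artifacts")
        for item in doc.get(section, [])
    }
    severity = "WARN" if lenient else "ERROR"
    findings: List[Finding] = []
    for index, rel in enumerate(doc.get("relations", [])):
        for field in ("subject_ref", "object_ref"):
            ref = rel.get(field)
            if ref not in defined:
                findings.append((
                    severity,
                    f"relations[{index}].{field}: reference {ref!r} "
                    "does not match any defined id",
                ))
    return findings


def exit_status(findings: List[Finding], fail_on_warning: bool = False) -> int:
    """Assumed convention: 1 on any error, or on any warning in CI mode."""
    severities = {severity for severity, _ in findings}
    if "ERROR" in severities:
        return 1
    if fail_on_warning and "WARN" in severities:
        return 1
    return 0


# Hypothetical fetched sidecar with a typo in object_ref ("acuracy").
fetched = {
    "statements": [{"id": "stmt:privacy-budget-tradeoff"}],
    "evidence": [{"id": "ev:cifar10-accuracy-table"}],
    "relations": [{
        "subject_ref": "stmt:privacy-budget-tradeoff",
        "predicate": "supported_by",
        "object_ref": "ev:cifar10-acuracy-table",
    }],
}

strict = lint_references(fetched)
lenient = lint_references(fetched, lenient=True)
print(strict[0], exit_status(strict))
print(lenient[0], exit_status(lenient))
```

In this sketch the same typo blocks a strict run but only warns in a lenient one, which matches the intent described above for third-party records.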

Raw API (when CLI is unavailable)

curl -s "https://knows.academy/api/proxy/search?q=..."
curl -s "https://knows.academy/api/proxy/sidecars/<encoded-id>"
curl -s "https://knows.academy/api/proxy/partial?record_id=<id>&section=statements"
curl -s "https://knows.academy/api/proxy/jobs/stats"   # platform health
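The `<encoded-id>` and query placeholders in the curl commands need percent-encoding, because record ids contain slashes. A minimal sketch of building those URLs with the standard library; the helper names are illustrative, and whether the API expects the knows: prefix on the id is not stated here, so the example passes the id through unchanged. No auth headers are involved, per the guardrails.

```python
from urllib.parse import quote, urlencode

BASE = "https://knows.academy/api/proxy"
# Fixed enum for the singular `section` parameter (guardrail 15).
SECTIONS = {"statements", "evidence", "relations", "artifacts", "citation"}


def search_url(query: str) -> str:
    """URL for a keyword search."""
    return f"{BASE}/search?{urlencode({'q': query})}"


def sidecar_url(record_id: str) -> str:
    """URL for a full sidecar; every reserved character in the id is encoded."""
    return f"{BASE}/sidecars/{quote(record_id, safe='')}"


def partial_url(record_id: str, section: str) -> str:
    """URL for a single-section fetch; rejects sections outside the enum."""
    if section not in SECTIONS:
        raise ValueError(f"section must be one of {sorted(SECTIONS)}")
    return f"{BASE}/partial?{urlencode({'record_id': record_id, 'section': section})}"


record_id = "knows:generated/reconvla/1.0.0"
print(search_url("vision language action"))
print(sidecar_url(record_id))
print(partial_url(record_id, "statements"))
```

Rejecting unknown sections locally avoids a round trip for a request such as section=claims, which is not in the enum.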

Configuration

Project-specific settings: config/scholar-config.yaml

Troubleshooting

| Issue | Solution |
|-------|----------|
| [ERROR] *.value: numeric value '22' is quoted | Remove quotes: value: '22' -> value: 22 |
| [ERROR] provenance.actor.type: 'ai' is not allowed | Change to tool, person, or org |
| [ERROR] *.type: use `statement_type` instead of `type` | Rename type -> statement_type (or evidence_type/predicate/artifact_type) |
| [ERROR] provenance.actors: v0.9 spec uses singular `actor` | Replace actors: [{...}] array with actor: {...} object |
| [ERROR] *.object_ref: reference 'X' does not match any defined id | Fix the subject_ref/object_ref to point to a real id, OR use --lenient if consuming third-party data |
| [WARN] relations: avg relations/statement is N.NN (target ≥ 1.5) | Add more supported_by/depends_on relations |
| [WARN] statements: only N statements; most papers warrant ≥ 8 | Expected when generating from abstract only; full-paper Generate should hit 15+ |
| [WARN] *.predicate: past-tense '...' is suspicious | Switch to present tense (evaluated_on -> evaluates_on) |
| Remote API returns empty results | Try broader query; check /api/proxy/jobs/stats; CLI auto-falls-back to OpenAlex |
| knows.academy search failed: fetch failed (stderr) | Platform timeout; fallback to OpenAlex is automatic; retry later for sidecars |
| OpenAlex 403/429 | Set OPENALEX_API_KEY (see resources/setup-openalex.md) |
| YAML won't parse | Check indentation; numbers/booleans must be unquoted; strings with `:` need quotes |

References

  • Execution steps: resources/execution-protocol.md
  • Sidecar spec rules: resources/sidecar-spec.md
  • API endpoints: resources/api-endpoints.md
  • OpenAlex setup: resources/setup-openalex.md
  • Upstream spec snapshot: resources/upstream-spec-cache.md
  • Post-generation checklist: resources/checklist.md
  • CLI subcommands: oma scholar search|resolve|get|lint (implementation under cli/commands/scholar/)
  • Context loading: ../_shared/core/context-loading.md
  • Quality principles: ../_shared/core/quality-principles.md
  • i18n rules: ../../rules/i18n-guide.md