Agent Skills: Claude Scientific Skills

Comprehensive scientific research toolkit with 139 specialized skills for biology, chemistry, medicine, data science, and computational research. Transforms Claude into an AI research assistant with access to scientific databases, analysis tools, and domain-specific workflows.

UncategorizedID: oimiragieo/agent-studio/scientific-skills

Install this agent skill to your local

pnpm dlx add-skill https://github.com/oimiragieo/agent-studio/tree/HEAD/.claude/skills/scientific-skills

Skill Files

Browse the full folder contents for scientific-skills.

Download Skill

Loading file tree…

.claude/skills/scientific-skills/SKILL.md

Skill Metadata

Name
scientific-skills
Description
Comprehensive scientific research toolkit with 139 specialized skills for biology, chemistry, medicine, data science, and computational research. Transforms Claude into an AI research assistant with access to scientific databases, analysis tools, and domain-specific workflows.

Claude Scientific Skills

Overview

A comprehensive collection of 139 ready-to-use scientific skills that transform Claude into an AI research assistant capable of executing complex multi-step scientific workflows across biology, chemistry, medicine, and related fields.

When to Use

Invoke this skill when:

  • Working on scientific research tasks
  • Need access to specialized databases (PubMed, ChEMBL, UniProt, etc.)
  • Performing bioinformatics or cheminformatics analysis
  • Creating literature reviews or scientific documents
  • Analyzing single-cell RNA-seq, proteomics, or multi-omics data
  • Drug discovery and molecular analysis workflows
  • Statistical analysis and machine learning on scientific data

Quick Start

// Invoke the main skill catalog
Skill({ skill: 'scientific-skills' });

// Or invoke specific sub-skills directly
Skill({ skill: 'scientific-skills/rdkit' }); // Cheminformatics
Skill({ skill: 'scientific-skills/scanpy' }); // Single-cell analysis
Skill({ skill: 'scientific-skills/biopython' }); // Bioinformatics
Skill({ skill: 'scientific-skills/literature-review' }); // Literature review

Skill Categories

Scientific Databases (28+)

| Skill | Description | | ------------------------- | --------------------------------------- | | pubchem | Chemical compound database | | chembl-database | Bioactivity database for drug discovery | | uniprot-database | Protein sequence and function database | | pdb | Protein Data Bank structures | | drugbank-database | Drug and drug target information | | kegg | Pathway and genome database | | clinvar-database | Clinical variant interpretations | | cosmic-database | Cancer mutation database | | ensembl-database | Genome browser and annotations | | geo-database | Gene expression data | | gwas-database | Genome-wide association studies | | reactome-database | Biological pathways | | string-database | Protein-protein interactions | | alphafold-database | Protein structure predictions | | biorxiv-database | Preprint server for biology | | clinicaltrials-database | Clinical trial registry | | ena-database | European Nucleotide Archive | | fda-database | FDA drug approvals and labels | | gene-database | Gene information from NCBI | | zinc-database | Commercially available compounds | | brenda-database | Enzyme database | | clinpgx-database | Pharmacogenomics annotations | | uspto-database | Patent database |

Python Analysis Libraries (55+)

| Skill | Description | | ----------------------------------- | ---------------------------- | | rdkit | Cheminformatics toolkit | | scanpy | Single-cell RNA-seq analysis | | anndata | Annotated data matrices | | biopython | Computational biology tools | | pytorch-lightning | Deep learning framework | | scikit-learn | Machine learning library | | transformers | NLP and deep learning models | | pandas / polars / vaex | Data manipulation | | matplotlib / seaborn / plotly | Visualization | | deepchem | Deep learning for chemistry | | esm | Evolutionary Scale Modeling | | datamol | Molecular data processing | | pymatgen | Materials science | | qiskit | Quantum computing | | pymoo | Multi-objective optimization | | statsmodels | Statistical modeling | | sympy | Symbolic mathematics | | networkx | Network analysis | | geopandas | Geospatial analysis | | shap | Model explainability |

Bioinformatics & Genomics

| Skill | Description | | ------------------ | ------------------------------- | | gget | Gene and transcript information | | pysam | SAM/BAM file manipulation | | deeptools | NGS data analysis | | pydeseq2 | Differential expression | | scvi-tools | Deep learning for single-cell | | etetoolkit | Phylogenetic analysis | | scikit-bio | Bioinformatics algorithms | | bioservices | Web services for biology | | cellxgene-census | Cell atlas exploration |

Cheminformatics & Drug Discovery

| Skill | Description | | ----------- | ------------------------- | | rdkit | Molecular manipulation | | datamol | Molecular data handling | | molfeat | Molecular featurization | | diffdock | Molecular docking | | torchdrug | Drug discovery ML | | pytdc | Therapeutics data commons | | cobrapy | Metabolic modeling |

Scientific Communication

| Skill | Description | | ----------------------- | ----------------------------- | | literature-review | Systematic literature reviews | | scientific-writing | Academic writing assistance | | scientific-schematics | AI-generated figures | | scientific-slides | Presentation generation | | hypothesis-generation | Hypothesis development | | venue-templates | Journal-specific formatting | | citation-management | Reference management |

Clinical & Medical

| Skill | Description | | --------------------------- | ------------------------- | | clinical-decision-support | Clinical reasoning | | clinical-reports | Medical report generation | | treatment-plans | Treatment planning | | pyhealth | Healthcare ML | | pydicom | Medical imaging |

Laboratory & Integration

| Skill | Description | | ----------------------- | ------------------------ | | benchling-integration | Lab informatics platform | | dnanexus-integration | Genomics cloud platform | | pylabrobot | Laboratory automation | | flowio | Flow cytometry data | | omero-integration | Bioimaging platform |

Core Workflows

Literature Review Workflow

# 7-phase systematic literature review
# 1. Planning with PICO framework
# 2. Multi-database search execution
# 3. Screening with PRISMA flow
# 4. Data extraction and quality assessment
# 5. Thematic synthesis
# 6. Citation verification
# 7. PDF generation

Drug Discovery Workflow

# Using RDKit + ChEMBL + datamol
from rdkit import Chem
from rdkit.Chem import Descriptors, AllChem

# 1. Query ChEMBL for bioactivity data
# 2. Calculate molecular properties
# 3. Filter by drug-likeness (Lipinski)
# 4. Similarity screening
# 5. Substructure analysis

Single-Cell Analysis Workflow

# Using scanpy + anndata
import scanpy as sc

# 1. Load and QC data
# 2. Normalization and feature selection
# 3. Dimensionality reduction (PCA, UMAP)
# 4. Clustering (Leiden algorithm)
# 5. Marker gene identification
# 6. Cell type annotation

Hypothesis Generation Workflow

# 8-step systematic process
# 1. Understand phenomenon
# 2. Literature search
# 3. Synthesize evidence
# 4. Generate competing hypotheses
# 5. Evaluate quality
# 6. Design experiments
# 7. Formulate predictions
# 8. Generate report

Sub-Skill Structure

Each sub-skill follows a consistent structure:

scientific-skills/
├── SKILL.md                    # This file (catalog/index)
├── skills/                     # Individual skill directories
│   ├── rdkit/
│   │   ├── SKILL.md           # Skill documentation
│   │   ├── references/        # API references, patterns
│   │   └── scripts/           # Example scripts
│   ├── scanpy/
│   ├── biopython/
│   └── ... (139 total)

Invoking Sub-Skills

Direct Invocation

// Invoke specific skill
Skill({ skill: 'scientific-skills/rdkit' });
Skill({ skill: 'scientific-skills/scanpy' });

Chained Workflows

// Multi-skill workflow
Skill({ skill: 'scientific-skills/literature-review' });
Skill({ skill: 'scientific-skills/hypothesis-generation' });
Skill({ skill: 'scientific-skills/scientific-schematics' });

Prerequisites

  • Python 3.9+ (3.12+ recommended)
  • uv package manager (recommended)
  • Platform: macOS, Linux, or Windows with WSL2

Best Practices

  1. Start with the right skill: Use the category tables above to find appropriate skills
  2. Chain skills for complex workflows: Literature review → Hypothesis → Experiment design
  3. Use database skills for data access: Query databases before analysis
  4. Visualize results: Use matplotlib/seaborn/plotly skills for publication-quality figures
  5. Document findings: Use scientific-writing skill for formal documentation

Integration with Agent Framework

Recommended Agent Pairings

| Agent | Scientific Skills | | -------------------- | ------------------------------------- | | data-engineer | polars, dask, vaex, zarr-python | | python-pro | All Python-based skills | | database-architect | Database skills for schema design | | technical-writer | literature-review, scientific-writing |

Example Agent Spawn

Task({
  task_id: 'task-1',
  subagent_type: 'python-pro',
  description: 'Analyze molecular dataset with RDKit',
  prompt: `You are the PYTHON-PRO agent with scientific research expertise.

## Task
Analyze the molecular dataset for drug-likeness properties.

## Skills to Invoke
1. Skill({ skill: "scientific-skills/rdkit" })
2. Skill({ skill: "scientific-skills/datamol" })

## Workflow
1. Load molecular data
2. Calculate descriptors
3. Apply Lipinski filters
4. Generate visualization
5. Report findings
`,
});

Resources

Bundled Documentation

  • skills/*/SKILL.md - Individual skill documentation
  • skills/*/references/ - API references and patterns
  • skills/*/scripts/ - Example scripts and templates

External Resources

Iron Laws

  1. ALWAYS query scientific databases (PubMed, ChEMBL, UniProt) before performing any analysis — raw analysis without literature and database context produces uninformed conclusions that duplicate prior work.
  2. NEVER perform analysis without documenting all steps (data sources, parameters, library versions, transformations) — undocumented research is irreproducible and cannot be peer-reviewed or extended.
  3. ALWAYS chain multiple domain-specific skills for complex workflows — single-tool analysis misses interdependencies across biology, chemistry, and clinical domains.
  4. NEVER report findings without statistical validation — scientific claims require appropriately sized samples, validated methods, and quantified uncertainty.
  5. ALWAYS visualize intermediate results after each major processing step — data errors and outliers surface in visualizations before propagating silently to final conclusions.

Anti-Patterns

| Anti-Pattern | Why It Fails | Correct Approach | | ----------------------------------------------------------- | ----------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------- | | Performing analysis without querying databases first | Missing context from existing literature duplicates known work and misses prior art | Query PubMed/ChEMBL/UniProt before analysis to ground work in existing scientific knowledge | | Using a single tool for complex multi-domain analysis | Single-tool analysis misses domain boundary interdependencies | Chain multiple domain-specific skills (rdkit for chemistry, scanpy for single-cell, biopython for genomics) | | Skipping intermediate visualization during data processing | Errors and outliers propagate silently from preprocessing to final results | Visualize data distribution and quality metrics after each major transformation step | | Generating hypotheses without reviewing existing literature | Reinvents known solutions and ignores contradictory prior findings | Always invoke literature-review skill first; only generate hypotheses after reviewing existing evidence | | Reporting findings without documenting analysis provenance | Research cannot be reproduced, verified, or extended by other researchers | Log all data sources, version numbers, parameters, and transformation steps in the research report |

Memory Protocol (MANDATORY)

Before starting: Read .claude/context/memory/learnings.md

After completing:

  • New pattern → .claude/context/memory/learnings.md
  • Issue found → .claude/context/memory/issues.md
  • Decision made → .claude/context/memory/decisions.md

ASSUME INTERRUPTION: If it's not in memory, it didn't happen.

Version History

  • v2.17.0 - Current version with 139 skills
  • Integrated from K-Dense-AI/claude-scientific-skills repository

License

MIT License - Open source and freely available for research and commercial use.