pandas-dataframe-analyzer Skill

pandas-dataframe-analyzer

Overview

Automated DataFrame analysis skill for statistical summaries, missing value detection, data type inference, and memory optimization recommendations using pandas and profiling libraries.

Capabilities

Statistical profiling of DataFrames
Missing value pattern detection
Data type optimization suggestions
Memory footprint analysis
Duplicate detection and handling
Distribution analysis and visualization
Correlation matrix computation
Cardinality analysis for categorical features

Target Processes

Exploratory Data Analysis (EDA) Pipeline
Data Collection and Validation Pipeline
Feature Engineering Design and Implementation

Tools and Libraries

pandas
pandas-profiling / ydata-profiling
numpy
scipy (for statistical tests)

Input Schema

{
  "type": "object",
  "required": ["dataPath"],
  "properties": {
    "dataPath": {
      "type": "string",
      "description": "Path to the data file (CSV, Parquet, JSON)"
    },
    "sampleSize": {
      "type": "integer",
      "description": "Number of rows to sample for analysis",
      "default": 10000
    },
    "profileType": {
      "type": "string",
      "enum": ["minimal", "standard", "full"],
      "default": "standard"
    },
    "outputFormat": {
      "type": "string",
      "enum": ["json", "html", "markdown"],
      "default": "json"
    }
  }
}

Output Schema

{
  "type": "object",
  "required": ["summary", "columns", "recommendations"],
  "properties": {
    "summary": {
      "type": "object",
      "properties": {
        "rowCount": { "type": "integer" },
        "columnCount": { "type": "integer" },
        "memoryUsageMB": { "type": "number" },
        "duplicateRows": { "type": "integer" },
        "missingCells": { "type": "integer" },
        "missingCellsPercent": { "type": "number" }
      }
    },
    "columns": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "dtype": { "type": "string" },
          "nullCount": { "type": "integer" },
          "uniqueCount": { "type": "integer" },
          "stats": { "type": "object" }
        }
      }
    },
    "recommendations": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "type": { "type": "string" },
          "column": { "type": "string" },
          "suggestion": { "type": "string" },
          "impact": { "type": "string" }
        }
      }
    }
  }
}

Usage Example

{
  kind: 'skill',
  title: 'Analyze training dataset',
  skill: {
    name: 'pandas-dataframe-analyzer',
    context: {
      dataPath: 'data/train.csv',
      profileType: 'full',
      outputFormat: 'json'
    }
  }
}

Agent Skills: pandas-dataframe-analyzer

Install this agent skill to your local

Skill Files

pandas-dataframe-analyzer

Overview

Capabilities

Target Processes

Tools and Libraries

Input Schema

Output Schema

Usage Example