Agent Skills: Great Expectations

Data validation using Great Expectations. Expectation suites, checkpoints, and data docs for pipeline monitoring.

Uncategorized
ID: majesticlabs-dev/majestic-marketplace/great-expectations

Install this agent skill locally:

pnpm dlx add-skill https://github.com/majesticlabs-dev/majestic-marketplace/tree/HEAD/plugins/majestic-data/skills/great-expectations

plugins/majestic-data/skills/great-expectations/SKILL.md

Skill Metadata

Name: great-expectations
Description: Data validation using Great Expectations. Expectation suites, checkpoints, and data docs for pipeline monitoring.

Great Expectations

Audience: Data engineers building validated data pipelines.

Goal: Provide Great Expectations (GX) patterns for expectation-based validation and monitoring.

Scripts

Import the GX helper functions from scripts/expectations.py:

from scripts.expectations import (
    get_pandas_context,
    add_dataframe_asset,
    create_basic_suite,
    run_validation
)

Usage Examples

Quick Setup

from scripts.expectations import get_pandas_context, add_dataframe_asset

# `df` is assumed to be a pandas DataFrame loaded earlier in the pipeline.
context, datasource = get_pandas_context("my_datasource")
batch_request = add_dataframe_asset(datasource, "users", df)

Create Expectation Suite

from scripts.expectations import create_basic_suite

columns_config = {
    'user_id': {'not_null': True, 'unique': True, 'type': 'int'},
    'age': {'min': 0, 'max': 150},
    'status': {'values': ['active', 'inactive', 'pending']},
    'email': {'regex': r'^[\w\.-]+@[\w\.-]+\.\w+$'}
}

suite = create_basic_suite(context, "user_suite", columns_config)
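
The implementation of `create_basic_suite` is not shown here. As a rough illustration of how a config like the one above could map onto GX expectation types, the sketch below expands each column entry into (expectation type, kwargs) pairs. The `expand_column_config` helper and the meaning assigned to each config key are assumptions; the expectation type names and keyword arguments follow GX's snake_case naming.

```python
from typing import Any


def expand_column_config(
    columns_config: dict[str, dict[str, Any]],
) -> list[tuple[str, dict[str, Any]]]:
    """Translate a per-column config into (expectation_type, kwargs) pairs.

    Hypothetical helper: the key-to-expectation mapping is an assumption
    about how create_basic_suite might interpret the config.
    """
    specs = []
    for column, rules in columns_config.items():
        if rules.get("not_null"):
            specs.append(("expect_column_values_to_not_be_null", {"column": column}))
        if rules.get("unique"):
            specs.append(("expect_column_values_to_be_unique", {"column": column}))
        if "type" in rules:
            specs.append(
                ("expect_column_values_to_be_of_type",
                 {"column": column, "type_": rules["type"]})
            )
        if "min" in rules or "max" in rules:
            specs.append(
                ("expect_column_values_to_be_between",
                 {"column": column,
                  "min_value": rules.get("min"),
                  "max_value": rules.get("max")})
            )
        if "values" in rules:
            specs.append(
                ("expect_column_values_to_be_in_set",
                 {"column": column, "value_set": rules["values"]})
            )
        if "regex" in rules:
            specs.append(
                ("expect_column_values_to_match_regex",
                 {"column": column, "regex": rules["regex"]})
            )
    return specs


# Expand a subset of the config shown above and list the resulting specs.
specs = expand_column_config({
    "user_id": {"not_null": True, "unique": True, "type": "int"},
    "age": {"min": 0, "max": 150},
})
for expectation_type, kwargs in specs:
    print(expectation_type, kwargs)
```

Keeping the config declarative like this means new columns can be covered by editing data rather than code, at the cost of supporting only the rule keys the helper knows about.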

Run Validation

from scripts.expectations import run_validation

results = run_validation(
    context,
    checkpoint_name="user_checkpoint",
    batch_request=batch_request,
    suite_name="user_suite"
)

if results['success']:
    print("All expectations passed!")
else:
    for failure in results['failures']:
        print(f"Failed: {failure['expectation']} on {failure['column']}")
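
When a suite contains many expectations, grouping failures by column can make the report easier to scan. The sketch below assumes the result dictionary has the shape used above (`success` plus a `failures` list whose items carry `expectation` and `column` keys). `group_failures_by_column` is a hypothetical helper, not part of scripts/expectations.py, and the sample data is illustrative only.

```python
from collections import defaultdict


def group_failures_by_column(results: dict) -> dict[str, list[str]]:
    """Group failed expectation names by the column they apply to."""
    grouped = defaultdict(list)
    for failure in results.get("failures", []):
        # Assumption: a failure without a column is a table-level check.
        column = failure.get("column") or "<table>"
        grouped[column].append(failure["expectation"])
    return dict(grouped)


# Example result in the assumed shape (made-up data, not real GX output).
example_results = {
    "success": False,
    "failures": [
        {"expectation": "expect_column_values_to_not_be_null", "column": "email"},
        {"expectation": "expect_column_values_to_match_regex", "column": "email"},
        {"expectation": "expect_column_values_to_be_between", "column": "age"},
    ],
}

for column, expectations in group_failures_by_column(example_results).items():
    print(f"{column}: {', '.join(expectations)}")
```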

Common Expectations Reference

| Category | Expectation | Description |
|----------|-------------|-------------|
| Table | ExpectTableRowCountToBeBetween | Row count range |
| Existence | ExpectColumnToExist | Column must exist |
| Nulls | ExpectColumnValuesToNotBeNull | No null values |
| Range | ExpectColumnValuesToBeBetween | Value bounds |
| Set | ExpectColumnValuesToBeInSet | Allowed values |
| Pattern | ExpectColumnValuesToMatchRegex | Regex match |
| Unique | ExpectColumnValuesToBeUnique | No duplicates |
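
To make the column-level semantics concrete, the following plain-Python sketch mirrors what several of these expectations check, without using GX itself. The function names are illustrative only, and the real expectations offer further options (for example a `mostly` threshold) that are not modelled here.

```python
import re


def values_not_null(values):
    """Every value is present (no None)."""
    return all(v is not None for v in values)


def values_between(values, min_value=None, max_value=None):
    """Every present value lies within the inclusive bounds."""
    # None is skipped here; null handling is left to values_not_null.
    present = [v for v in values if v is not None]
    return all(
        (min_value is None or v >= min_value)
        and (max_value is None or v <= max_value)
        for v in present
    )


def values_in_set(values, value_set):
    """Every present value is one of the allowed values."""
    allowed = set(value_set)
    return all(v in allowed for v in values if v is not None)


def values_match_regex(values, regex):
    """Every present value matches the regular expression."""
    pattern = re.compile(regex)
    return all(pattern.search(v) is not None for v in values if v is not None)


def values_unique(values):
    """No present value appears more than once."""
    present = [v for v in values if v is not None]
    return len(present) == len(set(present))


ages = [34, 0, 150, None]
print(values_not_null(ages))         # False: one value is missing
print(values_between(ages, 0, 150))  # True: present values are within bounds
```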

Data Docs

# Build and open HTML reports
context.build_data_docs()
context.open_data_docs()

Directory Structure

great_expectations/
├── great_expectations.yml     # Config
├── expectations/              # Expectation suites (JSON)
├── checkpoints/               # Checkpoint definitions
├── plugins/                   # Custom expectations
└── uncommitted/
    ├── data_docs/            # Generated HTML docs
    └── validations/          # Validation results
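
A pipeline may want to confirm that a project directory has this layout before running checkpoints. The sketch below checks for the top-level entries listed above; `missing_project_paths` is a hypothetical helper, and the default root name `great_expectations` is taken from the tree shown here.

```python
from pathlib import Path

# Top-level entries from the directory tree above.
EXPECTED_PATHS = [
    "great_expectations.yml",
    "expectations",
    "checkpoints",
    "plugins",
    "uncommitted",
]


def missing_project_paths(project_root="great_expectations"):
    """Return the expected entries that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]


missing = missing_project_paths()
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("Project layout looks complete.")
```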

When to Use Great Expectations

| Use Case | GX | Alternative |
|----------|-----|-------------|
| Pipeline monitoring | ✓ | - |
| Data warehouse validation | ✓ | - |
| Automated data docs | ✓ | - |
| Simple DataFrame checks | - | Pandera |
| Record-level API validation | - | Pydantic |

Dependencies

great_expectations>=0.18
pandas
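
Because the helpers depend on a minimum GX release, a pipeline may want to fail fast when an older version is installed. This is a minimal sketch using only the standard library; `meets_minimum` and `parse_major_minor` are hypothetical helpers, and the `(0, 18)` floor is taken from the requirement above.

```python
import re
from importlib import metadata

MINIMUM_VERSION = (0, 18)  # from the requirement great_expectations>=0.18


def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '0.18.8'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version, minimum=MINIMUM_VERSION):
    """True if the version is at or above the minimum (major, minor)."""
    return parse_major_minor(version) >= minimum


try:
    installed = metadata.version("great_expectations")
except metadata.PackageNotFoundError:
    print("great_expectations is not installed")
else:
    status = "OK" if meets_minimum(installed) else "too old"
    print(f"great_expectations {installed}: {status}")
```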