Agent Skills: numpy-indexing

Advanced indexing techniques including slicing, fancy indexing, and boolean masks, along with memory implications of views vs. copies. Triggers: indexing, slicing, fancy indexing, boolean mask, np.where, np.ix_.


Install this agent skill locally:

pnpm dlx add-skill https://github.com/cuba6112/skillfactory/tree/HEAD/skills/numpy-indexing

skills/numpy-indexing/SKILL.md

Overview

Indexing in NumPy ranges from basic slicing, which returns a zero-copy view, to advanced ("fancy") indexing with integer or boolean arrays, which always returns a copy. Understanding the distinction is vital for managing memory and for avoiding unintended side effects, such as modifying the original array through a view.
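The view/copy distinction can be checked directly. The following is a minimal sketch, not part of the skill's own scripts; the array contents and variable names are illustrative assumptions.

```python
import numpy as np

arr = np.arange(10)

view = arr[2:6]             # basic slicing: shares memory with arr
fancy = arr[[2, 3, 4, 5]]   # advanced indexing: allocates a new buffer

view[0] = -1                # writes through to arr[2]
fancy[1] = -2               # changes only the copy; arr[3] is untouched

print(np.shares_memory(arr, view))    # True
print(np.shares_memory(arr, fancy))   # False
print(arr[2], arr[3])                 # -1 3
```

np.shares_memory is a convenient way to confirm which kind of result an indexing expression produced before relying on it.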

When to Use

  • Extracting sub-regions of arrays for processing.
  • Filtering data based on complex conditional logic (boolean masking).
  • Selecting arbitrary elements using coordinate lists.
  • Managing memory when dealing with large datasets that have small regions of interest.

Decision Tree

  1. Do you need a view or a copy?
    • View: Use basic slicing (arr[0:5]).
    • Copy: Use advanced indexing (arr[[0, 1, 2]]) or .copy().
  2. Are you filtering by value?
    • Use a boolean mask: arr[arr > threshold].
  3. Selecting a grid of values across axes?
    • Use np.ix_ to construct the selection mesh.
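For the third branch, it may help to see why np.ix_ is needed at all. A small sketch, assuming an illustrative 4x4 array: passing two index lists directly pairs them element by element, while np.ix_ selects every row/column combination.

```python
import numpy as np

arr = np.arange(16).reshape(4, 4)
rows = [0, 2]
cols = [1, 3]

# Plain fancy indexing pairs the lists: elements (0, 1) and (2, 3).
pairs = arr[rows, cols]

# np.ix_ builds an open mesh, so every row is combined with every column.
grid = arr[np.ix_(rows, cols)]

print(pairs)        # [ 1 11]
print(grid.shape)   # (2, 2)
```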

Workflows

  1. Filtering Data with Boolean Masks

    • Apply a comparison operator (e.g., x > 0) to an array to create a boolean mask.
    • Pass the mask into the array's indexing brackets: x[mask].
    • Operate on the resulting array (note that this is a copy, not a view).
  2. Memory-Efficient Sub-array Extraction

    • Slice a small portion from a large ndarray.
    • Call .copy() on the slice to create a new independent array.
    • Delete the reference to the original large array (for example with del) so that its memory can be freed once no other references or views remain.
  3. Cross-Axis Selection with np.ix_

    • Define row indices and column indices as separate lists.
    • Pass them into np.ix_ to construct the appropriate broadcasting meshes.
    • Apply the resulting objects to the array to select a sub-grid of values.
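The three workflows above can be sketched as follows. This is an illustration under stated assumptions, not the skill's bundled script: the sample data, the array sizes, and names such as roi and table are hypothetical.

```python
import numpy as np

# Workflow 1: filter with a boolean mask.
x = np.array([3, -1, 4, -1, 5, -9])
mask = x > 0                 # boolean array, same shape as x
positives = x[mask]          # a copy, so edits do not reach x
positives *= 10

# Workflow 2: keep a small region, release the large array.
big = np.zeros((1000, 1000))     # stand-in for a large dataset
roi = big[:10, :10].copy()       # independent array that owns its data
del big                          # last reference gone; the buffer can be freed

# Workflow 3: cross-axis selection with np.ix_.
table = np.arange(20).reshape(4, 5)
row_idx, col_idx = np.ix_([0, 3], [1, 4])   # shapes (2, 1) and (1, 2)
sub = table[row_idx, col_idx]               # broadcasts to a (2, 2) sub-grid

print(positives)   # [30 40 50]
print(roi.base)    # None
print(sub)
```

In workflow 2, del only removes the name; the memory is released because no other reference or view of the large array exists at that point.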

Non-Obvious Insights

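  • Memory Leak Risks: A view holds a reference to its base array, so even a small view keeps the entire base buffer alive; copy small slices of massive data when the rest of the array is no longer needed.
  • Copy vs. View Rule: Basic slicing always returns a view; advanced indexing (with integer or boolean arrays, or non-tuple sequences such as lists) always returns a copy. Assignment through an advanced index (arr[mask] = 0) still modifies the original array in place.
  • Adjacent Indexing: When basic and advanced indexing are mixed, the result shape depends on whether the advanced indices are adjacent in the index tuple. Adjacent advanced indices put the broadcast dimensions where the indexed axes were; advanced indices separated by a slice move the broadcast dimensions to the front of the result.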
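Two of these points are easy to observe in a short session. The sketch below assumes illustrative array shapes and names; it inspects the base attribute to show what a view keeps alive, then compares result shapes for adjacent and separated advanced indices.

```python
import numpy as np

# A view references its base; a copy does not.
big = np.zeros(1_000_000)
small_view = big[:5]
small_copy = big[:5].copy()
print(small_view.base is big)    # True: the view keeps big alive
print(small_copy.base is None)   # True: the copy owns its data

# Mixing slices with advanced indices on a 4-D array.
a = np.zeros((3, 4, 5, 6))
idx = [0, 1]

adjacent = a[:, idx, idx, :]     # advanced indices on neighbouring axes 1 and 2
separated = a[:, idx, :, idx]    # advanced indices on axes 1 and 3, split by a slice

print(adjacent.shape)    # (3, 2, 6): broadcast dimension stays in place
print(separated.shape)   # (2, 3, 5): broadcast dimension moves to the front
```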

Evidence

  • "All arrays generated by basic slicing are always views of the original array." (NumPy documentation, "Indexing on ndarrays")
  • "Advanced indexing always returns a copy of the data (contrast with basic slicing that returns a view)." (NumPy documentation, "Indexing on ndarrays")

Scripts

  • scripts/numpy-indexing_tool.py: Demonstrates boolean masking and sub-array extraction.
  • scripts/numpy-indexing_tool.js: Simulated coordinate selection logic.

Dependencies

  • numpy (Python)
