Overview
NumPy's memory layout is built on the concept of "strides." An array's strides give, for each axis, the number of bytes to skip in the flat 1D data buffer to reach the next element along that axis. Understanding this enables zero-copy operations such as transposition and overlapping sliding windows.
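As a minimal sketch of the idea, the snippet below computes the byte offset of one element from the strides and checks it against normal indexing. The helper name `byte_offset` is hypothetical and used only for illustration; the dtype is fixed to `int32` so the stride values are the same on every platform.

```python
import numpy as np

# A 3x4 array of 4-byte integers in the default C (row-major) layout.
arr = np.arange(12, dtype=np.int32).reshape(3, 4)

print(arr.strides)  # (16, 4): 16 bytes to the next row, 4 bytes to the next column


def byte_offset(index, strides):
    """Byte offset of `index` from the start of the buffer (hypothetical helper)."""
    return sum(i * s for i, s in zip(index, strides))


offset = byte_offset((1, 2), arr.strides)  # 1*16 + 2*4 = 24 bytes

# Converting the byte offset to a flat element position recovers arr[1, 2].
flat_index = offset // arr.itemsize        # 24 // 4 = 6
value = arr.ravel()[flat_index]
print(offset, value, arr[1, 2])
```

The same arithmetic applies to any number of dimensions: the offset is the sum of each index multiplied by the stride of its axis.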
When to Use
- Optimizing high-performance code for CPU cache locality.
- Creating overlapping sliding windows for signal processing without duplicating data.
- Interfacing with libraries that require specific memory orders (e.g., BLAS/Fortran).
- Changing an array's shape or axis order without incurring the cost of copying data.
Decision Tree
- Need to optimize for row-wise processing?
  - Use C-order (row-major). Smallest stride is on the last axis.
- Interfacing with legacy Fortran or BLAS?
  - Use Fortran-order (column-major). Smallest stride is on the first axis.
- Want to create a sliding window view?
  - Use np.lib.stride_tricks.as_strided (use with caution).
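The two memory orders in the decision tree can be compared directly. This is an illustrative sketch, assuming a small `float64` array; `np.ascontiguousarray` and `np.asfortranarray` are standard NumPy functions, while the variable names are arbitrary.

```python
import numpy as np

a = np.arange(12, dtype=np.float64).reshape(3, 4)  # C-order by default

c_arr = np.ascontiguousarray(a)  # row-major; no copy needed, `a` is already C-contiguous
f_arr = np.asfortranarray(a)     # column-major copy holding the same values

print(c_arr.strides)  # (32, 8): the last axis has the smallest stride
print(f_arr.strides)  # (8, 24): the first axis has the smallest stride

print(c_arr.flags['C_CONTIGUOUS'], f_arr.flags['F_CONTIGUOUS'])  # True True

# Same logical values, different physical layout.
same_values = np.array_equal(c_arr, f_arr)
print(same_values)  # True
```

Indexing behaves identically for both arrays; only the order in which elements sit in memory differs, which is what matters for cache behaviour and for libraries that expect a particular layout.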
Workflows
- Analyzing Memory Locality
  - Check arr.strides and arr.itemsize.
  - Identify the axis with the smallest stride (fastest changing in memory).
  - Reorder loops or axes to ensure data access follows the smallest stride for cache efficiency.
- Zero-Copy Transposition
  - Call arr.T or arr.transpose().
  - Inspect the strides of the result.
  - Note that the underlying data buffer remains identical; only the shape and stride metadata were swapped.
- Creating Sliding Windows without Copies
  - Identify the source array (or a segment of it) and the window length.
  - Use np.lib.stride_tricks.as_strided with a custom shape and stride tuple to create overlapping windows.
  - Perform vectorized operations on the windowed view (warning: windows overlap in memory, so writes through the view can alter several windows at once).
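The three workflows above can be sketched in one short script. This is an illustration under assumed array sizes, not the contents of the bundled scripts. The `as_strided` view is created with `writeable=False` as a precaution, and `sliding_window_view` (available in NumPy 1.20 and later) is shown as a bounds-checked alternative for the same windows.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view

# 1. Memory locality: find the axis whose neighbours are closest in memory.
arr = np.zeros((100, 200), dtype=np.float64)
print(arr.strides, arr.itemsize)                    # (1600, 8) 8
fastest_axis = int(np.argmin(np.abs(arr.strides)))  # 1 for a C-ordered array

# 2. Zero-copy transposition: the metadata is swapped, the buffer is shared.
t = arr.T
print(t.strides)                                    # (8, 1600)
shares_buffer = np.shares_memory(arr, t)            # True

# 3. Overlapping sliding windows without copying the data.
signal = np.arange(10, dtype=np.float64)
window = 4
n_windows = signal.size - window + 1                # 7 windows of length 4
step = signal.strides[0]                            # bytes between neighbours

# Both axes advance by one element, so consecutive rows overlap.
# as_strided does not validate shape or strides; wrong values read
# arbitrary memory, hence the read-only flag and the explicit arithmetic.
windows = as_strided(
    signal,
    shape=(n_windows, window),
    strides=(step, step),
    writeable=False,
)
moving_avg = windows.mean(axis=1)

# Bounds-checked alternative producing the same read-only view.
safe_windows = sliding_window_view(signal, window)
print(np.array_equal(windows, safe_windows))        # True
```

For plain fixed-size windows, `sliding_window_view` is usually the easier choice; `as_strided` remains useful when the required layout cannot be expressed that way.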
Non-Obvious Insights
- Metadata Logic: Changing an array's shape or transposing it often changes only the strides and shape metadata, leaving the data buffer untouched.
- Cache Performance: Iterating over an axis with a large stride is slow because elements are spaced far apart in memory, causing CPU cache misses.
- Strides Modification: Directly setting arr.strides is unsafe and discouraged; as_strided is the preferred, safer interface for advanced memory manipulation, although it performs no bounds checking on the shape and strides it is given.
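The word "often" in the first insight matters: some reshapes cannot be expressed by new strides alone, and NumPy then copies silently. A small sketch, using `np.shares_memory` to tell a view from a copy:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Reshaping a contiguous array only rewrites the shape and stride metadata.
r = a.reshape(4, 3)
print(np.shares_memory(a, r))        # True: r is a view

# The transpose is also a view, but it is no longer C-contiguous.
print(a.T.flags['C_CONTIGUOUS'], a.T.flags['F_CONTIGUOUS'])  # False True

# Flattening the transposed view needs elements in an order that no single
# stride can describe, so NumPy has to copy the data.
flat_t = a.T.reshape(-1)
print(np.shares_memory(a, flat_t))   # False: flat_t is a copy
```

Checking `np.shares_memory` (or the contiguity flags) before a hot loop is a cheap way to confirm that an operation assumed to be zero-copy really is.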
Evidence
- "The strides of an array tell us how many bytes we have to skip in memory to move to the next position along a certain axis." Source
- "The shape of the array can be changed very easily without changing anything in the data buffer or any data copying at all." Source
Scripts
- scripts/numpy-memory_tool.py: Tools for stride analysis and zero-copy window creation.
- scripts/numpy-memory_tool.js: Simulated byte-offset calculator.
Dependencies
- numpy (Python)