Gravitational Wave Data Conditioning
Data conditioning is essential before matched filtering. Raw gravitational wave detector data contains low-frequency noise, instrumental artifacts, and needs proper sampling rates for computational efficiency.
Overview
The conditioning pipeline typically involves:
- High-pass filtering (remove low-frequency noise below ~15 Hz)
- Resampling (downsample to appropriate sampling rate)
- Crop filter wraparound (remove edge artifacts from filtering)
- PSD estimation (calculate power spectral density for matched filtering)
High-Pass Filtering
Remove low-frequency noise and instrumental artifacts:
from pycbc.filter import highpass
# High-pass filter at 15 Hz (typical for LIGO/Virgo data)
strain_filtered = highpass(strain, 15.0)
# Common cutoff frequencies:
# 15 Hz: Standard for ground-based detectors
# 20 Hz: Higher cutoff, more aggressive noise removal
# 10 Hz: Lower cutoff, preserves more low-frequency content
Why 15 Hz? Ground-based detectors like LIGO/Virgo have significant low-frequency noise. High-pass filtering removes this noise while preserving the gravitational wave signal (typically >20 Hz for binary mergers).
Resampling
Downsample the data to reduce computational cost:
from pycbc.filter import resample_to_delta_t
# Resample to 2048 Hz (common for matched filtering)
delta_t = 1.0 / 2048
strain_resampled = resample_to_delta_t(strain_filtered, delta_t)
# Or to 4096 Hz for higher resolution
delta_t = 1.0 / 4096
strain_resampled = resample_to_delta_t(strain_filtered, delta_t)
# Common sampling rates:
# 2048 Hz: Standard, computationally efficient
# 4096 Hz: Higher resolution, better for high-mass systems
Note: Resampling should happen AFTER high-pass filtering to avoid aliasing. The Nyquist frequency (half the sampling rate) must be above the signal frequency of interest.
Crop Filter Wraparound
Remove edge artifacts introduced by filtering:
# Crop 2 seconds from both ends to remove filter wraparound
conditioned = strain_resampled.crop(2, 2)
# The crop() method removes time from start and end:
# crop(start_seconds, end_seconds)
# Common values: 2-4 seconds on each end
# Verify the duration
print(f"Original duration: {strain_resampled.duration} s")
print(f"Cropped duration: {conditioned.duration} s")
Why crop? Digital filters introduce artifacts at the edges of the time series. These artifacts can cause false triggers in matched filtering.
Power Spectral Density (PSD) Estimation
Calculate the PSD needed for matched filtering:
from pycbc.psd import interpolate, inverse_spectrum_truncation
# Estimate PSD using Welch's method
# seg_len: segment length in seconds (typically 4 seconds)
psd = conditioned.psd(4)
# Interpolate PSD to match data frequency resolution
psd = interpolate(psd, conditioned.delta_f)
# Inverse spectrum truncation for numerical stability
# This limits the effective filter length
psd = inverse_spectrum_truncation(
psd,
int(4 * conditioned.sample_rate),
low_frequency_cutoff=15
)
# Check PSD properties
print(f"PSD length: {len(psd)}")
print(f"PSD delta_f: {psd.delta_f}")
print(f"PSD frequency range: {psd.sample_frequencies[0]:.2f} - {psd.sample_frequencies[-1]:.2f} Hz")
PSD Parameters Explained
- Segment length (4 seconds): Longer segments give better frequency resolution but fewer averages. 4 seconds is a good balance.
- Low frequency cutoff (15 Hz): Should match your high-pass filter cutoff. Frequencies below this are not well-characterized.
Best Practices
- Always high-pass filter first: Remove low-frequency noise before resampling
- Choose appropriate sampling rate: 2048 Hz is standard, 4096 Hz for high-mass systems
- Crop enough time: 2 seconds is minimum, but may need more for longer templates
- Match PSD cutoff to filter: PSD low-frequency cutoff should match high-pass filter frequency
- Verify data quality: Plot the conditioned strain to check for issues
Dependencies
pip install pycbc
References
- PyCBC Tutorial: Waveform & Matched Filter - Practical tutorial covering data conditioning, PSD estimation, and matched filtering
- PyCBC Filter Documentation
- PyCBC PSD Documentation
Common Issues
Problem: PSD estimation fails with "must contain at least one sample" error
- Solution: Ensure data is long enough after cropping (need several segments for Welch method)
Problem: Filter wraparound artifacts in matched filtering
- Solution: Increase crop amount or check that filtering happened before cropping
Problem: Poor SNR due to low-frequency noise
- Solution: Increase high-pass filter cutoff frequency or check PSD inverse spectrum truncation