dipspeaks – Key Public Functions

clump_candidates

Identify clumped dips between high- and low-energy features based on overlap and relative prominence.

Parameters

high_dips : pandas.DataFrame

Columns required:

  • ti – start time

  • te – end time

  • relprominence – relative prominence

  • t – representative time (for plots)

low_dips : pandas.DataFrame

Same columns as high_dips.

lc : pandas.DataFrame

Light-curve context; needs t (time) and c (rate).

overlap_threshold : float, default 0.75

Minimum fractional overlap, required in both directions (relative to the high-energy feature and relative to the low-energy feature).

bin_number : int, default 100

Number of histogram bins when plotting.

show_plot : bool, default False

Show diagnostic histograms if True.

Returns

high_clump : pandas.DataFrame

Subset of high_dips that meet the criteria.

low_clump : pandas.DataFrame

Matching subset from low_dips.
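The "both directions" requirement on overlap_threshold can be illustrated with a minimal sketch. It assumes the per-pair overlap ratios are already available as arrays and that the comparison is inclusive (>=); the function name select_mutual_overlaps is hypothetical and not part of dipspeaks, and the relative-prominence criterion is not reproduced here.

```python
import numpy as np

def select_mutual_overlaps(high_idx, low_idx, r_high, r_low, threshold=0.75):
    """Keep candidate pairs whose overlap ratio meets the threshold in both directions.

    Hypothetical helper for illustration; not the dipspeaks implementation.
    """
    high_idx = np.asarray(high_idx)
    low_idx = np.asarray(low_idx)
    r_high = np.asarray(r_high, dtype=float)
    r_low = np.asarray(r_low, dtype=float)
    # A pair survives only if BOTH ratios reach the threshold (inclusive: an assumption).
    keep = (r_high >= threshold) & (r_low >= threshold)
    return high_idx[keep], low_idx[keep]

# Three candidate pairs; only the first passes in both directions.
h, l = select_mutual_overlaps(
    [0, 1, 2], [0, 0, 3],
    [0.9, 0.8, 0.5], [0.8, 0.3, 0.95],
)
```

Requiring both ratios avoids pairing a short feature with a much longer one that merely contains it.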

filter_dip_peak

Filter detected dips/peaks by reconstruction-error percentile and z-score, optionally plot them, and estimate the probability that the surviving features are real rather than noise.

Parameters

dataset : pandas.DataFrame

Must contain error_percentile, zscores, pos, t.

simdataset : pandas.DataFrame

Synthetic (noise-only) features; needs t.

lc_reb : object

Re-binned light curve with attributes t and c.

error_percentile_threshold : float, default 0.9

Keep features above this percentile.

zscore_threshold : float, default 4

Keep features with z-score ≥ threshold.

show_plot : bool, default True

Overlay surviving features on the light curve.

Returns

pandas.DataFrame

Filtered subset of dataset (index reset).

Side effect – prints the probability of observing that event rate in noise.
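The two threshold cuts described above can be sketched as a simple boolean mask. This assumes inclusive comparisons for both thresholds and omits the plotting and the noise-probability estimate; filter_features is a hypothetical name, not the library function.

```python
import pandas as pd

def filter_features(dataset, error_percentile_threshold=0.9, zscore_threshold=4):
    """Keep rows passing both the error-percentile and the z-score cut.

    Illustrative sketch only; inclusive comparisons are an assumption.
    """
    keep = (
        (dataset["error_percentile"] >= error_percentile_threshold)
        & (dataset["zscores"] >= zscore_threshold)
    )
    # Reset the index so the result is numbered 0..n-1, as the Returns entry describes.
    return dataset[keep].reset_index(drop=True)

demo = pd.DataFrame({
    "t": [10.0, 20.0, 30.0, 40.0],
    "pos": [1, 2, 3, 4],
    "error_percentile": [0.95, 0.99, 0.50, 0.97],
    "zscores": [5.0, 2.0, 6.0, 4.5],
})
filtered = filter_features(demo)
```

Only the first and last rows pass both cuts in this toy table; the second fails on z-score and the third on error percentile.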

gmm_dips_peaks

Perform Gaussian-mixture clustering on dip/peak data.

Parameters

good_pd : pandas.DataFrame

Data to cluster.

log_scale : bool

If True, apply log10 to the features before clustering.

Returns

cluster_stats_df : pandas.DataFrame

Per-cluster statistics.

cluster_labels : numpy.ndarray

Cluster label for every row in good_pd.
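As a rough illustration of Gaussian-mixture clustering with an optional log10 transform, the sketch below uses scikit-learn's GaussianMixture. The choice of scikit-learn, the number of components, the feature columns (duration, prominence), and the statistics reported are all assumptions made for the example; the source does not say how gmm_dips_peaks picks any of them.

```python
import numpy as np
import pandas as pd
from sklearn.mixture import GaussianMixture

def gmm_cluster(good_pd, columns, n_components=2, log_scale=True, random_state=0):
    """Cluster the selected columns with a Gaussian mixture (illustrative sketch)."""
    X = good_pd[columns].to_numpy(dtype=float)
    if log_scale:
        X = np.log10(X)  # requires strictly positive feature values
    gmm = GaussianMixture(n_components=n_components, random_state=random_state)
    labels = gmm.fit_predict(X)
    # Per-cluster summary of the original (untransformed) features.
    stats = good_pd[columns].groupby(labels).agg(["mean", "std", "count"])
    return stats, labels

# Hypothetical columns: two groups separated by two orders of magnitude.
demo = pd.DataFrame({
    "duration": [1.0, 1.2, 0.9, 1.1, 100.0, 120.0, 90.0, 110.0],
    "prominence": [0.10, 0.12, 0.09, 0.11, 10.0, 12.0, 9.0, 11.0],
})
stats, labels = gmm_cluster(demo, ["duration", "prominence"])
```

The log transform matters when features span several decades, since otherwise the largest values dominate the fitted covariances.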

overlap

Compute pair-wise overlap durations and ratios between two sets of features (dips or peaks).

Parameters

high : pandas.DataFrame

Columns ti and te.

low : pandas.DataFrame

Same two columns.

Returns

overlap_durations : numpy.ndarray

high_indices : numpy.ndarray

low_indices : numpy.ndarray

high_overlap_ratio : numpy.ndarray

low_overlap_ratio : numpy.ndarray

Math definitions

\[
r_{\text{high}} = \frac{\text{overlap}}{\text{high.te} - \text{high.ti}}
\qquad
r_{\text{low}} = \frac{\text{overlap}}{\text{low.te} - \text{low.ti}}
\]
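The ratio definitions can be sketched with NumPy broadcasting. This assumes the overlap of two intervals is min(te) - max(ti) and that only pairs with strictly positive overlap are reported; pairwise_overlap is a hypothetical name, and the actual function may order or filter its outputs differently.

```python
import numpy as np
import pandas as pd

def pairwise_overlap(high, low):
    """Overlap duration and ratios for every overlapping (high, low) pair (sketch)."""
    h_ti = high["ti"].to_numpy(dtype=float)[:, None]  # shape (n_high, 1)
    h_te = high["te"].to_numpy(dtype=float)[:, None]
    l_ti = low["ti"].to_numpy(dtype=float)[None, :]   # shape (1, n_low)
    l_te = low["te"].to_numpy(dtype=float)[None, :]
    # Interval intersection length; negative or zero means no overlap.
    dur = np.minimum(h_te, l_te) - np.maximum(h_ti, l_ti)
    high_idx, low_idx = np.nonzero(dur > 0)
    overlap = dur[high_idx, low_idx]
    r_high = overlap / (h_te - h_ti)[high_idx, 0]
    r_low = overlap / (l_te - l_ti)[0, low_idx]
    return overlap, high_idx, low_idx, r_high, r_low

high = pd.DataFrame({"ti": [0.0, 20.0], "te": [10.0, 30.0]})
low = pd.DataFrame({"ti": [5.0, 40.0], "te": [15.0, 50.0]})
overlap, hi_idx, lo_idx, r_high, r_low = pairwise_overlap(high, low)
```

Here only the first high interval (0 to 10) and the first low interval (5 to 15) intersect, for 5 time units, which is half of each interval's length.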

detect_dips_and_peaks

Detect dips and peaks in a light curve via S/N thresholding, synthetic-data generation, and autoencoder-based anomaly detection.

Parameters

lc : str

Path to the input light-curve text file.

snr : float, default 0.15

index_time : int, default 0

index_rate : int, default 1

index_error_rate : int, default 2

num_simulations : int, default 1

show_plot : bool, default True

Returns

peaks_to_clean : pandas.DataFrame

dips_to_clean : pandas.DataFrame

lcreb : pandas.DataFrame

speaks_to_clean : pandas.DataFrame

sdips_to_clean : pandas.DataFrame
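The index_time, index_rate, and index_error_rate parameters suggest that columns of the text file are selected by position. A minimal sketch of that idea follows, assuming a whitespace-delimited numeric file; load_light_curve is a hypothetical helper, and detect_dips_and_peaks may parse its input differently.

```python
import io
import numpy as np

def load_light_curve(source, index_time=0, index_rate=1, index_error_rate=2):
    """Read a numeric text table and pick time, rate, and rate-error columns (sketch)."""
    # ndmin=2 keeps a 2-D array even when the file has a single row.
    data = np.loadtxt(source, ndmin=2)
    return data[:, index_time], data[:, index_rate], data[:, index_error_rate]

# An in-memory stand-in for a light-curve file with three columns.
sample = io.StringIO("0.0 10.0 1.0\n1.0 12.0 1.1\n2.0 9.0 0.9\n")
t, c, sc = load_light_curve(sample)
```

Passing different index values lets the same reader handle files whose columns are in another order.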

rebin_snr

Re-bin a signal to achieve a target S/N threshold.

Parameters

t : array-like (time)

x : array-like (signal)

sy : array-like (uncertainty)

snr_threshold : float

Returns

t_new : array-like

c_new : array-like

sc_new : array-like
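One common way to re-bin to a target S/N is to merge consecutive samples until the bin mean divided by its propagated uncertainty reaches the threshold. The sketch below follows that approach as an assumption; the source does not describe the algorithm rebin_snr uses, and the name rebin_to_snr, the unweighted mean, and the dropping of an incomplete trailing bin are illustrative choices.

```python
import numpy as np

def rebin_to_snr(t, x, sy, snr_threshold):
    """Greedily merge consecutive samples until each bin reaches the target S/N (sketch)."""
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    sy = np.asarray(sy, dtype=float)
    t_new, c_new, sc_new = [], [], []
    start = 0
    for stop in range(1, len(t) + 1):
        n = stop - start
        mean = x[start:stop].mean()
        # Uncertainty of an unweighted mean of n independent samples.
        err = np.sqrt(np.sum(sy[start:stop] ** 2)) / n
        if err > 0 and abs(mean) / err >= snr_threshold:
            t_new.append(t[start:stop].mean())
            c_new.append(mean)
            sc_new.append(err)
            start = stop  # begin a new bin
    # Trailing samples that never reach the threshold are dropped (assumption).
    return np.array(t_new), np.array(c_new), np.array(sc_new)

t_new, c_new, sc_new = rebin_to_snr(
    [0.0, 1.0, 2.0, 3.0], [4.0, 4.0, 4.0, 4.0], [2.0, 2.0, 2.0, 2.0], 2.5
)
```

Each single sample here has S/N = 2, below the 2.5 target, so samples are merged in pairs, giving two bins centred at t = 0.5 and t = 2.5.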

scale

Linearly scale x so that its range matches y (useful for overlays).

Parameters

x : array-like

y : array-like

Returns

x_new : array-like
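Matching the range of x to that of y amounts to a min-max linear map. The sketch below is one way to write it, with scale_to_range as a hypothetical name and the handling of a constant x as an assumption; the library's scale function may differ in edge cases.

```python
import numpy as np

def scale_to_range(x, y):
    """Linearly map x so that its minimum and maximum match those of y (sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_min, x_max = x.min(), x.max()
    y_min, y_max = y.min(), y.max()
    if x_max == x_min:
        # Constant input has no range to stretch; pin it to y's minimum (assumption).
        return np.full_like(x, y_min)
    return y_min + (x - x_min) * (y_max - y_min) / (x_max - x_min)

x_new = scale_to_range([0.0, 5.0, 10.0], [100.0, 300.0])
```

In this example the values 0, 5, 10 are mapped to 100, 200, 300, so a curve in arbitrary units can be overlaid on one in count-rate units.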