draco.util.rfi

Collection of routines for RFI excision.

Functions

`sir`(basemask[, eta, only_freq, only_time])	Apply the SIR operator over the frequency and time axes for each product.
`sir1d`(basemask[, eta])	Numpy implementation of the scale-invariant rank (SIR) operator.
`sumthreshold`(data[, max_m, start_flag, ...])	SumThreshold outlier detection algorithm.
`sumthreshold_py`(data[, max_m, start_flag, ...])	SumThreshold outlier detection algorithm.

draco.util.rfi.sir(basemask, eta=0.2, only_freq=False, only_time=False)[source]

Apply the SIR operator over the frequency and time axes for each product.

This is a wrapper for sir1d. It loops over times, applying sir1d across the frequency axis. It then loops over frequencies, applying sir1d across the time axis. It returns the logical OR of these two masks.

Parameters:

basemask (np.ndarray[nfreq, nprod, ntime] of boolean type) – The previously generated threshold mask. 1 (True) for masked points, 0 (False) otherwise.
eta (float) – Aggressiveness of the method: with eta=0, no additional samples are flagged and the function returns basemask. With eta=1, all samples will be flagged.
only_freq (bool) – Only apply the SIR operator across the frequency axis.
only_time (bool) – Only apply the SIR operator across the time axis.

Returns:

mask – The mask after the application of the SIR operator.

Return type:

np.ndarray[nfreq, nprod, ntime] of boolean type
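
The loop structure described above can be sketched as follows. This is an illustration only, not draco's implementation: the function name `apply_1d_over_freq_and_time`, the `op1d` callable, and the error raised when both flags are set are assumptions made for the example. The stand-in operator `grow_by_one` is deliberately not SIR; it just makes the effect of each pass easy to see.

```python
import numpy as np


def apply_1d_over_freq_and_time(basemask, op1d, only_freq=False, only_time=False):
    """Apply a 1D mask operator along frequency and time, then OR the results.

    `op1d` is a hypothetical callable mapping a 1D boolean array to a 1D
    boolean array of the same length (the role sir1d plays in the text).
    """
    if only_freq and only_time:
        # Assumption: treat the contradictory combination as an error.
        raise ValueError("only_freq and only_time are mutually exclusive")

    basemask = np.asarray(basemask, dtype=bool)
    nfreq, nprod, ntime = basemask.shape
    mask = np.zeros_like(basemask)

    for pp in range(nprod):
        if not only_time:
            # Loop over times, applying the operator across frequency.
            for tt in range(ntime):
                mask[:, pp, tt] |= op1d(basemask[:, pp, tt])
        if not only_freq:
            # Loop over frequencies, applying the operator across time.
            for ff in range(nfreq):
                mask[ff, pp, :] |= op1d(basemask[ff, pp, :])

    return mask


def grow_by_one(flags):
    """Stand-in 1D operator (not SIR): also flag direct neighbours."""
    out = flags.copy()
    out[1:] |= flags[:-1]
    out[:-1] |= flags[1:]
    return out


base = np.zeros((5, 2, 4), dtype=bool)  # [nfreq, nprod, ntime]
base[2, 0, 1] = True

both = apply_1d_over_freq_and_time(base, grow_by_one)
freq_only = apply_1d_over_freq_and_time(base, grow_by_one, only_freq=True)
```

With a single flagged sample, `both` grows along both axes of product 0, while `freq_only` grows only along the frequency axis; product 1 is left untouched in either case.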

draco.util.rfi.sir1d(basemask, eta=0.2)[source]

Numpy implementation of the scale-invariant rank (SIR) operator.

For more information, see arXiv:1201.3364v2.

Parameters:

basemask (numpy 1D array of boolean type) – Array with the threshold mask previously generated. 1 (True) for flagged points, 0 (False) otherwise.
eta (float) – Aggressiveness of the method: with eta=0, no additional samples are flagged and the function returns basemask. With eta=1, all samples will be flagged. The authors of arXiv:1201.3364v2 suggest that 0.2 is close to a universally optimal value, but it has not been optimized on CHIME data.

Returns:

mask – The mask after the application of the SIR operator. Same shape and type as basemask.

Return type:

numpy 1D array of boolean type
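
To make the role of eta concrete, here is a minimal sketch of a linear-time SIR operator. It is written from the definition that a sample is flagged if it lies in any contiguous window whose flagged fraction is at least 1 - eta. The name `sir1d_sketch` is hypothetical and the code is not taken from draco, so details may differ from the library's implementation.

```python
import numpy as np


def sir1d_sketch(basemask, eta=0.2):
    """Illustrative SIR operator on a 1D boolean mask (not draco's code)."""
    basemask = np.asarray(basemask, dtype=bool)

    # Weight each sample: eta if flagged, eta - 1 if not. A window's weights
    # sum to >= 0 exactly when its flagged fraction is >= 1 - eta.
    psi = basemask.astype(np.float64) - 1.0 + eta

    # M[k] is the sum of the first k weights, so a window [a, b) sums to
    # M[b] - M[a].
    M = np.concatenate(([0.0], np.cumsum(psi)))

    # Best window containing sample i: smallest M[a] with a <= i and
    # largest M[b] with b > i.
    min_left = np.minimum.accumulate(M)[:-1]
    max_right = np.maximum.accumulate(M[::-1])[::-1][1:]

    return (max_right - min_left) >= 0.0


base = np.zeros(16, dtype=bool)
base[[3, 4, 5, 7, 8, 9]] = True

# eta = 0.25 keeps the arithmetic exact in floating point for this demo.
extended = sir1d_sketch(base, eta=0.25)
```

In this example the one-sample gap at index 6 is filled and the flagged region grows by one sample on each side (indices 2 and 10). Setting eta=0 returns the input mask unchanged, and eta=1 flags every sample, matching the parameter description above.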

draco.util.rfi.sumthreshold(data, max_m=16, start_flag=None, threshold1=None, remove_median=True, correct_for_missing=True, variance=None, rho=None, axes=None, only_positive=False)

SumThreshold outlier detection algorithm.

See https://andreoffringa.org/pdfs/SumThreshold-technical-report.pdf for description of the algorithm.

Parameters:

data (np.ndarray[:, :]) – The data to flag.
max_m (int, optional) – Maximum window size m to expand to.
start_flag (np.ndarray[:, :], optional) – A boolean array of the initially flagged data.
threshold1 (float, optional) – Initial threshold. By default, the 95th percentile is used.
remove_median (bool, optional) – Subtract the median of the full 2D dataset. Default is True.
correct_for_missing (bool, optional) – Correct for missing counts.
variance (np.ndarray[:, :], optional) – Estimate of the uncertainty on each data point. If provided, then correct_for_missing=True should be set and threshold1 should be provided in units of “sigma”.
rho (float, optional) – Controls the dependence of the threshold on the window size m, specifically threshold = threshold1 / rho ** log2(m). If not provided, defaults to 1.5 when correct_for_missing is False and to 0.9428 when it is True. This is to maintain backward compatibility.
axes (tuple | int, optional) – Axes of data along which to calculate. Flagging is done in the order in which axes is provided. By default, loop over all axes in reverse order.
only_positive (bool, optional) – Only flag positive excursions; do not flag negative excursions.

Returns:

mask – Boolean array, with True entries marking outlier data.

Return type:

np.ndarray[:, :]
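
The dependence of the threshold on window size given for rho can be tabulated directly from the formula threshold = threshold1 / rho ** log2(m). The sketch below assumes window sizes double from 1 up to max_m, which the log2(m) term suggests but the text does not state. The helper name `threshold_schedule` is hypothetical.

```python
import math


def threshold_schedule(threshold1, rho, max_m=16):
    """Return {m: threshold} for m = 1, 2, 4, ..., up to max_m.

    Assumption: window sizes are powers of two; this only evaluates the
    documented formula and does not call draco.
    """
    schedule = {}
    m = 1
    while m <= max_m:
        schedule[m] = threshold1 / rho ** math.log2(m)
        m *= 2
    return schedule


# The two documented default values of rho, with threshold1 = 1 for scale.
sched_rho_1p5 = threshold_schedule(1.0, rho=1.5)
sched_rho_0p9428 = threshold_schedule(1.0, rho=0.9428)

for m in sched_rho_1p5:
    print(m, round(sched_rho_1p5[m], 4), round(sched_rho_0p9428[m], 4))
```

With rho=1.5 the threshold falls as the window grows, whereas rho=0.9428 is below one, so the threshold rises slowly with m.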

draco.util.rfi.sumthreshold_py(data, max_m=16, start_flag=None, threshold1=None, remove_median=True, correct_for_missing=True, variance=None, rho=None, axes=None, only_positive=False)[source]

SumThreshold outlier detection algorithm.

See https://andreoffringa.org/pdfs/SumThreshold-technical-report.pdf for description of the algorithm.

Parameters:

data (np.ndarray[:, :]) – The data to flag.
max_m (int, optional) – Maximum window size m to expand to.
start_flag (np.ndarray[:, :], optional) – A boolean array of the initially flagged data.
threshold1 (float, optional) – Initial threshold. By default, the 95th percentile is used.
remove_median (bool, optional) – Subtract the median of the full 2D dataset. Default is True.
correct_for_missing (bool, optional) – Correct for missing counts.
variance (np.ndarray[:, :], optional) – Estimate of the uncertainty on each data point. If provided, then correct_for_missing=True should be set and threshold1 should be provided in units of “sigma”.
rho (float, optional) – Controls the dependence of the threshold on the window size m, specifically threshold = threshold1 / rho ** log2(m). If not provided, defaults to 1.5 when correct_for_missing is False and to 0.9428 when it is True. This is to maintain backward compatibility.
axes (tuple | int, optional) – Axes of data along which to calculate. Flagging is done in the order in which axes is provided. By default, loop over all axes in reverse order.
only_positive (bool, optional) – Only flag positive excursions; do not flag negative excursions.

Returns:

mask – Boolean array, with True entries marking outlier data.

Return type:

np.ndarray[:, :]
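
For intuition about what SumThreshold does, the following is a heavily simplified 1D sketch. It is one-sided (positive excursions only) and omits median removal, missing-sample correction, variance weighting, and multi-axis handling. The name `sumthreshold_1d_sketch`, the doubling of window sizes, and the choice to replace already-flagged samples by the current threshold are assumptions based on the general algorithm, not a description of draco's code.

```python
import numpy as np


def sumthreshold_1d_sketch(data, threshold1, max_m=16, rho=1.5):
    """Simplified one-sided SumThreshold on a 1D array (illustration only)."""
    data = np.asarray(data, dtype=np.float64)
    flag = np.zeros(data.shape, dtype=bool)

    m = 1
    while m <= max_m and m <= data.size:
        thresh = threshold1 / rho ** np.log2(m)

        # Samples flagged in earlier passes are replaced by the current
        # threshold so they cannot drag their neighbours over the limit.
        vals = np.where(flag, thresh, data)

        # Sum of every length-m window via a cumulative sum.
        csum = np.concatenate(([0.0], np.cumsum(vals)))
        window_sum = csum[m:] - csum[:-m]

        # Flag all samples of each window whose mean exceeds the threshold.
        new_flag = flag.copy()
        for start in np.flatnonzero(window_sum > m * thresh):
            new_flag[start : start + m] = True
        flag = new_flag

        m *= 2

    return flag


data = np.zeros(32)
data[10:14] = 0.8  # weak, broad excursion below the single-sample threshold

flags = sumthreshold_1d_sketch(data, threshold1=1.0)
single = sumthreshold_1d_sketch(data, threshold1=1.0, max_m=1)
```

Here no individual sample exceeds threshold1, so `single` flags nothing, but the m=2 pass catches the excursion because the threshold is lower for wider windows. This is the kind of broad, weak outlier that a plain per-sample threshold misses.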