draco.analysis.flagging

Tasks for flagging out bad or unwanted data.

This includes data quality flagging on timestream data; sun excision on sidereal data; and pre-map making flagging on m-modes.

The convention for flagging/masking is True for contaminated samples that should be excluded and False for clean samples.
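As a minimal sketch of this convention, assuming plain NumPy arrays rather than draco containers (the array names are illustrative only), a mask can be applied by zeroing the weights of flagged samples:

```python
import numpy as np

# Hypothetical weights for 2 frequencies x 3 time samples.
weight = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# True marks contaminated samples that should be excluded.
mask = np.array([[False, True, False], [False, False, True]])

# Zero the weight wherever the mask is set; clean samples keep their weight.
masked_weight = np.where(mask, 0.0, weight)
```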

Functions

complex_med(x, *args, **kwargs)

Complex median, done by applying to the real/imag parts individually.

destripe(x, w[, axis])

Subtract the median along a specified axis.

inverse_binom_cdf_prob(k, N, F)

Calculate the trial probability that gives the CDF.

mad(x, mask[, base_size, mad_size, debug, sigma])

Calculate the MAD of freq-time data.

medfilt(x, mask, size, *args)

Apply a moving median filter to masked data.

p_to_sigma(p)

Get the sigma exceeded by the tails of a Gaussian with probability p.

sigma_to_p(sigma)

Get the probability of an excursion larger than sigma for a Gaussian.

tv_channels_flag(x, freq[, sigma, f, debug])

Perform a higher sensitivity flagging for the TV stations.
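The relationship between p_to_sigma and sigma_to_p can be illustrated with the standard library alone. The sketch below assumes a two-sided tail convention, which is not confirmed by the summaries above; the function names are hypothetical stand-ins, not the module's implementation.

```python
import math
from statistics import NormalDist


def sigma_to_p_sketch(sigma):
    """Two-sided Gaussian tail probability of exceeding `sigma` (assumed convention)."""
    return math.erfc(sigma / math.sqrt(2.0))


def p_to_sigma_sketch(p):
    """Inverse of the above: the sigma whose two-sided tail probability is `p`."""
    return NormalDist().inv_cdf(1.0 - p / 2.0)


p = sigma_to_p_sketch(3.0)      # roughly 0.0027
sigma = p_to_sigma_sketch(p)    # recovers roughly 3.0
```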

Classes

ApplyBaselineMask()

Apply a distributed mask that varies across baselines.

ApplyGenericMask()

Apply a mask to a dataset with arbitrary axes.

ApplyLocalizedRFIMask()

Apply a localised (el-sensitive) RFI mask to the data by zeroing the weights.

ApplyRFIMask

alias of ApplyTimeFreqMask

ApplyTimeFreqMask()

Apply a time-frequency mask to the data.

BlendStack()

Mix a small amount of a stack into data to regularise RFI gaps.

CollapseBaselineMask()

Collapse a baseline-dependent mask along the baseline axis.

CombineMasks()

Combine an arbitrary number of masks (conservatively).

DayMask()

Crudely simulate a masking out of the daytime data.

FindBeamformedOutliers()

Identify beamformed visibilities that deviate from our expectation for noise.

MaskBadGains()

Get a mask of regions with bad gain.

MaskBaselines()

Mask out baselines from a dataset.

MaskBeamformedOutliers

alias of ApplyGenericMask

MaskBeamformedWeights()

Mask beamformed visibilities with anomalously large weights before stacking.

MaskData

alias of MaskMModeData

MaskFreq()

Make a mask for certain frequencies.

MaskMModeData()

Mask out mmode data ahead of map making.

RFIMask()

Crappy RFI masking.

RFIMaskChisqHighDelay()

Mask frequencies and times with anomalous chi-squared test statistic.

RFISensitivityMask()

Identify RFI as deviations in system sensitivity from expected radiometer noise.

RFIStokesIMask()

Two-stage RFI filter based on Stokes I visibilities.

RadiometerWeight()

Update vis_weight according to the radiometer equation.

ReduceMaskEl()

Reduce the 'el' axis from input classes and produce corresponding reduced output classes.

SanitizeWeights()

Flags weights outside of a valid range.

SiderealMaskConversion()

Convert the axis of an RFI mask from time to ra.

SmoothVisWeight()

Smooth the visibility weights with a median filter.

ThresholdVisWeightBaseline()

Form a mask corresponding to weights that are below some threshold.

ThresholdVisWeightFrequency()

Create a mask to remove all weights below a per-frequency threshold.

class draco.analysis.flagging.ApplyBaselineMask[source]

Bases: SingleTask

Apply a distributed mask that varies across baselines.

No broadcasting is done, so the data and mask should have the same axes. This shouldn’t be used for non-distributed time-freq masks.

This task may produce output with shared datasets. Be warned that this can produce unexpected outputs if not properly taken into account.

share

Which datasets should we share with the input. If “none” we create a full copy of the data, if “vis” or “map” we create a copy only of the modified weight dataset and the unmodified vis dataset is shared, if “all” we modify in place and return the input container.

Type:

{“all”, “none”, “vis”, “map”}

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data: TimeStream, mask: BaselineMask) TimeStream[source]
process(data: SiderealStream, mask: SiderealBaselineMask) SiderealStream

Flag data by zeroing the weights.

Parameters:
  • data – Data to apply mask to. Must have a stack axis

  • mask – A baseline-dependent mask

Returns:

The masked data. Masking is done in place.

Return type:

data

class draco.analysis.flagging.ApplyGenericMask[source]

Bases: SingleTask

Apply a mask to a dataset with arbitrary axes.

All of the mask axes must be present in the dataset, but the dataset can have additional axes.

Assumes that a sample marked True in the mask dataset should be flagged.

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data: ContainerBase, mask: ContainerBase)[source]

Apply the mask to the dataset weights.

Reorder the mask axes and add broadcasting axes if necessary.

Parameters:
  • data – Any container with a frequency axis.

  • mask – Any container whose axes are a subset of the axes in data

Returns:

The input container with the weight dataset set to zero for masked samples.

Return type:

data
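The axis reordering and broadcasting described above can be sketched with NumPy. This is an illustration under the assumption that axes are identified by name; align_mask and the axis lists are hypothetical and do not reproduce the task's code.

```python
import numpy as np


def align_mask(mask, mask_axes, data_axes):
    """Reorder `mask` to follow `data_axes` and add length-1 axes where needed."""
    missing = [ax for ax in mask_axes if ax not in data_axes]
    if missing:
        raise ValueError(f"Mask axes {missing} are not present in the data.")
    # Put the mask axes in the order in which they appear in the data.
    order = [mask_axes.index(ax) for ax in data_axes if ax in mask_axes]
    reordered = np.transpose(mask, order)
    # Insert a length-1 axis for every data axis that the mask lacks.
    sizes = iter(reordered.shape)
    shape = [next(sizes) if ax in mask_axes else 1 for ax in data_axes]
    return reordered.reshape(shape)


weight = np.ones((2, 3, 4))              # axes: freq, stack, ra
mask = np.zeros((4, 2), dtype=bool)      # axes: ra, freq
mask[1, 0] = True                        # flag ra index 1 at freq index 0

aligned = align_mask(mask, ["ra", "freq"], ["freq", "stack", "ra"])
weight = np.where(aligned, 0.0, weight)  # broadcasts over the stack axis
```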

class draco.analysis.flagging.ApplyLocalizedRFIMask[source]

Bases: SingleTask

Apply a localised (el-sensitive) RFI mask to the data by zeroing the weights.

This class extends the class ApplyTimeFreqMask to include el in addition to freq and ra, and can be further extended for a new RingMap class (freq,el,time). Note that while the ra and el axes of the tstream and mask datasets do not need to be identical, they must have overlapping regions. However, their freq axes must be identical.

share

Which datasets should we share with the input. If “none” we create a full copy of the data, if “map” we create a copy only of the modified weight dataset and the unmodified vis dataset is shared, if “all” we modify in place and return the input container.

Type:

{“all”, “none”, “map”}

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(tstream, rfimask)[source]

Apply the mask by zeroing the weights.

Parameters:
Returns:

tstream – The masked RingMap with weights modified in overlapping regions. Note that the masking is done in place.

Return type:

containers.RingMap

draco.analysis.flagging.ApplyRFIMask

alias of ApplyTimeFreqMask

class draco.analysis.flagging.ApplyTimeFreqMask[source]

Bases: SingleTask

Apply a time-frequency mask to the data.

Typically this is used to mask out all inputs at times and frequencies contaminated by RFI.

This task may produce output with shared datasets. Be warned that this can produce unexpected outputs if not properly taken into account.

share

Which datasets should we share with the input. If “none” we create a full copy of the data, if “vis” or “map” we create a copy only of the modified weight dataset and the unmodified vis dataset is shared, if “all” we modify in place and return the input container.

Type:

{“all”, “none”, “vis”, “map”}

collapse_pol

Take the logical OR of the mask along the polarisation axis prior to applying it to the data. In other words, mask a frequency and time in all polarisations if it was identified as contaminated in any polarisation.

Type:

bool

match_axes

If True (default), the rfimask and tstream must have identical time-like axis. Otherwise, the mask is applied only to the overlapping region of the time-like axis. Non-overlapping regions remain unchanged. Samples must still have the same RA or timestamp values in overlapping regions.

Type:

bool, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(tstream, rfimask)[source]

Apply the mask by zeroing the weights.

Parameters:
  • tstream (timestream or sidereal stream) – A timestream or sidereal stream like container. For example, containers.TimeStream, andata.CorrData or containers.SiderealStream.

  • rfimask (containers.RFIMask, containers.RFIMaskByPol, containers.SiderealRFIMask, or containers.SiderealRFIMaskByPol) – An RFI mask for the same period of time.

Returns:

tstream – The masked timestream. Note that the masking is done in place.

Return type:

timestream or sidereal stream
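The effect of collapse_pol can be sketched as follows, assuming a mask with axes (pol, freq, time) and weights with axes (freq, stack, time). The shapes and array names are illustrative assumptions, not the container layout guaranteed by draco.

```python
import numpy as np

# Hypothetical per-polarisation mask with axes (pol, freq, time).
mask_by_pol = np.zeros((2, 3, 4), dtype=bool)
mask_by_pol[0, 1, 2] = True   # contaminated in the first polarisation only

# Logical OR over the polarisation axis: flag if any polarisation is flagged.
collapsed = mask_by_pol.any(axis=0)   # axes (freq, time)

# Zero the weights (freq, stack, time) for every stack at flagged samples.
weight = np.ones((3, 5, 4))
weight = np.where(collapsed[:, np.newaxis, :], 0.0, weight)
```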

class draco.analysis.flagging.BlendStack[source]

Bases: SingleTask

Mix a small amount of a stack into data to regularise RFI gaps.

This is designed to mix in a small amount of a stack into a day of data (which will have RFI masked gaps) to attempt to regularise operations which struggle to deal with time variable masks, e.g. DelaySpectrumEstimator.

frac

The relative weight to give the stack in the average. This multiplies the weights already in the stack, and so it should be remembered that these may already be significantly higher than the single day weights.

Type:

float, optional

match_median

Estimate the median in the time/RA direction from the common samples and use this to match any quasi time-independent bias of the data (e.g. cross talk).

Type:

bool, optional

subtract

Rather than taking an average, instead subtract out the blending stack from the input data in the common samples to calculate the difference between them. The interpretation of frac is a scaling of the inverse variance of the stack to an inverse variance of a prior on the difference, e.g. a frac = 1e-4 means that we expect the standard deviation of the difference between the data and the stacked data to be 100x larger than the noise of the stacked data.

Type:

bool, optional

mask_freq

Maintain masking if a frequency is entirely flagged - i.e., even if blending data exists in those bands, do not blend.

Type:

bool, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data)[source]

Blend a small amount of the stack into the incoming data.

Parameters:

data (SiderealStream, RingMap, or HybridVisStream) – The data to be blended into. This is modified in place.

Returns:

data_blend – The modified data. This is the same object as the input, and it has been modified in place.

Return type:

SiderealStream, RingMap, or HybridVisStream

setup(data_stack)[source]

Set the stacked data.

Parameters:

data_stack (SiderealStream, RingMap, or HybridVisStream) – Data stack to blend.
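The role of frac in the averaging mode can be sketched as an inverse-variance weighted mean in which the stack weights are scaled down by frac. This is a simplified illustration on plain arrays; the blend function is hypothetical and omits match_median, subtract, and mask_freq.

```python
import numpy as np


def blend(data, w_data, stack, w_stack, frac=1e-4):
    """Inverse-variance weighted average with the stack weights scaled by `frac`."""
    w_scaled = frac * w_stack
    w_total = w_data + w_scaled
    # Avoid dividing by zero where neither input carries any weight.
    safe = np.where(w_total > 0, w_total, 1.0)
    blended = (w_data * data + w_scaled * stack) / safe
    return np.where(w_total > 0, blended, 0.0), w_total


data = np.array([1.0, 0.0, 3.0])       # the middle sample is an RFI gap
w_data = np.array([10.0, 0.0, 10.0])   # zero weight marks the gap
stack = np.array([1.5, 2.0, 2.5])
w_stack = np.array([1e3, 1e3, 1e3])

blended, w_blend = blend(data, w_data, stack, w_stack)
```

In the gap the result is the stack value carried with a small weight, while samples with good data are only slightly pulled towards the stack. For the subtract mode, the 100x figure quoted above follows from 1 / sqrt(1e-4) = 100.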

class draco.analysis.flagging.CollapseBaselineMask[source]

Bases: SingleTask

Collapse a baseline-dependent mask along the baseline axis.

The output is a frequency/time mask that is True for any freq/time sample for which any baseline is masked in the input mask.

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(baseline_mask: BaselineMask | SiderealBaselineMask) RFIMask | SiderealRFIMask[source]

Collapse input mask over baseline axis.

Parameters:

baseline_mask (BaselineMask or SiderealBaselineMask) – Input baseline-dependent mask

Returns:

mask_cont – Output baseline-independent mask.

Return type:

RFIMask or SiderealRFIMask

class draco.analysis.flagging.CombineMasks[source]

Bases: SingleTask

Combine an arbitrary number of masks (conservatively).

All of the given masks must be of the same type and that type must have a mask dataset. Any flagged value in any of the provided masks will be flagged in the output mask.

Assumes that a sample marked True is flagged.

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(masks: list[ContainerBase])[source]

Combine the given list of masks into a single mask.

Parameters:

masks – A list of containers that all have the same type. The type must have a mask dataset.

Returns:

A combined mask such that any flagged value in any of the input masks is flagged in the output mask.

Return type:

combined_mask
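The conservative combination amounts to a logical OR across all inputs. A minimal sketch on bare boolean arrays, assuming the mask datasets have already been extracted from their containers:

```python
import numpy as np

masks = [
    np.array([[True, False], [False, False]]),
    np.array([[False, False], [False, True]]),
    np.array([[False, False], [False, False]]),
]

# A sample is flagged in the output if it is flagged in any input mask.
combined = np.logical_or.reduce(masks)
```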

class draco.analysis.flagging.DayMask[source]

Bases: SingleTask

Crudely simulate a masking out of the daytime data.

start, end

Start and end of masked out region.

Type:

float

width

Use a smooth transition of given width between the fully masked and unmasked data. This is interior to the region marked by start and end.

Type:

float

zero_data

Zero the data in addition to modifying the noise weights (default is True).

Type:

bool, optional

remove_average

Estimate and remove the mean level from each visibility. This estimate does not use data from the masked region.

Type:

bool, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(sstream)[source]

Apply a day time mask.

Parameters:

sstream (containers.SiderealStream) – Unmasked sidereal stack.

Returns:

mstream – Masked sidereal stream.

Return type:

containers.SiderealStream
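The interaction of start, end, and width can be sketched as a multiplicative taper. The raised-cosine shape used here is an assumption chosen for illustration (the actual transition function is not specified above), and day_mask_taper is a hypothetical helper. The values are in whatever units the RA axis uses.

```python
import numpy as np


def day_mask_taper(ra, start, end, width):
    """Weight factor: 1 outside [start, end], 0 deep inside, smooth near the edges."""
    ra = np.asarray(ra, dtype=float)
    # Distance inward from the nearest edge; negative outside the region.
    depth = np.minimum(ra - start, end - ra)
    # Ramp from 1 at the edge down to 0 at `width` inside the region.
    return 0.5 * (1.0 + np.cos(np.pi * np.clip(depth / width, 0.0, 1.0)))


ra = np.array([0.0, 90.0, 100.0, 180.0, 260.0, 270.0, 300.0])
taper = day_mask_taper(ra, start=90.0, end=270.0, width=20.0)
```

Because the ramp is measured inward from start and end, the transition stays interior to the masked region, as described for width.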

class draco.analysis.flagging.FindBeamformedOutliers[source]

Bases: SingleTask

Identify beamformed visibilities that deviate from our expectation for noise.

nsigma

Beamformed visibilities whose magnitude is greater than nsigma times the expected standard deviation of the noise, given by sqrt(1 / weight), will be masked.

Type:

float

window

If provided, the outlier mask will be extended to cover neighboring pixels. This list provides the number of pixels in each dimension that a single outlier will mask. Only supported for RingMap containers, where the list should be length 2 with [nra, nel], and FormedBeamHA containers, where the list should be length 1 with [nha,].

Type:

list of int

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data)[source]

Create a mask that indicates outlier beamformed visibilities.

Parameters:

data (FormedBeam, FormedBeamHA, or RingMap) – Beamformed visibilities.

Returns:

out – Container with a boolean mask where True indicates outlier beamformed visibilities.

Return type:

FormedBeamMask, FormedBeamHAMask, or RingMapMask
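The nsigma criterion can be sketched directly from its definition. The helper below is hypothetical, ignores the window option, and assumes that samples with zero weight are treated as already masked rather than as outliers.

```python
import numpy as np


def find_outliers(vis, weight, nsigma):
    """Flag samples whose magnitude exceeds nsigma * sqrt(1 / weight)."""
    valid = weight > 0
    # Expected noise standard deviation; left at zero where the weight is zero.
    sigma = np.zeros(weight.shape, dtype=float)
    sigma[valid] = 1.0 / np.sqrt(weight[valid])
    # Zero-weight samples are never reported as outliers (an assumption).
    return valid & (np.abs(vis) > nsigma * sigma)


vis = np.array([1.0 + 1.0j, 10.0 + 0.0j, 0.5j, 100.0 + 0.0j])
weight = np.array([1.0, 1.0, 4.0, 0.0])
outliers = find_outliers(vis, weight, nsigma=3.0)
```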

class draco.analysis.flagging.MaskBadGains[source]

Bases: SingleTask

Get a mask of regions with bad gain.

Assumes that bad gains are set to 1.

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data)[source]

Generate a time-freq mask.

Parameters:

data (andata.CorrData or container.ContainerBase with a gain dataset) – Data containing the gains to be flagged. Must have a gain dataset.

Returns:

mask – Time-freq mask

Return type:

RFIMask container

class draco.analysis.flagging.MaskBaselines[source]

Bases: SingleTask

Mask out baselines from a dataset.

This task may produce output with shared datasets. Be warned that this can produce unexpected outputs if not properly taken into account.

mask_long_ns

Mask out baselines longer than a given distance in the N/S direction.

Type:

float, optional

mask_short

Mask out baselines shorter than a given distance.

Type:

float, optional

mask_short_ew

Mask out baselines shorter than a given distance in the East-West direction. Useful for masking out intra-cylinder baselines for North-South oriented cylindrical telescopes.

Type:

float, optional

mask_short_ns

Mask out baselines shorter than a given distance in the North-South direction.

Type:

float, optional

missing_threshold

Mask any baseline that is missing more than this fraction of samples. This is measured relative to other baselines.

Type:

float, optional

zero_data

Zero the data in addition to modifying the noise weights (default is False).

Type:

bool, optional

share

Which datasets should we share with the input. If “none” we create a full copy of the data, if “vis” we create a copy only of the modified weight dataset and the unmodified vis dataset is shared, if “all” we modify in place and return the input container.

Type:

{“all”, “none”, “vis”}

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(ss)[source]

Apply the mask to data.

Parameters:

ss (SiderealStream or TimeStream) – Data to mask. Applied in place.

setup(telescope)[source]

Set the telescope model.

Parameters:

telescope (TransitTelescope) – The telescope object to use
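The length-based selections can be sketched on an array of baseline vectors. The (east-west, north-south) column layout and the baseline_mask helper are assumptions for illustration. In the task itself the baselines come from the telescope object, and missing_threshold is not covered here.

```python
import numpy as np


def baseline_mask(baselines, mask_short=None, mask_short_ew=None,
                  mask_short_ns=None, mask_long_ns=None):
    """Return True for baselines to exclude.

    `baselines` has shape (nbase, 2) holding (east-west, north-south) separations.
    """
    ew = np.abs(baselines[:, 0])
    ns = np.abs(baselines[:, 1])
    mask = np.zeros(len(baselines), dtype=bool)
    if mask_short is not None:
        mask |= np.hypot(ew, ns) < mask_short   # total length too short
    if mask_short_ew is not None:
        mask |= ew < mask_short_ew              # e.g. intra-cylinder baselines
    if mask_short_ns is not None:
        mask |= ns < mask_short_ns
    if mask_long_ns is not None:
        mask |= ns > mask_long_ns
    return mask


baselines = np.array([[0.0, 0.3], [0.0, 10.0], [22.0, 0.0], [22.0, 50.0]])
mask = baseline_mask(baselines, mask_short_ew=1.0, mask_long_ns=40.0)
```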

draco.analysis.flagging.MaskBeamformedOutliers

alias of ApplyGenericMask

class draco.analysis.flagging.MaskBeamformedWeights[source]

Bases: SingleTask

Mask beamformed visibilities with anomalously large weights before stacking.

nmed

Any weight that is more than nmed times the median weight over all objects and frequencies will be set to zero. Default is 8.0.

Type:

float

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data)[source]

Mask large weights.

Parameters:

data (FormedBeam) – Beamformed visibilities.

Returns:

data – The input container with the weight dataset set to zero if the weights exceed the threshold.

Return type:

FormedBeam

draco.analysis.flagging.MaskData

alias of MaskMModeData

class draco.analysis.flagging.MaskFreq[source]

Bases: SingleTask

Make a mask for certain frequencies.

bad_freq_ind

A list containing frequencies to flag out. Each entry can either be an integer giving an individual frequency index to remove, or a 2-tuple giving the start and end indices of a range to flag (as with a standard slice, the end is not included).

Type:

list, optional

factorize

Find the smallest factorizable mask of the time-frequency axis that covers all samples already flagged in the data.

Type:

bool, optional

all_time

Only include frequencies where all time samples are present.

Type:

bool, optional

mask_missing_data

Mask time-freq samples where some baselines (for visibility data) or polarisations/elevations (for ring map data) are missing.

Type:

bool, optional

freq_frac

Fully mask any frequency where the fraction of unflagged samples is less than this value. Default is None.

Type:

float, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data: VisContainer | RingMap) RFIMask | SiderealRFIMask[source]

Make the mask.

Parameters:

data – The data to mask.

Returns:

Frequency mask container

Return type:

mask_cont
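The format accepted by bad_freq_ind can be illustrated with a small helper that expands the list into a boolean frequency mask. The function name is hypothetical; only the interpretation of the entries follows the description above.

```python
import numpy as np


def freq_mask_from_indices(nfreq, bad_freq_ind):
    """Expand a list of indices and (start, end) ranges into a boolean mask."""
    mask = np.zeros(nfreq, dtype=bool)
    for entry in bad_freq_ind:
        if isinstance(entry, (int, np.integer)):
            mask[entry] = True           # single frequency index
        else:
            start, end = entry
            mask[start:end] = True       # end is excluded, as with a slice
    return mask


mask = freq_mask_from_indices(8, [1, (4, 6)])
```

Here index 1 and the range 4 to 5 are flagged; index 6 is left unflagged because the end of a range is exclusive.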

class draco.analysis.flagging.MaskMModeData[source]

Bases: SingleTask

Mask out mmode data ahead of map making.

auto_correlations

Exclude auto correlations if set (default=False).

Type:

bool

m_zero

Ignore the m=0 mode (default=False).

Type:

bool

positive_m

Include positive m-modes (default=True).

Type:

bool

negative_m

Include negative m-modes (default=True).

Type:

bool

mask_low_m

If set, mask out m’s lower than this threshold.

Type:

int, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(mmodes)[source]

Mask out unwanted data in the m-modes.

Parameters:

mmodes (containers.MModes) – Mmode container to mask

Returns:

mmodes – Same object as input with masking applied

Return type:

containers.MModes

class draco.analysis.flagging.RFIMask[source]

Bases: SingleTask

Crappy RFI masking.

sigma

The false positive rate of the flagger given as sigma value assuming the non-RFI samples are Gaussian.

Type:

float, optional

tv_fraction

Fraction of bad samples in a digital TV channel that cause the whole channel to be flagged.

Type:

float, optional

stack_ind

Which stack to process to derive flags for the whole dataset.

Type:

int

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(sstream: SiderealStream) SiderealRFIMask[source]
process(sstream: TimeStream) RFIMask

Derive an RFI mask from the data.

Parameters:

sstream – Unmasked sidereal or time stream visibility data.

Returns:

The derived RFI mask.

Return type:

mask

class draco.analysis.flagging.RFIMaskChisqHighDelay[source]

Bases: SingleTask

Mask frequencies and times with anomalous chi-squared test statistic.

flag_ew

If the input container has an east-west baseline axis, then this flag will be applied to the weights before collapsing over that axis.

Type:

array

reg_arpls

Smoothness regularisation used when estimating the baseline for flagging bad frequencies. Default is 1e5.

Type:

float

nsigma_1d

Mask any frequency where the median over unmasked time samples deviates from the baseline by more than this number of median absolute deviations. Default is 5.0.

Type:

float

win_t

Size of the window (in number of time samples) used to compute a median filtered version of the test statistic.

Type:

float

win_f

Size of the window (in number of frequency channels) used to compute a median filtered version of the test statistic.

Type:

float

nsigma_2d

Mask any frequency and time where the absolute deviation from the median filtered version is greater than this number of expected standard deviations given the number of degrees of freedom (i.e., number of baselines).

Type:

float

estimate_var

Estimate the variance in the test statistic using the median absolute deviation over a region defined by the win_t and win_f parameters.

Type:

bool

only_positive

Only mask large positive excursions in the test statistic, leaving large negative excursions unmasked.

Type:

bool

separate_pol

If true, construct a mask for each pol separately. If false, sum the chi-squared values over all polarisations and construct a single mask.

Type:

bool

mask_type

Algorithm to use to generate the mask.

Type:

{“mad”, “sumthreshold”}

niter

Number of iterations. At each iteration the baseline and standard deviation are re-estimated using the mask from the previous iteration.

Type:

int, optional

rho

Reduce the threshold by this factor at each iteration. A value of 1 will keep the threshold constant for all iterations.

Type:

float, optional

max_m

Maximum size of the SumThreshold window to use.

Type:

int, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

mask_1d(y, m)[source]

Mask frequency channels where median chi-squared deviates from neighbors.

Parameters:
  • y (np.ndarray[nfreq, ntime]) – Chi-squared per degree of freedom.

  • m (np.ndarray[nfreq, ntime]) – Boolean mask that indicates which samples to ignore when calculating the median over time.

Returns:

mask – Boolean mask that indicates frequency channels where the median chi-squared over time deviates significantly from that of the neighboring channels.

Return type:

np.ndarray[nfreq]

mask_2d(y, w)[source]

Mask frequencies and times where the chi-squared deviates from local median.

Parameters:
  • y (np.ndarray[nfreq, ntime]) – Chi-squared per degree of freedom.

  • w (np.ndarray[nfreq, ntime]) – Inverse variance of the chi-squared per degree of freedom, with zero indicating previously masked samples.

Returns:

mask – Boolean mask that indicates frequencies and times where chi-squared deviates significantly from the local median.

Return type:

np.ndarray[nfreq, ntime]

mask_2d_sumthreshold(y, w)[source]

Iterative application of sumthreshold algorithm to mask large chi-squared.

Parameters:
  • y (np.ndarray[nfreq, ntime]) – Chi-squared per degree of freedom.

  • w (np.ndarray[nfreq, ntime]) – Inverse variance of the chi-squared per degree of freedom, with zero indicating previously masked samples.

Returns:

mask – Boolean mask that indicates frequencies and times where chi-squared deviates significantly from the local median.

Return type:

np.ndarray[nfreq, ntime]

process(stream)[source]

Generate a mask from the data.

Parameters:

stream (dcontainers.TimeStream | dcontainers.SiderealStream | dcontainers.HybridVisStream | dcontainers.RingMap) – Container holding a chi-squared test statistic in the visibility dataset. A weighted average will be taken over any axis that is not time/ra or frequency.

Returns:

mask – Time-frequency mask, where values marked True are flagged.

Return type:

dcontainers.RFIMask | dcontainers.SiderealRFIMask | dcontainers.RFIMaskByPol | dcontainers.SiderealRFIMaskByPol

setup(telescope=None)[source]

Save telescope object for time calculations.

Only used to convert (LSD, RA) to unix time when masking sidereal streams. Not required when masking time streams.

Parameters:

telescope (TransitTelescope) – Telescope object used for time calculations.
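The idea behind mask_1d and nsigma_1d can be sketched as a MAD test on the per-channel median over time. This is a simplified stand-in: the task estimates a smooth baseline along frequency (controlled by reg_arpls), whereas the hypothetical mask_1d_sketch below uses a single median over all channels as the baseline.

```python
import numpy as np


def mask_1d_sketch(y, m, nsigma=5.0):
    """Flag channels whose time-median deviates from the baseline by > nsigma MADs."""
    # Median over unmasked time samples for each frequency channel.
    med_t = np.nanmedian(np.where(m, np.nan, y), axis=1)
    # Stand-in baseline: one median over frequency (an assumption).
    dev = med_t - np.nanmedian(med_t)
    # Median absolute deviation of the per-channel medians.
    mad = np.nanmedian(np.abs(dev))
    return np.abs(dev) > nsigma * mad


levels = np.array([0.99, 1.0, 1.01, 0.99, 5.0, 1.01, 0.99, 1.0, 1.01])
y = np.tile(levels[:, np.newaxis], (1, 6))   # shape (nfreq, ntime)
m = np.zeros(y.shape, dtype=bool)
y[0, 2] = 100.0                              # a transient that was already masked
m[0, 2] = True

bad_freq = mask_1d_sketch(y, m)
```

Only the channel with the elevated median is flagged; the previously masked transient in the first channel does not affect the result because it is excluded from the median over time.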

class draco.analysis.flagging.RFISensitivityMask[source]

Bases: SingleTask

Identify RFI as deviations in system sensitivity from expected radiometer noise.

mask_type

One of ‘mad’, ‘sumthreshold’ or ‘combine’. Default is combine, which uses the sumthreshold everywhere except around the transits of the sun and bright point sources, where it applies the MAD mask to avoid masking out the transits.

Type:

string, optional

include_pol

The list of polarisations to include. Default is to use all polarisations.

Type:

list of strings, optional

nsigma_1d

Construct a static mask by identifying any frequency channel whose quantile over time deviates from the median over frequency by more than this number of median absolute deviations. Default: 5.0

Type:

float, optional

quantile_1d

The quantile to use along time to construct the static mask. Default: 0.15

Type:

float, optional

win_f_1d

Number of frequency channels used to calculate a rolling median and median absolute deviation for the static mask. Default: 191

Type:

int, optional

nsigma

The final threshold for the MAD, TV, and SumThreshold algorithms given as number of standard deviations. Default: 5.0

Type:

float, optional

niter

Number of iterations. At each iteration the baseline and standard deviation are re-estimated using the mask from the previous iteration. Default: 5

Type:

int, optional

rho

Reduce the threshold by this factor at each iteration. A value of 1 will keep the threshold constant for all iterations. Default: 1.5

Type:

float, optional

base_size

The size of the region used to estimate the baseline, provided as (number of frequency channels, number of time samples). Default: (37, 181)

Type:

[int, int]

mad_size

The size of the region used to estimate the standard deviation, provided as (number of frequency channels, number of time samples). Default: (101, 31)

Type:

[int, int]

tv_fraction

Fraction of bad samples in a digital TV channel that cause the whole channel to be flagged. Default: 0.5

Type:

float, optional

max_m

Maximum size of the SumThreshold window to use. Default: 64

Type:

int, optional

sir

Apply scale invariant rank (SIR) operator on top of final mask. Default: False

Type:

bool, optional

eta

Aggressiveness of the SIR operator. With eta=0, no additional samples are flagged and with eta=1, all samples will be flagged. Default: 0.2

Type:

float, optional

only_time

Only apply the SIR operator along the time axis. Default: False

Type:

bool, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(sensitivity)[source]

Derive an RFI mask from sensitivity data.

Parameters:

sensitivity (containers.SystemSensitivity) – Sensitivity data.

Returns:

rfimask – RFI mask derived from sensitivity.

Return type:

containers.RFIMask

setup()[source]

Define the threshold as a function of iteration.
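One way to read the nsigma, niter, and rho descriptions together is a geometric schedule that starts high and ends at nsigma on the final iteration. The formula below is an assumption consistent with those descriptions, not a transcription of setup, and threshold_schedule is a hypothetical name.

```python
def threshold_schedule(nsigma=5.0, niter=5, rho=1.5):
    """Thresholds per iteration, reduced by `rho` each time and ending at `nsigma`."""
    return [nsigma * rho ** (niter - 1 - i) for i in range(niter)]


thresholds = threshold_schedule()
# -> [25.3125, 16.875, 11.25, 7.5, 5.0]
```

With rho = 1 the threshold stays at nsigma for every iteration.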

class draco.analysis.flagging.RFIStokesIMask[source]

Bases: ReduceVar

Two-stage RFI filter based on Stokes I visibilities.

Tries to independently target transient and persistent RFI.

Stage 1 is applied to each frequency independently. A high-pass filter is applied in RA to isolate transient RFI. The high-pass filtered visibilities are beamformed, and a MAD filter is applied to the resulting map. A time/RA sample is then flagged if some fraction of beams exceed the MAD threshold for that sample.

Stage 2 is applied across frequencies. A low-pass filter is applied in RA to reduce transient sky sources. The average visibility power is taken over 2+ cylinder separation baselines to obtain a single 1D array per frequency. These powers are gathered across all frequencies and a basic background subtraction is applied. The SumThreshold algorithm is then used for flagging, with a variance estimate used to boost the expected noise during the daytime and bright point source transits.

mad_base_size

Median absolute deviations base window. Default is [1, 101].

Type:

list of int, optional

mad_dev_size

Median absolute deviation median deviation window. Default is [1, 51].

Type:

list of int, optional

sigma_high

Median absolute deviations sigma threshold. Default is 8.0.

Type:

float, optional

sigma_low

Median absolute deviations low sigma threshold. A value above this threshold is masked only if it is either larger than sigma_high or it is larger than sigma_low AND connected to a region larger than sigma_high. Default is 2.0.

Type:

float, optional

frac_samples

Fraction of flagged samples in map space above which the entire time sample will be flagged. Default is 0.01.

Type:

float, optional

max_m

Maximum size of the SumThreshold window. Default is 64.

Type:

int, optional

nsigma

Initial threshold for SumThreshold. Default is 5.0.

Type:

float, optional

solar_var_boost

Variance boost during solar transit. Default is 1e4.

Type:

float, optional

bg_win_size

The size of the window used to estimate the background sky, provided as (number of frequency channels, number of time samples). Default is [11, 3].

Type:

list, optional

var_win_size

The size of the window used when estimating the variance, provided as (number of frequency channels, number of time samples). Default is [3, 31].

Type:

list, optional

lowpass_cutoff

Angular cutoff of the ra lowpass filter. Default is 7.5, which corresponds to about 30 minutes of observation time.

Type:

float, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

static apply_filter(vis, weight, samples, fcut, type_='high')[source]

Apply a high-pass or low-pass mmode filter.

mask_multi_channel(power, mask, times)[source]

Mask slow-moving narrow-band RFI.

mask_single_channel(vis, weight, mask, freq, baselines, ra)[source]

Mask scattered RFI.

process(stream)[source]

Make a mask from the data.

Parameters:

stream (dcontainers.TimeStream | dcontainers.SiderealStream) – Data to use when masking. Axes should be frequency, stack, and time-like.

Returns:

  • mask (dcontainers.RFIMask | dcontainers.SiderealRFIMask) – Time-frequency mask, where values marked True are flagged.

  • power (dcontainers.TimeStream | dcontainers.SiderealStream) – Time-frequency power metric used in second-stage flagging.

setup(telescope)[source]

Set up the baseline selections and ordering.

Parameters:

telescope (TransitTelescope) – The telescope object to use
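The sigma_high / sigma_low rule described above is a form of hysteresis thresholding. A one-dimensional sketch, assuming the input is already expressed in units of the estimated standard deviation (hysteresis_1d is a hypothetical helper; the task operates on maps):

```python
import numpy as np


def hysteresis_1d(x, low, high):
    """Flag runs of samples above `low` that contain at least one sample above `high`."""
    above_low = x > low
    above_high = x > high
    out = np.zeros(x.shape, dtype=bool)
    # Locate the start (inclusive) and stop (exclusive) of each run above `low`.
    padded = np.concatenate(([False], above_low, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    for start, stop in zip(edges[::2], edges[1::2]):
        # Keep the whole run only if it is connected to a high excursion.
        if above_high[start:stop].any():
            out[start:stop] = True
    return out


x = np.array([0.0, 3.0, 9.0, 3.0, 0.0, 3.0, 3.0, 0.0])
flags = hysteresis_1d(x, low=2.0, high=8.0)
```

The first run is flagged in full because it contains a sample above the high threshold; the second run exceeds only the low threshold and is left unflagged.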

class draco.analysis.flagging.RadiometerWeight[source]

Bases: SingleTask

Update vis_weight according to the radiometer equation.

\[\text{weight}_{ij} = \frac{N_\text{samp}}{V_{ii} V_{jj}}\]

replace

Replace any existing weights (default). If False then we multiply the existing weights by the radiometer values.

Type:

bool, optional

process(stream)[source]

Change the vis weight.

Parameters:

stream (SiderealStream or TimeStream) – Data to be weighted. This is done in place.

Returns:

stream

Return type:

SiderealStream or TimeStream
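
As an illustration of the radiometer equation above, the following is a minimal NumPy sketch, not the task's implementation. The function name `radiometer_weight`, the array layout and the choice to leave the weight at zero where an autocorrelation is non-positive are all assumptions made for this example.

```python
import numpy as np


def radiometer_weight(autos, n_samp, prod_pairs):
    """Hypothetical helper: weight_ij = N_samp / (V_ii * V_jj).

    autos      : real array (n_input, n_time) holding the autocorrelations V_ii
    n_samp     : number of samples per integration (scalar)
    prod_pairs : list of (i, j) input index pairs, one per product
    """
    weight = np.zeros((len(prod_pairs), autos.shape[-1]))
    for p, (i, j) in enumerate(prod_pairs):
        denom = autos[i] * autos[j]
        # Assumption: keep the weight at zero where an autocorrelation is missing.
        np.divide(n_samp, denom, out=weight[p], where=denom > 0)
    return weight


autos = np.array([[2.0, 4.0], [1.0, 0.0]])
w = radiometer_weight(autos, n_samp=8.0, prod_pairs=[(0, 0), (0, 1), (1, 1)])
```

With replace set to False, the documented behaviour would correspond to multiplying the existing weights by `w` rather than overwriting them.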

class draco.analysis.flagging.ReduceMaskEl[source]

Bases: SingleTask

Reduce the ‘el’ axis from input classes and produce corresponding reduced output classes.

Reduction algorithm: If the number of True values in the mask along the el axis is higher than a given threshold, set the mask to True.

threshold

The minimum number of detected RFI events along the el axis required for a data point to be included in the reduced mask. Default is 1.

Type:

int

process(rfimask)[source]

Produce an RFI mask.

Parameters:

rfimask (containers.LocalizedRFIMask(freq, el, time) or containers.LocalizedSiderealRFIMask(freq, ra, el)) – El-specific RFI mask indicating channels that are free from RFI events.

Returns:

out – Non el-specific RFI mask indicating channels that are free from RFI events.

Return type:

containers.RFIMask(freq, time) or containers.SiderealRFIMask(freq, ra)
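
The reduction can be sketched with plain NumPy as follows. This is an illustration only: `reduce_mask_el` is a hypothetical name, and the comparison uses "at least threshold", following the description of threshold as a minimum count. The exact comparison used by the task is an assumption here.

```python
import numpy as np


def reduce_mask_el(mask, threshold=1, el_axis=1):
    """Hypothetical helper: collapse the el axis of a boolean mask."""
    # Count flagged (True) samples along el and compare with the threshold.
    # Assumption: a count equal to the threshold is enough to flag.
    return np.sum(mask, axis=el_axis) >= threshold


mask = np.zeros((2, 3, 4), dtype=bool)  # axes: (freq, el, time)
mask[0, 0, 1] = True
mask[0, 2, 1] = True
mask[1, 1, 3] = True

reduced1 = reduce_mask_el(mask, threshold=1)
reduced2 = reduce_mask_el(mask, threshold=2)
```

Raising the threshold makes the reduced mask less aggressive: with threshold=2 only the sample flagged at two elevations remains flagged.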

class draco.analysis.flagging.SanitizeWeights[source]

Bases: SingleTask

Flags weights outside of a valid range.

Flags any weights above a max threshold and below a minimum threshold. Baseline dependent, so only some baselines may be flagged.

max_thresh

Largest value to keep.

Type:

float

min_thresh

Smallest value to keep.

Type:

float

process(data)[source]

Mask any weights outside of the threshold range.

Parameters:

data (andata.CorrData or containers.VisContainer object) – Data containing the weights to be flagged

Returns:

data – Data object with high/low weights masked in-place

Return type:

same object as data

setup()[source]

Validate the max and min values.

Raises:

ValueError – if min_thresh is larger than max_thresh
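
A minimal sketch of the range check, assuming that "flagging" a weight means setting it to zero in place. The helper name `sanitize_weights` and the example thresholds are hypothetical and are not taken from the task.

```python
import numpy as np


def sanitize_weights(weight, min_thresh, max_thresh):
    """Hypothetical helper: zero any weights outside [min_thresh, max_thresh]."""
    if min_thresh > max_thresh:
        raise ValueError("min_thresh must not be larger than max_thresh")
    bad = (weight < min_thresh) | (weight > max_thresh)
    # Assumption: a flagged sample is represented by a zero weight.
    weight[bad] = 0.0
    return weight


w = np.array([[1e-9, 1.0], [5.0, 1e12]])  # axes: (baseline, time)
sanitize_weights(w, min_thresh=1e-7, max_thresh=1e8)
```

Because the comparison is element-wise, only the offending samples on each baseline are affected, which matches the baseline-dependent behaviour described above.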

class draco.analysis.flagging.SiderealMaskConversion[source]

Bases: SingleTask

Convert the axis of an RFI mask from time to ra.

The conversion is performed by mapping values between Unix time and LSA using the geographic location of the telescope, as provided by the Observer object.

spread_size

The number of cells to flag before and after a detected true value. This ensures conservative flagging, preventing missed detections due to axis alignment issues. Default is 1.

Type:

int

npix

The number of pixels used to cover the full RA range from 0 to 360 degrees. Default is 4096.

Type:

int

process(rfimask)[source]

Produce an RFI mask.

Parameters:

rfimask (containers.LocalizedRFIMask) – Container for holding a mask indicating channels that are free from RFI events. Its axes are freq, el, and time.

Returns:

out – Boolean mask that can be applied to a ringmap with the task ApplyLocalizedRFIMask to mask contaminated samples. Its axes are freq, ra, and el.

Return type:

containers.LocalizedSiderealRFIMask

setup(manager)[source]

Set the local observer's position.

Parameters:

manager (Observer) – An Observer object holding the geographic location of the telescope. Note that TransitTelescope instances are also Observers.
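
The effect of spread_size can be sketched as a simple dilation of a boolean mask along one axis. This is an illustration under stated assumptions: `spread_mask` is a hypothetical helper, it does not wrap around at the ends of the axis, and it does not perform the Unix time to LSA mapping itself.

```python
import numpy as np


def spread_mask(mask, spread_size=1, axis=-1):
    """Hypothetical helper: flag spread_size cells either side of each True value."""
    mask = np.moveaxis(np.asarray(mask, dtype=bool), axis, -1)
    out = mask.copy()
    for shift in range(1, spread_size + 1):
        out[..., shift:] |= mask[..., :-shift]  # spread towards later cells
        out[..., :-shift] |= mask[..., shift:]  # spread towards earlier cells
    return np.moveaxis(out, -1, axis)


m = np.array([False, False, True, False, False, False])
s1 = spread_mask(m, spread_size=1)
s2 = spread_mask(m, spread_size=2)
```

A single detection therefore flags 2 * spread_size + 1 cells, which guards against a flagged time sample falling between two RA pixels.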

class draco.analysis.flagging.SmoothVisWeight[source]

Bases: SingleTask

Smooth the visibility weights with a median filter.

This is done in-place.

kernel_size

Size of the kernel for the median filter in time points. Default is 31, corresponding to a window of about 5 minutes for data with a 10 s cadence.

Type:

int, optional

mask_zeros

Mask out zero-weight entries when taking the moving weighted median.

Type:

bool, optional

process(data: TimeStream) → TimeStream[source]

Smooth the weights with a median filter.

Parameters:

data – Data containing the weights to be smoothed

Returns:

data – Data object containing the same data as the input, but with the weights replaced by the smoothed ones.

Return type:

TimeStream
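
A moving median along the time axis can be sketched with NumPy alone. This is not the task's implementation: `smooth_weights` is a hypothetical helper, the window is truncated at the edges, kernel_size is assumed to be odd, and keeping zero-weight samples at zero after smoothing is a choice made for this example.

```python
import warnings

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def smooth_weights(weight, kernel_size=31, mask_zeros=True):
    """Hypothetical helper: moving median along the last (time) axis."""
    w = np.array(weight, dtype=float)
    zero = w == 0.0
    if mask_zeros:
        w[zero] = np.nan  # NaNs are skipped by nanmedian
    half = kernel_size // 2
    pad = [(0, 0)] * (w.ndim - 1) + [(half, half)]
    padded = np.pad(w, pad, constant_values=np.nan)
    windows = sliding_window_view(padded, kernel_size, axis=-1)
    with warnings.catch_warnings():
        # Windows that contain only NaNs trigger a RuntimeWarning.
        warnings.simplefilter("ignore", RuntimeWarning)
        smooth = np.nanmedian(windows, axis=-1)
    smooth = np.nan_to_num(smooth, nan=0.0)
    if mask_zeros:
        smooth[zero] = 0.0  # assumption: flagged samples stay flagged
    return smooth


w_in = np.array([1.0, 1.0, 100.0, 1.0, 0.0, 1.0, 1.0])
w_out = smooth_weights(w_in, kernel_size=3)
```

In this example the isolated spike of 100 is replaced by the local median, while the zero-weight sample is excluded from its neighbours' windows.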

class draco.analysis.flagging.ThresholdVisWeightBaseline[source]

Bases: SingleTask

Form a mask corresponding to weights that are below some threshold.

The threshold is determined as maximum(absolute_threshold, relative_threshold * average(weight)) and is evaluated per product/stack entry. The user can specify whether to use a mean or median as the average, but note that the mean is much more likely to be biased by anomalously high- or low-weight samples (both of which are present in raw CHIME data). The user can also specify that weights below some threshold should not be considered when taking the average and constructing the mask (the default is to only ignore zero-weight samples).

The task outputs a BaselineMask or SiderealBaselineMask depending on the input container.

Parameters:
  • average_type (string, optional) – Type of average to use (“median” or “mean”). Default: “median”.

  • absolute_threshold (float, optional) – Any weights with values less than this number will be set to zero. Default: 1e-7.

  • relative_threshold (float, optional) – Any weights with values less than this number times the average weight will be set to zero. Default: 1e-6.

  • ignore_absolute_threshold (float, optional) – Any weights with values less than this number will be ignored when taking averages and constructing the mask. Default: 0.0.

  • pols_to_flag (string, optional) – Which polarizations to flag. “copol” only flags XX and YY baselines, while “all” flags everything. Default: “all”.

process(stream) → BaselineMask | SiderealBaselineMask[source]

Construct baseline-dependent mask.

Parameters:

stream (.core.container with weight attribute) – Input container whose weights are used to construct the mask.

Returns:

out – The output baseline-dependent mask.

Return type:

BaselineMask or SiderealBaselineMask

setup(telescope)[source]

Set the telescope model.

Parameters:

telescope (TransitTelescope) – The telescope object to use
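
The threshold rule maximum(absolute_threshold, relative_threshold * average(weight)) can be sketched as below. The helper name `baseline_weight_mask`, the (freq, baseline, time) layout, and the choice to average over frequency and time for each baseline are assumptions for illustration. Polarization selection (pols_to_flag) is left out.

```python
import numpy as np


def baseline_weight_mask(weight, average_type="median",
                         absolute_threshold=1e-7, relative_threshold=1e-6,
                         ignore_absolute_threshold=0.0):
    """Hypothetical helper. weight: (freq, baseline, time); True = flagged."""
    avg_func = {"median": np.nanmedian, "mean": np.nanmean}[average_type]
    # Samples at or below the ignore threshold do not enter the average
    # and are not flagged by this mask.
    valid = weight > ignore_absolute_threshold
    w = np.where(valid, weight, np.nan)
    # Assumption: one average per baseline, taken over frequency and time.
    avg = avg_func(w, axis=(0, 2), keepdims=True)
    threshold = np.maximum(absolute_threshold, relative_threshold * avg)
    return valid & (weight < threshold)


weight = np.array([[[1.0, 1.0, 1.0, 1e-9],
                    [0.0, 2.0, 2.0, 2.0]]])
mask = baseline_weight_mask(weight, relative_threshold=0.5)
```

The relative threshold of 0.5 is exaggerated here so that the small example flags something. The documented default is 1e-6.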

class draco.analysis.flagging.ThresholdVisWeightFrequency[source]

Bases: SingleTask

Create a mask to remove all weights below a per-frequency threshold.

A single relative threshold is set for each frequency along with an absolute minimum weight threshold. Masking is done relative to the mean baseline.

Parameters:
  • absolute_threshold (float) – Any weights with values less than this number will be set to zero.

  • relative_threshold (float) – Any weights with values less than this number times the average weight will be set to zero.

process(stream)[source]

Make a baseline-independent mask.

Parameters:

stream (.core.container with weight attribute) – Container to mask

Returns:

out – RFIMask container with mask set

Return type:

RFIMask container

draco.analysis.flagging.complex_med(x, *args, **kwargs)[source]

Complex median, done by applying to the real/imag parts individually.

Parameters:
  • x (np.ndarray) – Array to apply to.

  • *args (list, dict) – Passed straight through to np.nanmedian

  • **kwargs (list, dict) – Passed straight through to np.nanmedian

Returns:

m – Median.

Return type:

np.ndarray
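
For illustration, a complex median of this kind can be written directly with np.nanmedian. The name `complex_median` is used only for this sketch.

```python
import numpy as np


def complex_median(x, *args, **kwargs):
    """Sketch: median of the real and imaginary parts taken separately."""
    real = np.nanmedian(x.real, *args, **kwargs)
    imag = np.nanmedian(x.imag, *args, **kwargs)
    return real + 1j * imag


x = np.array([1 + 5j, 2 + 1j, 9 + 3j])
m = complex_median(x)
```

Note that the result need not be one of the input samples: here the real part comes from the second element and the imaginary part from the third.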

draco.analysis.flagging.destripe(x, w, axis=1)[source]

Subtract the median along a specified axis.

Parameters:
  • x (np.ndarray) – Array to destripe.

  • w (np.ndarray) – Mask array for points to include (True) or ignore (False).

  • axis (int, optional) – Axis to apply destriping along.

Returns:

y – Destriped array.

Return type:

np.ndarray
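
A minimal sketch of masked destriping, assuming the median is taken over the included points only and then subtracted from every sample along the axis. `destripe_sketch` is a hypothetical name, not the library function.

```python
import numpy as np


def destripe_sketch(x, w, axis=1):
    """Sketch: subtract the median of the included (w == True) points."""
    # Hide excluded samples from the median by replacing them with NaN.
    included = np.where(w, x, np.nan)
    med = np.nanmedian(included, axis=axis, keepdims=True)
    return x - med


x = np.array([[1.0, 2.0, 103.0], [10.0, 10.0, 10.0]])
w = np.array([[True, True, False], [True, True, True]])
y = destripe_sketch(x, w, axis=1)
```

Excluding the outlier in the first row keeps it from biasing the median, so the clean samples end up centred on zero.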

draco.analysis.flagging.inverse_binom_cdf_prob(k, N, F)[source]

Calculate the trial probability that gives the CDF.

This finds the trial probability p for which the cumulative probability satisfies Pr(X <= k; N, p) = F.

Parameters:
  • k (int) – Maximum number of successes.

  • N (int) – Total number of trials.

  • F (float) – The cumulative probability for (k, N).

Returns:

p – The trial probability.

Return type:

float
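
To make the inversion concrete, here is a dependency-free sketch that solves Pr(X <= k; N, p) = F for p by bisection. It is an illustration rather than the library's method, and the names `binom_cdf` and `inverse_binom_cdf_prob_sketch` are hypothetical. A closed form via the inverse regularised incomplete beta function also exists. Bisection is used here only to keep the example in the standard library.

```python
from math import comb


def binom_cdf(k, N, p):
    """Pr(X <= k) for X ~ Binomial(N, p)."""
    return sum(comb(N, i) * p**i * (1 - p) ** (N - i) for i in range(k + 1))


def inverse_binom_cdf_prob_sketch(k, N, F, tol=1e-12):
    """Find p such that binom_cdf(k, N, p) == F, assuming 0 <= k < N."""
    # For k < N the CDF falls monotonically from 1 (p = 0) to 0 (p = 1).
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binom_cdf(k, N, mid) > F:
            lo = mid  # CDF still too large, so p must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)


p1 = inverse_binom_cdf_prob_sketch(0, 1, 0.3)   # CDF = 1 - p
p2 = inverse_binom_cdf_prob_sketch(1, 2, 0.75)  # CDF = 1 - p**2
```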

draco.analysis.flagging.mad(x, mask, base_size=(11, 3), mad_size=(21, 21), debug=False, sigma=True)[source]

Calculate the MAD of freq-time data.

Parameters:
  • x (np.ndarray) – Data to filter.

  • mask (np.ndarray) – Initial mask.

  • base_size (tuple) – Size of the window to use in (freq, time) when estimating the baseline.

  • mad_size (tuple) – Size of the window to use in (freq, time) when estimating the MAD.

  • debug (bool, optional) – If True, return deviation and mad arrays as well

  • sigma (bool, optional) – Rescale the output into units of Gaussian sigmas.

Returns:

mad – Size of deviation at each point in MAD units. This output may contain NaNs for regions of missing data.

Return type:

np.ndarray
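
The idea of expressing deviations in MAD units can be sketched in a simplified form. This example uses global medians over the whole array instead of the moving windows set by base_size and mad_size, so it is a simplification and not the function's algorithm. The name `mad_global` is hypothetical. The factor 1.4826 is the standard conversion from MAD to the standard deviation of a Gaussian.

```python
import numpy as np


def mad_global(x, mask, sigma=True):
    """Sketch: deviation from the median in MAD (or Gaussian sigma) units."""
    data = np.where(mask, np.nan, x)   # masked samples are ignored
    dev = data - np.nanmedian(data)    # deviation from the baseline
    mad = np.nanmedian(np.abs(dev))    # median absolute deviation
    if sigma:
        mad = mad * 1.4826             # rescale MAD to a Gaussian sigma
    return dev / mad


x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
no_mask = np.zeros(5, dtype=bool)
r_mad = mad_global(x, no_mask, sigma=False)
r_sigma = mad_global(x, no_mask, sigma=True)
```

Masked samples come out as NaN in this sketch, in line with the note above that the output may contain NaNs where data are missing.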

draco.analysis.flagging.medfilt(x, mask, size, *args)[source]

Apply a moving median filter to masked data.

The filter is applied by iterative filling, to work around the lack of an actual nanmedian implementation.

Parameters:
  • x (np.ndarray) – Data to filter.

  • mask (np.ndarray) – Mask of data to filter out.

  • size (tuple) – Size of the window in each dimension.

  • args (optional) – Additional arguments to pass to the moving weighted median

Returns:

y – The median-filtered data. Values at masked positions are undefined.

Return type:

np.ndarray

draco.analysis.flagging.p_to_sigma(p)[source]

Get the sigma exceeded by the tails of a Gaussian with probability p.

draco.analysis.flagging.sigma_to_p(sigma)[source]

Get the probability of an excursion larger than sigma for a Gaussian.
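
The two conversions are inverses of each other. The sketch below uses only the standard library and assumes a two-sided convention, that is, p = Pr(|X| > sigma) for a unit Gaussian. Both the convention and the `_sketch` function names are assumptions for this example.

```python
from math import erfc, sqrt
from statistics import NormalDist


def sigma_to_p_sketch(sigma):
    """Two-sided tail probability Pr(|X| > sigma) for a unit Gaussian."""
    return erfc(sigma / sqrt(2.0))


def p_to_sigma_sketch(p):
    """Sigma whose two tails together contain probability p (0 < p < 1)."""
    # Using the lower tail avoids the precision loss of 1 - p / 2 for small p.
    return -NormalDist().inv_cdf(p / 2.0)


p_one_sigma = sigma_to_p_sketch(1.0)
round_trip = p_to_sigma_sketch(sigma_to_p_sketch(3.0))
```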

draco.analysis.flagging.tv_channels_flag(x, freq, sigma=5, f=0.5, debug=False)[source]

Perform a higher sensitivity flagging for the TV stations.

This flags a whole TV station band if more than a fraction f of the samples within that band exceed a given threshold. The threshold is chosen so that the probability of a fraction f of the samples exceeding it by chance matches a fixed false positive rate (specified by sigma).

Parameters:
  • x (np.ndarray[freq, time]) – Deviations of data in sigma units.

  • freq (np.ndarray[freq]) – Frequency of samples in MHz.

  • sigma (float, optional) – The probability of a false positive given as a sigma of a Gaussian.

  • f (float, optional) – Fraction of bad samples within each TV channel above which the whole channel is flagged.

  • debug (bool, optional) – Returns (mask, fraction) instead to give extra debugging info.

Returns:

mask – Mask of the input data.

Return type:

np.ndarray[bool]
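
The band-level decision can be sketched as follows. This is a simplified illustration: the helper `flag_bands`, the band edges and the way the threshold is supplied directly (rather than derived from sigma and f) are all assumptions, and the fraction is evaluated separately for each time sample.

```python
import numpy as np


def flag_bands(x, freq, band_edges, threshold, f=0.5):
    """Hypothetical helper: flag a whole band where too many samples are bad.

    x          : (freq, time) deviations in sigma units
    freq       : (freq,) frequencies in MHz
    band_edges : list of (start, end) band limits in MHz (illustrative values)
    """
    mask = np.zeros(x.shape, dtype=bool)
    for start, end in band_edges:
        sel = (freq >= start) & (freq < end)
        if not sel.any():
            continue
        # Fraction of channels in the band above threshold, per time sample.
        frac = np.mean(x[sel] > threshold, axis=0)
        mask[sel] = frac > f  # broadcast the decision over the band
    return mask


x = np.array([[5.0, 5.0, 0.0],
              [5.0, 0.0, 0.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 9.0]])
freq = np.array([400.0, 401.0, 406.0, 407.0])
mask = flag_bands(x, freq, band_edges=[(398.0, 404.0), (404.0, 410.0)], threshold=3.0)
```

Only the first band at the first time sample is flagged. A band in which exactly half of the samples exceed the threshold is not flagged, because the comparison against f is strict.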