draco.analysis.flagging

Tasks for flagging out bad or unwanted data.

This includes data quality flagging on timestream data; sun excision on sidereal data; and pre-map making flagging on m-modes.

The convention for flagging/masking is True for contaminated samples that should be excluded and False for clean samples.

Functions

`complex_med`(x, *args, **kwargs)	Complex median, done by applying to the real/imag parts individually.
`destripe`(x, w[, axis])	Subtract the median along a specified axis.
`inverse_binom_cdf_prob`(k, N, F)	Calculate the trial probability that gives the CDF.
`mad`(x, mask[, base_size, mad_size, debug, sigma])	Calculate the MAD of freq-time data.
`medfilt`(x, mask, size, *args)	Apply a moving median filter to masked data.
`p_to_sigma`(p)	Get the sigma exceeded by the tails of a Gaussian with probability p.
`sigma_to_p`(sigma)	Get the probability of an excursion larger than sigma for a Gaussian.
`tv_channels_flag`(x, freq[, sigma, f, debug])	Perform a higher sensitivity flagging for the TV stations.
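
The relationship between `p_to_sigma` and `sigma_to_p` can be sketched with the standard library alone. The sketch below assumes a two-sided tail convention, which this page does not state; the function names are hypothetical stand-ins and not the draco implementations.

```python
import math
from statistics import NormalDist


def two_sided_p_from_sigma(sigma: float) -> float:
    """Probability that a standard normal sample satisfies |x| > sigma."""
    # erfc(sigma / sqrt(2)) is the two-sided Gaussian tail probability.
    return math.erfc(sigma / math.sqrt(2.0))


def sigma_from_two_sided_p(p: float) -> float:
    """Sigma whose two-sided Gaussian tail probability equals p."""
    # Each tail holds p / 2, so invert the CDF at 1 - p / 2.
    return NormalDist().inv_cdf(1.0 - p / 2.0)


p_one_sigma = two_sided_p_from_sigma(1.0)
sigma_back = sigma_from_two_sided_p(two_sided_p_from_sigma(3.0))
```

Under this convention the two functions are inverses of each other, so a threshold can be quoted either as a sigma value or as a false positive probability.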

Classes

`ApplyBaselineMask`()	Apply a distributed mask that varies across baselines.
`ApplyGenericMask`()	Apply a mask to a dataset with arbitrary axes.
`ApplyLocalizedRFIMask`()	Apply a localised (el-sensitive) RFI mask to the data by zeroing the weights.
`ApplyRFIMask`	alias of `ApplyTimeFreqMask`
`ApplyTaper`()	Apply a taper to a dataset with arbitrary axes.
`ApplyTimeFreqMask`()	Apply a time-frequency mask to the data.
`BlendStack`()	Mix a small amount of a stack into data to regularise RFI gaps.
`CollapseBaselineMask`()	Collapse a baseline-dependent mask along the baseline axis.
`CombineMasks`()	Combine an arbitrary number of masks conservatively (logical OR).
`CombineTapers`()	Combine an arbitrary number of tapers conservatively (multiply).
`DayMask`()	Crudely simulate a masking out of the daytime data.
`FindBeamformedOutliers`()	Identify beamformed visibilities that deviate from our expectation for noise.
`GeneralCombineMasks`()	Combine multiple masks using a user-specified logical expression.
`GeneralCombineTapers`()	Combine multiple taper functions using a user-defined expression.
`InterpolateRFIMaskNearest`()	Align the time axis of an RFI mask to a target data stream.
`MaskBadGains`()	Get a mask of regions with bad gain.
`MaskBaselines`()	Mask out baselines from a dataset.
`MaskBeamformedOutliers`	alias of `ApplyGenericMask`
`MaskBeamformedWeights`()	Mask beamformed visibilities with anomalously large weights before stacking.
`MaskData`	alias of `MaskMModeData`
`MaskFreq`()	Make a mask for certain frequencies.
`MaskFromTaper`()	Generate a binary mask from a taper.
`MaskMModeData`()	Mask out mmode data ahead of map making.
`RFIMask`()	Crappy RFI masking.
`RFIMaskChisqHighDelay`()	Mask frequencies and times with anomalous chi-squared test statistic.
`RFISensitivityMask`()	Identify RFI as deviations in system sensitivity from expected radiometer noise.
`RFIStokesIMask`()	Two-stage RFI filter based on Stokes I visibilities.
`RadiometerWeight`()	Update vis_weight according to the radiometer equation.
`ReduceMaskEl`()	Reduce the 'el' axis from input classes and produce corresponding reduced output classes.
`SanitizeWeights`()	Flags weights outside of a valid range.
`SiderealMaskConversion`()	Convert the axis of an RFI mask from time to ra.
`SmoothVisWeight`()	Smooth the visibility weights with a median filter.
`TaperDelayTransform`()	Apply a taper or mask to a DelayTransform container.
`ThresholdVisWeightBaseline`()	Form a mask corresponding to weights that are below some threshold.
`ThresholdVisWeightFrequency`()	Create a mask to remove all weights below a per-frequency threshold.

class draco.analysis.flagging.ApplyBaselineMask[source]

Bases: SingleTask

Apply a distributed mask that varies across baselines.

No broadcasting is done, so the data and mask should have the same axes. This shouldn’t be used for non-distributed time-freq masks.

This task may produce output with shared datasets. Be warned that this can produce unexpected outputs if not properly taken into account.

share

Which datasets to share with the input. If “none”, a full copy of the data is created; if “vis” or “map”, only the modified weight dataset is copied and the unmodified vis dataset is shared; if “all”, the data is modified in place and the input container is returned.

Type:: {“all”, “none”, “vis”, “map”}

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data: TimeStream, mask: BaselineMask) → TimeStream[source]

process(data: SiderealStream, mask: SiderealBaselineMask) → SiderealStream

Flag data by zeroing the weights.

Parameters:

data – Data to apply mask to. Must have a stack axis
mask – A baseline-dependent mask

Returns:

The masked data. Masking is done in place.

Return type:

data

class draco.analysis.flagging.ApplyGenericMask[source]

Bases: SingleTask

Apply a mask to a dataset with arbitrary axes.

All of the mask axes must be present in the dataset, but the dataset can have additional axes.

Assumes that a sample marked True in the mask dataset should be flagged.

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data: ContainerBase, mask: ContainerBase)[source]

Apply the mask to the dataset weights.

Reorder the mask axes and add broadcasting axes if necessary.

Parameters:

data – Any container with a frequency axis.
mask – Any container whose axes are a subset of the axes in data

Returns:

The input container with the weight dataset set to zero for masked samples.

Return type:

data
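
The axis reordering and broadcasting step can be illustrated with plain NumPy arrays. This is a minimal sketch, not the task's code: the axis names, shapes and variable names are assumptions chosen for illustration, and only the True-means-flagged convention comes from the text above.

```python
import numpy as np

# Hypothetical axis layouts: the mask axes are a subset of the data axes,
# but appear in a different order.
data_axes = ("freq", "stack", "time")
mask_axes = ("time", "freq")

weight = np.ones((4, 3, 5))
mask = np.zeros((5, 4), dtype=bool)
mask[2, 1] = True  # time index 2, freq index 1 is contaminated

# Reorder the mask axes so they follow the order used by the data.
order = sorted(range(len(mask_axes)), key=lambda i: data_axes.index(mask_axes[i]))
mask_sorted = mask.transpose(order)

# Insert length-1 axes wherever the data has an axis the mask lacks.
bcast_shape = tuple(
    weight.shape[i] if ax in mask_axes else 1 for i, ax in enumerate(data_axes)
)
mask_bcast = mask_sorted.reshape(bcast_shape)

# True in the mask means "flag this sample", so zero its weight.
weight_masked = np.where(mask_bcast, 0.0, weight)
```

Here the single flagged (freq, time) sample zeroes the weight for every entry along the extra `stack` axis.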

class draco.analysis.flagging.ApplyLocalizedRFIMask[source]

Bases: SingleTask

Apply a localised (el-sensitive) RFI mask to the data by zeroing the weights.

This class extends the class ApplyTimeFreqMask to include el in addition to freq and ra, and can be further extended for a new RingMap class (freq,el,time). Note that while the ra and el axes of the tstream and mask datasets do not need to be identical, they must have overlapping regions. However, their freq axes must be identical.

share

Which datasets to share with the input. If “none”, a full copy of the data is created; if “map”, only the modified weight dataset is copied and the unmodified vis dataset is shared; if “all”, the data is modified in place and the input container is returned.

Type:: {“all”, “none”, “map”}

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(tstream, rfimask)[source]

Apply the mask by zeroing the weights.

Parameters:

tstream (containers.RingMap) – A data container with axes (pol, freq, ra, el).
rfimask (containers.LocalizedSiderealRFIMask(freq, ra, el)) – An RFI mask whose freq, ra and el regions overlap those of tstream (a containers.RingMap).

Returns:

tstream – The masked RingMap with weights modified in overlapping regions. Note that the masking is done in place.

Return type:

containers.RingMap

draco.analysis.flagging.ApplyRFIMask: alias of ApplyTimeFreqMask

class draco.analysis.flagging.ApplyTaper[source]

Bases: SingleTask

Apply a taper to a dataset with arbitrary axes.

All of the taper axes must be present in the dataset, but the dataset can have additional axes.

update_weight

If set to True, the taper will be applied to the weight dataset using the standard equation for propagation of uncertainty.

Type:: bool

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data: ContainerBase, taper: ContainerBase)[source]

Apply the taper to the dataset weights.

Reorder the taper axes and add broadcasting axes if necessary.

Parameters:

data (containers.DataWeightContainer) – A container with data and weight properties. Both the data and weight must include a freq axis, and must contain all axes present in the taper.
taper (containers.ContainerBase) – Any container that has a taper property with a freq axis and whose other axes are a subset of those in the data.

Returns:

data – The input container, with the data property scaled by the taper, and optionally the weight scaled appropriately.

Return type:

containers.DataWeightContainer
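
As a rough illustration of what update_weight implies, the sketch below treats the weight as an inverse variance and applies the usual propagation rule Var(t·x) = t²·Var(x). The arrays are made up, and setting the weight to zero where the taper is zero is an assumption of this sketch rather than documented behaviour.

```python
import numpy as np

data = np.array([2.0, 4.0, 6.0, 8.0])
weight = np.array([1.0, 1.0, 4.0, 4.0])  # interpreted as inverse variance
taper = np.array([1.0, 0.5, 0.5, 0.0])

# Scale the data by the taper.
tapered = data * taper

# Var(t * x) = t**2 * Var(x), so the inverse variance is divided by t**2.
# Where the taper is zero the sample carries no information; this sketch
# assigns it zero weight instead of dividing by zero.
new_weight = np.zeros_like(weight)
nonzero = taper != 0
new_weight[nonzero] = weight[nonzero] / taper[nonzero] ** 2
```

Note that a taper value below one increases the weight, because the tapered sample has a smaller variance than the original.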

class draco.analysis.flagging.ApplyTimeFreqMask[source]

Bases: SingleTask

Apply a time-frequency mask to the data.

Typically this is used to mask out all inputs at times and frequencies contaminated by RFI.

This task may produce output with shared datasets. Be warned that this can produce unexpected outputs if not properly taken into account.

share

Which datasets to share with the input. If “none”, a full copy of the data is created; if “vis” or “map”, only the modified weight dataset is copied and the unmodified vis dataset is shared; if “all”, the data is modified in place and the input container is returned.

Type:: {“all”, “none”, “vis”, “map”}

collapse_pol

Take the logical OR of the mask along the polarisation axis prior to applying it to the data. In other words, mask a frequency and time in all polarisations if it was identified as contaminated in any polarisation.

Type:: bool

match_axes

If True (default), the rfimask and tstream must have identical time-like axes. Otherwise, the mask is applied only to the overlapping region of the time-like axis, and non-overlapping regions remain unchanged. Samples must still have the same RA or timestamp values in overlapping regions.

Type:: bool, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(tstream, rfimask)[source]

Apply the mask by zeroing the weights.

Parameters:

tstream (timestream or sidereal stream) – A timestream or sidereal stream like container. For example, containers.TimeStream, andata.CorrData or containers.SiderealStream.
rfimask (containers.RFIMask, containers.RFIMaskByPol, containers.SiderealRFIMask, or containers.SiderealRFIMaskByPol) – An RFI mask for the same period of time.

Returns:

tstream – The masked timestream. Note that the masking is done in place.

Return type:

timestream or sidereal stream
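
The effect of collapse_pol can be shown with a small NumPy example. The (pol, freq, time) axis order and the array shapes are assumptions made for this sketch; the real containers may be laid out differently.

```python
import numpy as np

# Hypothetical polarisation-dependent mask with axes (pol, freq, time).
mask = np.zeros((2, 3, 4), dtype=bool)
mask[0, 1, 2] = True  # flagged only in the first polarisation
mask[1, 0, 3] = True  # flagged only in the second polarisation

# Logical OR along the polarisation axis gives a (freq, time) mask.
collapsed = np.any(mask, axis=0)

# Apply it to every polarisation by zeroing the weights.
weight = np.ones((2, 3, 4))
weight[:, collapsed] = 0.0
```

After collapsing, a sample flagged in either polarisation is zeroed in both, which is the conservative choice described for this option.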

class draco.analysis.flagging.BlendStack[source]

Bases: SingleTask

Mix a small amount of a stack into data to regularise RFI gaps.

This is designed to mix in a small amount of a stack into a day of data (which will have RFI masked gaps) to attempt to regularise operations which struggle to deal with time variable masks, e.g. DelaySpectrumEstimator.

frac

The relative weight to give the stack in the average. This multiplies the weights already in the stack, and so it should be remembered that these may already be significantly higher than the single day weights.

Type:: float, optional

match_median

Estimate the median in the time/RA direction from the common samples and use this to match any quasi time-independent bias of the data (e.g. cross talk).

Type:: bool, optional

subtract

Rather than taking an average, instead subtract out the blending stack from the input data in the common samples to calculate the difference between them. The interpretation of frac is a scaling of the inverse variance of the stack to an inverse variance of a prior on the difference, e.g. a frac = 1e-4 means that we expect the standard deviation of the difference between the data and the stacked data to be 100x larger than the noise of the stacked data.

Type:: bool, optional

mask_freq

Maintain masking if a frequency is entirely flagged - i.e., even if blending data exists in those bands, do not blend.

Type:: bool, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data)[source]

Blend a small amount of the stack into the incoming data.

Parameters:: data (SiderealStream, RingMap, or HybridVisStream) – The data to be blended into. This is modified in place.
Returns:: data_blend – The modified data. This is the same object as the input, and it has been modified in place.
Return type:: SiderealStream, RingMap, or HybridVisStream

setup(data_stack)[source]

Set the stacked data.

Parameters:: data_stack (SiderealStream, RingMap, or HybridVisStream) – Data stack to blend
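
The role of frac in the averaging mode can be sketched as an inverse-variance weighted mean in which the stack weights are scaled down by frac. This is an illustrative reading of the description above with made-up numbers, not the task's implementation; it ignores match_median, subtract and mask_freq.

```python
import numpy as np

frac = 1e-4

data = np.array([1.0, 0.0, 3.0])
w_data = np.array([10.0, 0.0, 10.0])  # zero weight marks an RFI-masked gap
stack = np.array([1.2, 2.0, 2.8])
w_stack = np.array([1e3, 1e3, 1e3])

# The stack enters the average with its weights multiplied by frac.
w_prior = frac * w_stack
w_blend = w_data + w_prior
blend = (w_data * data + w_prior * stack) / w_blend

# Scaling an inverse variance by frac scales the standard deviation
# by 1 / sqrt(frac), which is 100 for frac = 1e-4.
prior_to_stack_std_ratio = 1.0 / np.sqrt(frac)
```

Where the data has weight, the blend stays close to the data; in the gap the blend falls back to the stack value, so downstream operations see a filled sample with a small weight.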

class draco.analysis.flagging.CollapseBaselineMask[source]

Bases: SingleTask

Collapse a baseline-dependent mask along the baseline axis.

The output is a frequency/time mask that is True for any freq/time sample for which any baseline is masked in the input mask.

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(baseline_mask: BaselineMask | SiderealBaselineMask) → RFIMask | SiderealRFIMask[source]

Collapse input mask over baseline axis.

Parameters:: baseline_mask (BaselineMask or SiderealBaselineMask) – Input baseline-dependent mask
Returns:: mask_cont – Output baseline-independent mask.
Return type:: RFIMask or SiderealRFIMask

class draco.analysis.flagging.CombineMasks[source]

Bases: GeneralCombineMasks

Combine an arbitrary number of masks conservatively (logical OR).

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(masks: list[ContainerBase])[source]

Construct the logical OR of all masks.

Parameters:: masks (list of containers.ContainerBase) – A list of containers with a mask dataset, all of the same type and shape.
Returns:: combined_mask – A new container of the same type containing the logical OR of all masks.
Return type:: containers.ContainerBase

class draco.analysis.flagging.CombineTapers[source]

Bases: GeneralCombineTapers

Combine an arbitrary number of tapers conservatively (multiply).

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(tapers: list[ContainerBase])[source]

Construct the product of all tapers.

Parameters:: tapers (list of containers.ContainerBase) – A list of containers with a taper dataset, all of the same type and shape.
Returns:: combined_taper – A new container of the same type containing the product of all tapers.
Return type:: containers.ContainerBase

class draco.analysis.flagging.DayMask[source]

Bases: SingleTask

Crudely simulate a masking out of the daytime data.

start, end

Start and end of masked out region.

Type:: float

width

Use a smooth transition of given width between the fully masked and unmasked data. This is interior to the region marked by start and end.

Type:: float

zero_data

Zero the data in addition to modifying the noise weights (default is True).

Type:: bool, optional

remove_average

Estimate and remove the mean level from each visibility. This estimate does not use data from the masked region.

Type:: bool, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(sstream)[source]

Apply a day time mask.

Parameters:: sstream (containers.SiderealStream) – Unmasked sidereal stack.
Returns:: mstream – Masked sidereal stream.
Return type:: containers.SiderealStream

class draco.analysis.flagging.FindBeamformedOutliers[source]

Bases: SingleTask

Identify beamformed visibilities that deviate from our expectation for noise.

nsigma

Beamformed visibilities whose magnitude is greater than nsigma times the expected standard deviation of the noise, given by sqrt(1 / weight), will be masked.

Type:: float

window

If provided, the outlier mask will be extended to cover neighboring pixels. This list provides the number of pixels in each dimension that a single outlier will mask. Only supported for RingMap containers, where the list should be length 2 with [nra, nel], and FormedBeamHA containers, where the list should be length 1 with [nha,].

Type:: list of int

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data)[source]

Create a mask that indicates outlier beamformed visibilities.

Parameters:: data (FormedBeam, FormedBeamHA, or RingMap) – Beamformed visibilities.
Returns:: out – Container with a boolean mask where True indicates outlier beamformed visibilities.
Return type:: FormedBeamMask, FormedBeamHAMask, or RingMapMask
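
The nsigma criterion can be written out directly. The sketch below uses made-up values and leaves zero-weight samples unflagged, which is an assumption of this example; the window extension is not shown.

```python
import numpy as np

nsigma = 3.0

vis = np.array([0.5 + 0.0j, 0.0 - 4.0j, 1.0 + 1.0j, 10.0 + 0.0j])
weight = np.array([1.0, 1.0, 4.0, 0.0])

# Expected noise standard deviation is sqrt(1 / weight). Samples with zero
# weight have no noise expectation, so this sketch does not flag them.
valid = weight > 0
sigma = np.zeros_like(weight)
sigma[valid] = 1.0 / np.sqrt(weight[valid])

# True marks an outlier, following the module-wide mask convention.
outlier = valid & (np.abs(vis) > nsigma * sigma)
```

Only the second sample exceeds its threshold: its magnitude is 4 against an expected standard deviation of 1.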

class draco.analysis.flagging.GeneralCombineMasks[source]

Bases: SingleTask

Combine multiple masks using a user-specified logical expression.

The input is a list of containers with mask datasets. Each mask is assigned a variable name (A, B, C, …, Z) in the order they appear. The logical combination is defined using a Python expression involving those variables.

For example, if masks = [m1, m2], then the expression “A & ~B” would keep values that are masked in m1 and not in m2.

expression

A Python expression combining the mask variables. Variables must be uppercase letters A, B, …, matching the order of the input masks. The expression must evaluate to a boolean array of the same shape.

Type:: str

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(masks: list[ContainerBase])[source]

Combine the given list of masks using the logical expression.

Parameters:: masks (list of containers.ContainerBase) – A list of containers with a mask dataset, all of the same type and shape.
Returns:: combined_mask – A new container of the same type with the result of the logical combination.
Return type:: containers.ContainerBase
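
The variable naming scheme can be sketched as follows. How the task evaluates the expression internally is not stated here, so the use of `eval` with an empty builtins namespace, and the helper name `combine_masks`, are assumptions of this example.

```python
import string

import numpy as np


def combine_masks(masks, expression):
    """Evaluate `expression` with masks bound to A, B, C, ... in order."""
    # zip stops at the shorter sequence, so only as many letters as
    # there are masks get defined.
    names = dict(zip(string.ascii_uppercase, masks))
    # Evaluate with no builtins available, only the mask variables.
    result = eval(expression, {"__builtins__": {}}, names)
    return np.asarray(result, dtype=bool)


m1 = np.array([True, True, False, False])
m2 = np.array([True, False, True, False])

a_not_b = combine_masks([m1, m2], "A & ~B")  # masked in m1 but not in m2
a_or_b = combine_masks([m1, m2], "A | B")    # the CombineMasks behaviour
```

The second call reproduces the conservative logical OR that CombineMasks provides as a special case.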

class draco.analysis.flagging.GeneralCombineTapers[source]

Bases: GeneralCombineMasks

Combine multiple taper functions using a user-defined expression.

This is a subclass of GeneralCombineMasks that operates on the taper dataset rather than mask. Each input taper is assigned a variable (A, B, C, …, Z) in the order they appear. The combination is defined by the expression property, which is evaluated using standard Python syntax.

For example, an expression like “A * B” multiplies two taper functions elementwise.

expression

A Python expression combining the taper datasets from each input container using variable names A, B, etc.

Type:: str

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

class draco.analysis.flagging.InterpolateRFIMaskNearest[source]

Bases: SingleTask

Align the time axis of an RFI mask to a target data stream.

This task adjusts the time axis of an RFI mask to match the time axis of a target dataset such as a TimeStream or SystemSensitivity. This is useful when the original mask and the target data stream do not have exactly matching time axes.

The alignment is performed using nearest-interpolation in time, and the RFI mask is expanded along the time axis based on its spread_size to ensure conservative flagging.

spread_size

Time spreading factor for conservative flagging. Each flagged time sample is expanded to neighboring target time values that fall within spread_size times the time resolution of the input mask. If the time axes of the input mask and target align exactly, spreading is automatically disabled by setting this to zero. Default is 1.0.

Type:: float

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(rfimask)[source]

Align the RFI mask’s time axis to match the target dataset.

Parameters:: rfimask (containers.LocalizedRFIMask or containers.RFIMask) – The original RFI mask to be realigned.
Returns:: out – The RFI mask with its time axis aligned to the reference time axis.
Return type:: containers.RFIMask or containers.LocalizedRFIMask

setup(tstream)[source]

Set the target time axis from the data container.

This sets the reference time axis to which the RFI mask will be aligned.

Parameters:: tstream (containers.TimeStream, SystemSensitivity, etc.) – A time-like data container that provides the target time axis.
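
One way to picture nearest-neighbour alignment with spreading is shown below for a one-dimensional mask. The function name, the use of the median sample spacing as the mask resolution, and the exact spreading rule are assumptions of this sketch, not the task's code.

```python
import numpy as np


def align_mask_nearest(mask_time, mask, target_time, spread_size=1.0):
    """Map a 1D time mask onto target_time, spreading flags conservatively."""
    # Time resolution of the input mask (assumed regular sampling).
    dt = np.median(np.diff(mask_time))

    # Nearest-neighbour lookup: each target time takes the flag of the
    # closest mask time. Fancy indexing returns a new array.
    nearest = np.abs(target_time[:, None] - mask_time[None, :]).argmin(axis=1)
    aligned = mask[nearest]

    # Spreading: also flag target times within spread_size * dt of any
    # flagged mask time.
    flagged_time = mask_time[mask]
    if flagged_time.size and spread_size > 0:
        dist = np.abs(target_time[:, None] - flagged_time[None, :])
        aligned |= (dist <= spread_size * dt).any(axis=1)
    return aligned


mask_time = np.array([0.0, 10.0, 20.0, 30.0])
mask = np.array([False, True, False, False])
target_time = np.array([-2.0, 3.0, 8.0, 13.0, 18.0, 23.0, 28.0])

spread = align_mask_nearest(mask_time, mask, target_time, spread_size=1.0)
no_spread = align_mask_nearest(mask_time, mask, target_time, spread_size=0.0)
```

With spreading enabled, the single flagged sample at t = 10 covers every target time within one mask sample of it; without spreading only the two nearest target times are flagged.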

class draco.analysis.flagging.MaskBadGains[source]

Bases: SingleTask

Get a mask of regions with bad gain.

Assumes that bad gains are set to 1.

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data)[source]

Generate a time-freq mask.

Parameters:: data (andata.CorrData or containers.ContainerBase with a gain dataset) – Data containing the gains to be flagged. Must have a gain dataset.
Returns:: mask – Time-freq mask
Return type:: RFIMask container

class draco.analysis.flagging.MaskBaselines[source]

Bases: SingleTask

Mask out baselines from a dataset.

This task may produce output with shared datasets. Be warned that this can produce unexpected outputs if not properly taken into account.

mask_long_ns

Mask out baselines longer than a given distance in the N/S direction.

Type:: float, optional

mask_short

Mask out baselines shorter than a given distance.

Type:: float, optional

mask_short_ew

Mask out baselines shorter than a given distance in the East-West direction. Useful for masking out intra-cylinder baselines for North-South oriented cylindrical telescopes.

Type:: float, optional

mask_short_ns

Mask out baselines shorter than a given distance in the North-South direction.

Type:: float, optional

missing_threshold

Mask any baseline that is missing more than this fraction of samples. This is measured relative to other baselines.

Type:: float, optional

zero_data

Zero the data in addition to modifying the noise weights (default is False).

Type:: bool, optional

share

Which datasets to share with the input. If “none”, a full copy of the data is created; if “vis”, only the modified weight dataset is copied and the unmodified vis dataset is shared; if “all”, the data is modified in place and the input container is returned.

Type:: {“all”, “none”, “vis”}

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(ss)[source]

Apply the mask to data.

Parameters:: ss (SiderealStream or TimeStream) – Data to mask. Applied in place.

setup(telescope)[source]

Set the telescope model.

Parameters:: telescope (TransitTelescope) – The telescope object to use

draco.analysis.flagging.MaskBeamformedOutliers: alias of ApplyGenericMask

class draco.analysis.flagging.MaskBeamformedWeights[source]

Bases: SingleTask

Mask beamformed visibilities with anomalously large weights before stacking.

nmed

Any weight that is more than nmed times the median weight over all objects and frequencies will be set to zero. Default is 8.0.

Type:: float

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data)[source]

Mask large weights.

Parameters:: data (FormedBeam) – Beamformed visibilities.
Returns:: data – The input container with the weight dataset set to zero if the weights exceed the threshold.
Return type:: FormedBeam
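
The nmed threshold amounts to a comparison against a scaled global median. The sketch below uses a small made-up (object, freq) weight array; whether the task excludes zero weights when computing the median is not stated here, so this example simply uses all samples.

```python
import numpy as np

nmed = 8.0

# Hypothetical weights with axes (object, freq).
weight = np.array([[1.0, 2.0, 3.0],
                   [2.0, 100.0, 2.0]])

# Median over all objects and frequencies.
med = np.median(weight)

# Zero any weight that is more than nmed times the median.
masked = np.where(weight > nmed * med, 0.0, weight)
```

The median here is 2, so the threshold is 16 and only the anomalous weight of 100 is set to zero.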

draco.analysis.flagging.MaskData: alias of MaskMModeData

class draco.analysis.flagging.MaskFreq[source]

Bases: SingleTask

Make a mask for certain frequencies.

bad_freq_ind

A list of frequency indices to flag. Each entry can be either an integer giving an individual frequency index to remove, or a 2-tuple giving the start and end indices of a range to flag (as with a standard slice, the end is not included).

Type:: list, optional

factorize

Find the smallest factorizable mask of the time-frequency axis that covers all samples already flagged in the data.

Type:: bool, optional

all_time

Only include frequencies where all time samples are present.

Type:: bool, optional

mask_missing_data

Mask time-freq samples where some baselines (for visibility data) or polarisations/elevations (for ring map data) are missing.

Type:: bool, optional

freq_frac

Fully mask any frequency where the fraction of unflagged samples is less than this value. Default is None.

Type:: float, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data: VisContainer | RingMap) → RFIMask | SiderealRFIMask[source]

Make the mask.

Parameters:: data – The data to mask.
Returns:: Frequency mask container
Return type:: mask_cont
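
The bad_freq_ind format can be illustrated with a short helper that expands the list into a boolean frequency mask. The helper name and the example indices are hypothetical; only the integer-or-range convention comes from the description above.

```python
import numpy as np


def freq_mask_from_indices(nfreq, bad_freq_ind):
    """Build a boolean frequency mask from integers and (start, end) pairs."""
    mask = np.zeros(nfreq, dtype=bool)
    for entry in bad_freq_ind:
        if isinstance(entry, (tuple, list)):
            # Range entry: end is excluded, as with a standard slice.
            start, end = entry
            mask[start:end] = True
        else:
            # Single frequency index.
            mask[entry] = True
    return mask


mask = freq_mask_from_indices(8, [1, (4, 6)])
```

The entry `(4, 6)` flags indices 4 and 5 only, matching slice semantics.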

class draco.analysis.flagging.MaskFromTaper[source]

Bases: SingleTask

Generate a binary mask from a taper.

This task constructs a RingMapMask by thresholding a RingMapTaper. The resulting mask is True where the taper is either less than 1.0 or equal to 0.0, depending on the outer parameter.

outer

If True, mask all samples within the outer boundary of the taper (i.e., where the taper is < 1). If False, mask all samples within the inner boundary of the taper (i.e., where the taper is 0).

Type:: bool

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(taper)[source]

Generate the mask from the taper.

Parameters:: taper (containers.RingMapTaper) – The taper used to generate the mask.
Returns:: out – The boolean mask that indicates where the taper is less than 1 (outer = True) or zero (outer = False).
Return type:: containers.RingMapMask
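
The two thresholding modes reduce to simple comparisons, shown here on a made-up one-dimensional taper rather than a RingMapTaper container.

```python
import numpy as np

# Hypothetical taper: fully suppressed, then a transition, then untouched.
taper = np.array([0.0, 0.0, 0.25, 0.75, 1.0, 1.0])

# outer = True: flag everything the taper modifies at all.
mask_outer = taper < 1.0

# outer = False: flag only samples the taper removes completely.
mask_inner = taper == 0.0
```

The outer mask includes the transition region, while the inner mask covers only the fully suppressed samples.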

class draco.analysis.flagging.MaskMModeData[source]

Bases: SingleTask

Mask out mmode data ahead of map making.

auto_correlations

Exclude auto correlations if set (default=False).

Type:: bool

m_zero

Ignore the m=0 mode (default=False).

Type:: bool

positive_m

Include positive m-modes (default=True).

Type:: bool

negative_m

Include negative m-modes (default=True).

Type:: bool

mask_low_m

If set, mask out m’s lower than this threshold.

Type:: int, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(mmodes)[source]

Mask out unwanted data in the m-modes.

Parameters:: mmodes (containers.MModes) – Mmode container to mask
Returns:: mmodes – Same object as input with masking applied
Return type:: containers.MModes

class draco.analysis.flagging.RFIMask[source]

Bases: SingleTask

Crappy RFI masking.

sigma

The false positive rate of the flagger, given as a sigma value assuming the non-RFI samples are Gaussian.

Type:: float, optional

tv_fraction

Fraction of bad samples in a digital TV channel that causes the whole channel to be flagged.

Type:: float, optional

stack_ind

Which stack to process to derive flags for the whole dataset.

Type:: int

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(sstream: SiderealStream) → SiderealRFIMask[source]

process(sstream: TimeStream) → RFIMask

Derive an RFI mask from the data.

Parameters:: sstream – Unmasked sidereal or time stream visibility data.
Returns:: The derived RFI mask.
Return type:: mask

class draco.analysis.flagging.RFIMaskChisqHighDelay[source]

Bases: SingleTask

Mask frequencies and times with anomalous chi-squared test statistic.

flag_ew

If the input container has an east-west baseline axis, then this flag will be applied to the weights before collapsing over that axis.

Type:: array

reg_arpls

Smoothness regularisation used when estimating the baseline for flagging bad frequencies. Default is 1e5.

Type:: float

nsigma_1d

Mask any frequency where the median over unmasked time samples deviates from the baseline by more than this number of median absolute deviations. Default is 5.0.

Type:: float

win_t

Size of the window (in number of time samples) used to compute a median filtered version of the test statistic.

Type:: float

win_f

Size of the window (in number of frequency channels) used to compute a median filtered version of the test statistic.

Type:: float

nsigma_2d

Mask any frequency and time where the absolute deviation from the median filtered version is greater than this number of expected standard deviations given the number of degrees of freedom (i.e., number of baselines).

Type:: float

estimate_var

Estimate the variance in the test statistic using the median absolute deviation over a region defined by the win_t and win_f parameters.

Type:: bool

only_positive

Only mask large positive excursions in the test statistic, leaving large negative excursions unmasked.

Type:: bool

separate_pol

If true, construct a mask for each pol separately. If false, sum the chi-squared values over all polarisations and construct a single mask.

Type:: bool

mask_type

Algorithm to use to generate the mask.

Type:: {“mad”, “sumthreshold”}

niter

Number of iterations. At each iteration the baseline and standard deviation are re-estimated using the mask from the previous iteration.

Type:: int, optional

rho

Reduce the threshold by this factor at each iteration. A value of 1 will keep the threshold constant for all iterations.

Type:: float, optional

max_m

Maximum size of the SumThreshold window to use.

Type:: int, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

mask_1d(y, m)[source]

Mask frequency channels where median chi-squared deviates from neighbors.

Parameters:

y (np.ndarray[nfreq, ntime]) – Chi-squared per degree of freedom.
m (np.ndarray[nfreq, ntime]) – Boolean mask that indicates which samples to ignore when calculating the median over time.

Returns:

mask – Boolean mask that indicates frequency channels where the median chi-squared over time deviates significantly from that of the neighboring channels.

Return type:

np.ndarray[nfreq]
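
A simplified version of the per-frequency test can be sketched as below. It substitutes a single global median for the smoothed baseline controlled by reg_arpls, and it counts deviations in raw (unscaled) median absolute deviations; both choices, along with the function name, are assumptions of this sketch.

```python
import numpy as np


def mask_1d_sketch(y, m, nsigma=5.0):
    """Flag frequency channels whose time-median deviates from the baseline."""
    # Ignore previously masked samples when taking the median over time.
    y_valid = np.where(m, np.nan, y)
    med_t = np.nanmedian(y_valid, axis=1)

    # Stand-in baseline: the median across frequency of the time-medians.
    baseline = np.nanmedian(med_t)
    resid = med_t - baseline

    # Median absolute deviation of the residuals across frequency.
    mad = np.nanmedian(np.abs(resid))
    return np.abs(resid) > nsigma * mad


offsets = np.array([0.00, 0.01, -0.01, 0.02, -0.02, 0.01, 4.0])
y = 1.0 + offsets[:, None] * np.ones((1, 4))  # shape (nfreq, ntime)
m = np.zeros(y.shape, dtype=bool)
m[0, 0] = True  # one previously masked sample

flagged = mask_1d_sketch(y, m, nsigma=5.0)
```

Only the last channel, whose chi-squared sits far above its neighbours, is flagged; the small channel-to-channel scatter stays below the threshold.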

mask_2d(y, w)[source]

Mask frequencies and times where the chi-squared deviates from local median.

Parameters:

y (np.ndarray[nfreq, ntime]) – Chi-squared per degree of freedom.
w (np.ndarray[nfreq, ntime]) – Inverse variance of the chi-squared per degree of freedom, with zero indicating previously masked samples.

Returns:

mask – Boolean mask that indicates frequencies and times where chi-squared deviates significantly from the local median.

Return type:

np.ndarray[nfreq, ntime]

mask_2d_sumthreshold(y, w)[source]

Iterative application of sumthreshold algorithm to mask large chi-squared.

Parameters:

y (np.ndarray[nfreq, ntime]) – Chi-squared per degree of freedom.
w (np.ndarray[nfreq, ntime]) – Inverse variance of the chi-squared per degree of freedom, with zero indicating previously masked samples.

Returns:

mask – Boolean mask that indicates frequencies and times where chi-squared deviates significantly from the local median.

Return type:

np.ndarray[nfreq, ntime]

process(stream)[source]

Generate a mask from the data.

Parameters:: stream (dcontainers.TimeStream | dcontainers.SiderealStream | dcontainers.HybridVisStream | dcontainers.RingMap) – Container holding a chi-squared test statistic in the visibility dataset. A weighted average will be taken over any axis that is not time/ra or frequency.
Returns:: mask – Time-frequency mask, where values marked True are flagged.
Return type:: dcontainers.RFIMask | dcontainers.SiderealRFIMask | dcontainers.RFIMaskByPol | dcontainers.SiderealRFIMaskByPol

setup(telescope=None)[source]

Save telescope object for time calculations.

Only used to convert (LSD, RA) to unix time when masking sidereal streams. Not required when masking time streams.

Parameters:: telescope (TransitTelescope) – Telescope object used for time calculations.

class draco.analysis.flagging.RFISensitivityMask[source]

Bases: SingleTask

Identify RFI as deviations in system sensitivity from expected radiometer noise.

mask_type

One of ‘mad’, ‘sumthreshold’ or ‘combine’. Default is combine, which uses the sumthreshold everywhere except around the transits of the sun and bright point sources, where it applies the MAD mask to avoid masking out the transits.

Type:: string, optional

include_pol

The list of polarisations to include. Default is to use all polarisations.

Type:: list of strings, optional

nsigma_1d

Construct a static mask by identifying any frequency channel whose quantile over time deviates from the median over frequency by more than this number of median absolute deviations. Default: 5.0

Type:: float, optional

quantile_1d

The quantile to use along time to construct the static mask. Default: 0.15

Type:: float, optional

win_f_1d

Number of frequency channels used to calculate a rolling median and median absolute deviation for the static mask. Default: 191

Type:: int, optional

nsigma

The final threshold for the MAD, TV, and SumThreshold algorithms given as number of standard deviations. Default: 5.0

Type:: float, optional

niter

Number of iterations. At each iteration the baseline and standard deviation are re-estimated using the mask from the previous iteration. Default: 5

Type:: int, optional

rho

Reduce the threshold by this factor at each iteration. A value of 1 will keep the threshold constant for all iterations. Default: 1.5

Type:: float, optional

base_size

The size of the region used to estimate the baseline, provided as (number of frequency channels, number of time samples). Default: (37, 181)

Type:: [int, int]

mad_size

The size of the region used to estimate the standard deviation, provided as (number of frequency channels, number of time samples). Default: (101, 31)

Type:: [int, int]

tv_fraction

Fraction of bad samples in a digital TV channel that cause the whole channel to be flagged. Default: 0.5

Type:: float, optional

max_m

Maximum size of the SumThreshold window to use. Default: 64

Type:: int, optional

sir

Apply scale invariant rank (SIR) operator on top of final mask. Default: False

Type:: bool, optional

eta

Aggressiveness of the SIR operator. With eta=0, no additional samples are flagged and with eta=1, all samples will be flagged. Default: 0.2

Type:: float, optional

only_time

Only apply the SIR operator along the time axis. Default: False

Type:: bool, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(sensitivity)[source]

Derive an RFI mask from sensitivity data.

Parameters:: sensitivity (containers.SystemSensitivity) – Sensitivity data.
Returns:: rfimask – RFI mask derived from sensitivity.
Return type:: containers.RFIMask

setup()[source]: Define the threshold as a function of iteration.

class draco.analysis.flagging.RFIStokesIMask[source]

Bases: ReduceVar

Two-stage RFI filter based on Stokes I visibilities.

Tries to independently target transient and persistent RFI.

Stage 1 is applied to each frequency independently. A high-pass filter is applied in RA to isolate transient RFI. The high-pass filtered visibilities are beamformed, and a MAD filter is applied to the resulting map. A time/RA sample is then flagged if some fraction of beams exceed the MAD threshold for that sample.

Stage 2 is applied across frequencies. A low-pass filter is applied in RA to reduce transient sky sources. The average visibility power is taken over 2+ cylinder separation baselines to obtain a single 1D array per frequency. These powers are gathered across all frequencies and a basic background subtraction is applied. The SumThreshold algorithm is then used for flagging, with a variance estimate used to boost the expected noise during the daytime and bright point source transits.

mad_base_size

Median absolute deviations base window. Default is [1, 101].

Type:: list of int, optional

mad_dev_size

Window used to estimate the median absolute deviation. Default is [1, 51].

Type:: list of int, optional

sigma_high

Median absolute deviations sigma threshold. Default is 8.0.

Type:: float, optional

sigma_low

Median absolute deviations low sigma threshold. A value above this threshold is masked only if it is either larger than sigma_high or it is larger than sigma_low AND connected to a region larger than sigma_high. Default is 2.0.

Type:: float, optional

frac_samples

Fraction of flagged samples in map space above which the entire time sample will be flagged. Default is 0.01.

Type:: float, optional

max_m

Maximum size of the SumThreshold window. Default is 64.

Type:: int, optional

nsigma

Initial threshold for SumThreshold. Default is 5.0.

Type:: float, optional

solar_var_boost

Variance boost during solar transit. Default is 1e4.

Type:: float, optional

bg_win_size

The size of the window used to estimate the background sky, provided as (number of frequency channels, number of time samples). Default is [11, 3].

Type:: list, optional

var_win_size

The size of the window used when estimating the variance, provided as (number of frequency channels, number of time samples). Default is [3, 31].

Type:: list, optional

lowpass_cutoff

Angular cutoff of the RA low-pass filter. Default is 7.5, which corresponds to about 30 minutes of observation time.

Type:: float, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

static apply_filter(vis, weight, samples, fcut, type_='high')[source]: Apply a high-pass or low-pass mmode filter.

mask_multi_channel(power, mask, times)[source]: Mask slow-moving narrow-band RFI.

mask_single_channel(vis, weight, mask, freq, baselines, ra)[source]: Mask scattered RFI.
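
The two-threshold rule described for sigma_high and sigma_low (flag anything above sigma_high, plus anything above sigma_low that is connected to such a sample) can be sketched in one dimension as follows. This is an illustrative sketch only: the helper name is hypothetical, and the task itself operates on maps rather than a 1D array.

```python
import numpy as np


def hysteresis_mask_1d(dev, sigma_low=2.0, sigma_high=8.0):
    """Flag samples above sigma_high plus connected samples above sigma_low."""
    low = dev > sigma_low
    high = dev > sigma_high

    # Label contiguous runs of `low`: a new run starts wherever low turns on
    starts = low & ~np.concatenate(([False], low[:-1]))
    labels = np.cumsum(starts) * low  # 0 outside runs, 1..n inside

    # Keep only the runs that contain at least one sample above sigma_high
    keep = np.unique(labels[high])
    return low & np.isin(labels, keep)


dev = np.array([0.5, 3.0, 9.0, 2.5, 1.0, 3.0, 4.0, 0.2])
mask = hysteresis_mask_1d(dev)
```

In this input the first run (3.0, 9.0, 2.5) is flagged as a whole because it contains a sample above sigma_high, while the second run (3.0, 4.0) exceeds only sigma_low and is left unflagged.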

process(stream)[source]

Make a mask from the data.

Parameters:

stream (dcontainers.TimeStream | dcontainers.SiderealStream) – Data to use when masking. Axes should be frequency, stack, and time-like.

Returns:

mask (dcontainers.RFIMask | dcontainers.SiderealRFIMask) – Time-frequency mask, where values marked True are flagged.
power (dcontainers.TimeStream | dcontainers.SiderealStream) – Time-frequency power metric used in second-stage flagging.

setup(telescope)[source]

Set up the baseline selections and ordering.

Parameters:: telescope (TransitTelescope) – The telescope object to use

class draco.analysis.flagging.RadiometerWeight[source]

Bases: SingleTask

Update vis_weight according to the radiometer equation.

\[\text{weight}_{ij} = \frac{N_\text{samp}}{V_{ii} V_{jj}}\]

replace

Replace any existing weights (default). If False then we multiply the existing weights by the radiometer values.

Type:: bool, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(stream)[source]

Change the vis weight.

Parameters:: stream (SiderealStream or TimeStream) – Data to be weighted. This is done in place.
Returns:: stream
Return type:: SiderealStream or TimeStream
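
The radiometer expression above can be evaluated with a few lines of NumPy. The sketch below assumes the autocorrelations for the two inputs are already available as arrays and that samples with a non-positive denominator should receive zero weight; both the helper name and that convention are assumptions.

```python
import numpy as np


def radiometer_weight(auto_i, auto_j, nsamp):
    """weight_ij = N_samp / (V_ii * V_jj), with zero weight where undefined."""
    denom = np.real(auto_i) * np.real(auto_j)
    weight = np.zeros_like(denom, dtype=float)
    # Only divide where the product of autocorrelations is positive
    np.divide(nsamp, denom, out=weight, where=denom > 0)
    return weight


v_ii = np.array([2.0, 4.0, 0.0])
v_jj = np.array([5.0, 0.5, 3.0])
w = radiometer_weight(v_ii, v_jj, nsamp=1000.0)
```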

class draco.analysis.flagging.ReduceMaskEl[source]

Bases: SingleTask

Reduce the ‘el’ axis from input classes and produce corresponding reduced output classes.

Reduction algorithm: If the number of True values in the mask along the el axis is higher than a given threshold, set the mask to True.

threshold

The minimum number of detected RFI events along the el axis required for a data point to be included in the reduced mask. Default is 1.

Type:: int

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(rfimask)[source]

Produce an RFI mask.

Parameters:: rfimask (containers.LocalizedRFIMask(freq, el, time) or containers.SiderealLocalizedRFIMask(freq, ra, el)) – El-specific RFI mask indicating channels that are free from RFI events.
Returns:: out – Non el-specific RFI mask indicating channels that are free from RFI events.
Return type:: containers.RFIMask(freq, time) or containers.SiderealRFIMask(freq, ra)
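
A minimal sketch of the reduction is shown below. The documentation describes the threshold both as a count to exceed and as a minimum count; the sketch assumes the latter (count >= threshold), and the helper name and `el_axis` argument are illustrative.

```python
import numpy as np


def reduce_mask_el(mask, threshold=1, el_axis=1):
    """Collapse the el axis: flag a sample if enough elevations are flagged."""
    # Count flagged elevations for each remaining (freq, time) sample
    count = np.sum(mask, axis=el_axis)
    return count >= threshold


mask = np.zeros((2, 4, 3), dtype=bool)  # axes: (freq, el, time)
mask[0, 1, 2] = True       # one flagged elevation
mask[1, :2, 0] = True      # two flagged elevations
reduced = reduce_mask_el(mask, threshold=2)
```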

class draco.analysis.flagging.SanitizeWeights[source]

Bases: SingleTask

Flags weights outside of a valid range.

Flags any weights above a max threshold and below a minimum threshold. Baseline dependent, so only some baselines may be flagged.

max_thresh

Largest weight value to keep.

Type:: float

min_thresh

Smallest weight value to keep.

Type:: float

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data)[source]

Mask any weights outside of the threshold range.

Parameters:: data (andata.CorrData or containers.VisContainer object) – Data containing the weights to be flagged
Returns:: data – Data object with high/low weights masked in-place
Return type:: same object as data

setup()[source]

Validate the max and min values.

Raises:: ValueError – if min_thresh is larger than max_thresh
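
The range check and in-place masking can be sketched as follows. Zeroing the out-of-range weights is an assumption about how the masking is realised, and the helper name is hypothetical.

```python
import numpy as np


def sanitize_weights(weight, min_thresh, max_thresh):
    """Zero any weight outside [min_thresh, max_thresh], in place."""
    if min_thresh > max_thresh:
        raise ValueError("min_thresh must not be larger than max_thresh")
    bad = (weight < min_thresh) | (weight > max_thresh)
    weight[bad] = 0.0  # assumption: masked samples get zero weight
    return weight


w = np.array([1e-9, 1.0, 1e12])
sanitize_weights(w, min_thresh=1e-6, max_thresh=1e6)
```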

class draco.analysis.flagging.SiderealMaskConversion[source]

Bases: SingleTask

Convert the axis of an RFI mask from time to ra.

The conversion is performed by mapping values between Unix time and LSA using the geographic location of the telescope, as provided by the Observer object.

spread_size

The number of cells to flag before and after a detected true value. This ensures conservative flagging, preventing missed detections due to axis alignment issues. Default is 1.

Type:: int

npix

The number of pixels used to cover the full RA range from 0 to 360. Default is 4096.

Type:: int

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(rfimask)[source]

Produce an RFI mask.

Parameters:: rfimask (containers.LocalizedRFIMask) – Container for holding a mask indicating channels that are free from RFI events. Its axes are freq, el, and time.
Returns:: out – Boolean mask that can be applied to a ringmap with the task ApplyLocalizedRFIMask to mask contaminated samples. Its axes are freq, ra, and el.
Return type:: containers.LocalizedSiderealRFIMask

setup(manager)[source]

Set the local observer's position.

Parameters:: manager (Observer) – An Observer object holding the geographic location of the telescope. Note that TransitTelescope instances are also Observers.
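
The effect of spread_size can be illustrated with a small one-dimensional sketch that flags the neighbouring cells on either side of each flagged cell. The helper name is hypothetical, and the sketch does not wrap around the ends of the axis, which may differ from how the task treats the periodic RA axis.

```python
import numpy as np


def spread_mask(mask, spread_size=1):
    """Flag `spread_size` cells before and after every flagged cell."""
    out = mask.copy()
    for s in range(1, spread_size + 1):
        out[s:] |= mask[:-s]   # cells after a flagged cell
        out[:-s] |= mask[s:]   # cells before a flagged cell
    return out


mask = np.array([False, False, True, False, False, False])
spread = spread_mask(mask, spread_size=1)
```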

class draco.analysis.flagging.SmoothVisWeight[source]

Bases: SingleTask

Smooth the visibility weights with a median filter.

This is done in-place.

kernel_size

Size of the kernel for the median filter in time points. Default is 31, corresponding to a ~5 minute window for 10s cadence data.

Type:: int, optional

mask_zeros

Mask out zero-weight entries when taking the moving weighted median.

Type:: bool, optional

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data: TimeStream) → TimeStream[source]

Smooth the weights with a median filter.

Parameters:: data (TimeStream) – Data containing the weights to be smoothed.
Returns:: data – Data object containing the same data as the input, but with the weights substituted by the smoothed ones.
Return type:: TimeStream

class draco.analysis.flagging.TaperDelayTransform[source]

Bases: SingleTask

Apply a taper or mask to a DelayTransform container.

This task applies a frequency-collapsed taper or mask to the delay-domain representation of ringmaps. Because DelayTransform containers are indexed over (baseline_axes, sample, delay), the taper or mask must first be averaged or collapsed over frequency and then reshaped to align with the baseline axes. This operation is necessary due to the mismatch between the frequency-dependent structure of the taper/mask and the frequency-transformed delay axis.

update_weight

If True, update the weights to account for the applied taper. This multiplies the weights by 1 / taper^2 in all unmasked regions.

Type:: bool

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(data: DelayTransform, apply: RingMapTaper | RingMapMask)[source]

Apply the taper or mask to the DelayTransform container.

Parameters:

data (containers.DelayTransform) – The dataset to be modified in-place. Must contain a ‘spectrum’ dataset with shape (…, sample, delay), where ‘sample’ corresponds to RA, and a ‘weight’ dataset of the same shape.
apply (RingMapTaper or RingMapMask) – A container providing the taper or mask to apply. For a RingMapTaper, the taper will be averaged over frequency. For a RingMapMask, pixels that are good in all frequency channels will be treated as 1.0 and others as 0.0.

Returns:

data – The input DelayTransform container with ‘spectrum’ and optionally ‘weight’ modified in-place.

Return type:

containers.DelayTransform
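
A rough sketch of the frequency collapse and weight update is given below, using plain arrays instead of containers. The helper name and array shapes are assumptions, and setting the weights to zero where the collapsed taper is zero is also an assumption, since the description only covers unmasked regions.

```python
import numpy as np


def apply_collapsed_taper(spectrum, weight, taper_freq, update_weight=True):
    """spectrum, weight: (nsample, ndelay); taper_freq: (nfreq, nsample)."""
    # Collapse the taper over frequency and align it with the sample axis
    taper = taper_freq.mean(axis=0)[:, np.newaxis]
    spectrum *= taper
    if update_weight:
        good = np.broadcast_to(taper > 0, weight.shape)
        # Multiply the weights by 1 / taper^2 where the taper is non-zero
        np.divide(weight, taper**2, out=weight, where=good)
        weight[~good] = 0.0  # assumption: fully masked samples get zero weight
    return spectrum, weight


taper_freq = np.array([[1.0, 0.5, 0.0], [1.0, 0.5, 0.0]])
spectrum = np.full((3, 2), 2.0)
weight = np.full((3, 2), 4.0)
apply_collapsed_taper(spectrum, weight, taper_freq)
```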

class draco.analysis.flagging.ThresholdVisWeightBaseline[source]

Bases: SingleTask

Form a mask corresponding to weights that are below some threshold.

The threshold is determined as maximum(absolute_threshold, relative_threshold * average(weight)) and is evaluated per product/stack entry. The user can specify whether to use a mean or median as the average, but note that the mean is much more likely to be biased by anomalously high- or low-weight samples (both of which are present in raw CHIME data). The user can also specify that weights below some threshold should not be considered when taking the average and constructing the mask (the default is to only ignore zero-weight samples).

The task outputs a BaselineMask or SiderealBaselineMask depending on the input container.

Parameters:

average_type (string, optional) – Type of average to use (“median” or “mean”). Default: “median”.
absolute_threshold (float, optional) – Any weights with values less than this number will be set to zero. Default: 1e-7.
relative_threshold (float, optional) – Any weights with values less than this number times the average weight will be set to zero. Default: 1e-6.
ignore_absolute_threshold (float, optional) – Any weights with values less than this number will be ignored when taking averages and constructing the mask. Default: 0.0.
pols_to_flag (string, optional) – Which polarizations to flag. “copol” only flags XX and YY baselines, while “all” flags everything. Default: “all”.

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(stream) → BaselineMask | SiderealBaselineMask[source]

Construct baseline-dependent mask.

Parameters:: stream (.core.container with weight attribute) – Input container whose weights are used to construct the mask.
Returns:: out – The output baseline-dependent mask.
Return type:: BaselineMask or SiderealBaselineMask
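
The threshold rule maximum(absolute_threshold, relative_threshold * average(weight)) can be sketched per baseline as below, using the median as the average. The helper name and the (nbaseline, ntime) layout are assumptions, as is leaving the ignored samples unflagged.

```python
import numpy as np


def baseline_threshold_mask(weight, absolute_threshold=1e-7,
                            relative_threshold=1e-6,
                            ignore_absolute_threshold=0.0):
    """weight has shape (nbaseline, ntime); returns a boolean mask."""
    # Samples at or below the ignore threshold do not enter the average
    valid = weight > ignore_absolute_threshold
    avg = np.nanmedian(np.where(valid, weight, np.nan), axis=1, keepdims=True)
    threshold = np.maximum(absolute_threshold, relative_threshold * avg)
    # Assumption: ignored samples are left unflagged by this mask
    return valid & (weight < threshold)


weight = np.array([[10.0, 10.0, 0.5, 0.0],
                   [4.0, 4.0, 4.0, 1.0]])
# A large relative threshold is used here only to make the toy result visible
mask = baseline_threshold_mask(weight, relative_threshold=0.1)
```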

setup(telescope)[source]

Set the telescope model.

Parameters:: telescope (TransitTelescope) – The telescope object to use

class draco.analysis.flagging.ThresholdVisWeightFrequency[source]

Bases: SingleTask

Create a mask to remove all weights below a per-frequency threshold.

A single relative threshold is set for each frequency along with an absolute minimum weight threshold. Masking is done relative to the mean baseline.

Parameters:

absolute_threshold (float) – Any weights with values less than this number will be set to zero.
relative_threshold (float) – Any weights with values less than this number times the average weight will be set to zero.

Initialize pipeline task.

May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.

process(stream)[source]

Make a baseline-independent mask.

Parameters:: stream (.core.container with weight attribute) – Container to mask
Returns:: out – RFIMask container with mask set
Return type:: RFIMask container

draco.analysis.flagging.complex_med(x, *args, **kwargs)[source]

Complex median, done by applying to the real/imag parts individually.

Parameters:

x (np.ndarray) – Array to apply to.
*args (list, dict) – Passed straight through to np.nanmedian
**kwargs (list, dict) – Passed straight through to np.nanmedian

Returns:

m – Median.

Return type:

np.ndarray
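
The description translates almost directly into NumPy. The sketch below uses a different function name to make clear that it is an illustration of the approach rather than the library code.

```python
import numpy as np


def complex_median(x, *args, **kwargs):
    """Median of the real and imaginary parts taken separately."""
    re = np.nanmedian(x.real, *args, **kwargs)
    im = np.nanmedian(x.imag, *args, **kwargs)
    return re + 1j * im


x = np.array([1 + 5j, 2 + 1j, 10 + 3j, complex(np.nan, np.nan)])
m = complex_median(x)
```

Note that the result need not be one of the input values: here the real part comes from the second element and the imaginary part from the third.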

draco.analysis.flagging.destripe(x, w, axis=1)[source]

Subtract the median along a specified axis.

Parameters:

x (np.ndarray) – Array to destripe.
w (np.ndarray) – Mask array for points to include (True) or ignore (False).
axis (int, optional) – Axis to apply destriping along.

Returns:

y – Destriped array.

Return type:

np.ndarray
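
A minimal sketch of destriping with a masked median follows. The helper name is hypothetical, and replacing excluded samples with NaN before taking the median is one way to honour the mask, not necessarily the way the library does it.

```python
import numpy as np


def destripe_sketch(x, w, axis=1):
    """Subtract the median along `axis`, using only samples where w is True."""
    med = np.nanmedian(np.where(w, x, np.nan), axis=axis, keepdims=True)
    return x - med


x = np.array([[1.0, 2.0, 3.0],
              [10.0, 20.0, 90.0]])
w = np.array([[True, True, True],
              [True, True, False]])  # ignore the outlier in the second row
y = destripe_sketch(x, w)
```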

draco.analysis.flagging.inverse_binom_cdf_prob(k, N, F)[source]

Calculate the trial probability that gives the CDF.

This gets the trial probability that gives an overall cumulative probability for Pr(X <= k; N, p) = F

Parameters:

k (int) – Maximum number of successes.
N (int) – Total number of trials.
F (float) – The cumulative probability for (k, N).

Returns:

p – The trial probability.

Return type:

float
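
The inversion can be illustrated with a standard-library sketch that bisects on p. This is a swapped-in technique chosen for transparency, not necessarily how the library computes the result, and it assumes k < N so that the CDF decreases strictly with p.

```python
from math import comb


def binom_cdf(k, N, p):
    """Pr(X <= k) for X ~ Binomial(N, p)."""
    return sum(comb(N, i) * p**i * (1 - p) ** (N - i) for i in range(k + 1))


def inverse_binom_cdf_prob_sketch(k, N, F, tol=1e-12):
    """Find p such that Pr(X <= k; N, p) = F by bisection (assumes k < N)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The CDF decreases as p grows, so a CDF above F means p is too small
        if binom_cdf(k, N, mid) > F:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# For k=0, N=2 the CDF is (1 - p)**2, so F=0.25 corresponds to p=0.5
p = inverse_binom_cdf_prob_sketch(k=0, N=2, F=0.25)
```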

draco.analysis.flagging.mad(x, mask, base_size=(11, 3), mad_size=(21, 21), debug=False, sigma=True)[source]

Calculate the MAD of freq-time data.

Parameters:

x (np.ndarray) – Data to filter.
mask (np.ndarray) – Initial mask.
base_size (tuple) – Size of the window to use in (freq, time) when estimating the baseline.
mad_size (tuple) – Size of the window to use in (freq, time) when estimating the MAD.
debug (bool, optional) – If True, return deviation and mad arrays as well
sigma (bool, optional) – Rescale the output into units of Gaussian sigmas.

Returns:

mad – Size of deviation at each point in MAD units. This output may contain NaN’s for regions of missing data.

Return type:

np.ndarray
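
The rescaling controlled by the sigma argument can be illustrated with a simplified sketch that uses a single global median and MAD instead of the moving windows set by base_size and mad_size. The helper name is hypothetical, and the factor 1.4826 is the standard conversion from a MAD to a Gaussian standard deviation.

```python
import numpy as np


def mad_deviation_global(x, mask, sigma=True):
    """Deviation from the median in MAD (or Gaussian sigma) units."""
    xm = np.where(mask, np.nan, x)          # exclude masked samples
    base = np.nanmedian(xm)                 # global baseline estimate
    mad = np.nanmedian(np.abs(xm - base))   # global median absolute deviation
    scale = 1.4826 * mad if sigma else mad
    return (x - base) / scale


x = np.array([[1.0, 2.0, 3.0, 4.0, 100.0]])
mask = np.zeros_like(x, dtype=bool)
dev = mad_deviation_global(x, mask)
```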

draco.analysis.flagging.medfilt(x, mask, size, *args)[source]

Apply a moving median filter to masked data.

The application is done by iterative filling to overcome the fact we don’t have an actual implementation of a nanmedian.

Parameters:

x (np.ndarray) – Data to filter.
mask (np.ndarray) – Mask of data to filter out.
size (tuple) – Size of the window in each dimension.
args (optional) – Additional arguments to pass to the moving weighted median

Returns:

y – The masked data. Data within the mask is undefined.

Return type:

np.ndarray

draco.analysis.flagging.p_to_sigma(p)[source]: Get the sigma exceeded by the tails of a Gaussian with probability p.

draco.analysis.flagging.sigma_to_p(sigma)[source]: Get the probability of an excursion larger than sigma for a Gaussian.
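
The two conversions are inverses of each other. The standard-library sketch below assumes a two-sided tail probability (excursions in either direction); whether the library uses one or two tails is not stated here, so treat that choice and the function names as assumptions.

```python
from math import erfc, sqrt
from statistics import NormalDist


def sigma_to_p_sketch(sigma):
    """Two-sided probability of an excursion larger than sigma."""
    return erfc(sigma / sqrt(2.0))


def p_to_sigma_sketch(p):
    """Sigma whose two-sided tail probability equals p."""
    return NormalDist().inv_cdf(1.0 - 0.5 * p)


p_one_sigma = sigma_to_p_sketch(1.0)
round_trip = p_to_sigma_sketch(sigma_to_p_sketch(3.0))
```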

draco.analysis.flagging.tv_channels_flag(x, freq, sigma=5, f=0.5, debug=False)[source]

Perform a higher sensitivity flagging for the TV stations.

This flags a whole TV station band if more than a fraction f of the samples within that band exceed a given threshold. The threshold is chosen to give a fixed false positive rate (as described by sigma) for a fraction f of samples exceeding it.

Parameters:

x (np.ndarray[freq, time]) – Deviations of data in sigma units.
freq (np.ndarray[freq]) – Frequency of samples in MHz.
sigma (float, optional) – The probability of a false positive given as a sigma of a Gaussian.
f (float, optional) – Fraction of bad samples within each channel before flagging the whole thing.
debug (bool, optional) – Returns (mask, fraction) instead to give extra debugging info.

Returns:

mask – Mask of the input data.

Return type:

np.ndarray[bool]