draco.analysis.flagging
Tasks for flagging out bad or unwanted data.
This includes data quality flagging on timestream data; sun excision on sidereal data; and pre-map making flagging on m-modes.
The convention for flagging/masking is True for contaminated samples that should be excluded and False for clean samples.
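As a minimal illustration of this convention (a NumPy sketch with made-up arrays, not draco's own code), flagged samples can be excluded by setting their weights to zero:

```python
import numpy as np

# Hypothetical weights with axes (freq, time); the values are made up.
weight = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

# True marks contaminated samples that should be excluded.
mask = np.array([[False, True, False],
                 [False, False, True]])

# Zero the weight of every flagged sample; clean samples are untouched.
masked_weight = np.where(mask, 0.0, weight)
```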
Functions
Complex median, done by applying to the real/imag parts individually.
Subtract the median along a specified axis.
Calculate the trial probability that gives the CDF.
Calculate the MAD of freq-time data.
Apply a moving median filter to masked data.
Get the sigma exceeded by the tails of a Gaussian with probability p.
Get the probability of an excursion larger than sigma for a Gaussian.
Perform a higher sensitivity flagging for the TV stations.
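The two Gaussian conversions listed above can be sketched with the standard library alone. The function names below are hypothetical, and the sketch assumes a two-sided tail probability; the module's own functions may use a different convention.

```python
import math
from statistics import NormalDist


def tail_prob_from_sigma(sigma: float) -> float:
    """Two-sided probability of a Gaussian excursion larger than sigma."""
    return math.erfc(sigma / math.sqrt(2.0))


def sigma_from_tail_prob(p: float) -> float:
    """Sigma level whose two-sided Gaussian tail probability equals p."""
    # Half of p sits in each tail, so invert the CDF at 1 - p / 2.
    return NormalDist().inv_cdf(1.0 - p / 2.0)
```

Under this assumption the two functions are inverses of each other, so converting a sigma level to a probability and back returns the original value.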
Classes
Apply a distributed mask that varies across baselines.
Apply a mask to a dataset with arbitrary axes.
Apply a localised (el-sensitive) RFI mask to the data by zeroing the weights.
alias of ApplyTimeFreqMask
Apply a time-frequency mask to the data.
Mix a small amount of a stack into data to regularise RFI gaps.
Collapse a baseline-dependent mask along the baseline axis.
Combine an arbitrary number of masks (conservatively).
Crudely simulate a masking out of the daytime data.
Identify beamformed visibilities that deviate from our expectation for noise.
Get a mask of regions with bad gain.
Mask out baselines from a dataset.
alias of ApplyGenericMask
Mask beamformed visibilities with anomalously large weights before stacking.
alias of MaskMModeData
Make a mask for certain frequencies.
Mask out mmode data ahead of map making.
Crappy RFI masking.
Mask frequencies and times with anomalous chi-squared test statistic.
Identify RFI as deviations in system sensitivity from expected radiometer noise.
Two-stage RFI filter based on Stokes I visibilities.
Update vis_weight according to the radiometer equation.
Reduce the 'el' axis from input classes and produce corresponding reduced output classes.
Flags weights outside of a valid range.
Convert the axis of an RFI mask from time to ra.
Smooth the visibility weights with a median filter.
Form a mask corresponding to weights that are below some threshold.
Create a mask to remove all weights below a per-frequency threshold.
- class draco.analysis.flagging.ApplyBaselineMask[source]
Bases:
SingleTask
Apply a distributed mask that varies across baselines.
No broadcasting is done, so the data and mask should have the same axes. This shouldn’t be used for non-distributed time-freq masks.
This task may produce output with shared datasets. Be warned that this can produce unexpected outputs if not properly taken into account.
Which datasets should we share with the input. If “none” we create a full copy of the data, if “vis” or “map” we create a copy only of the modified weight dataset and the unmodified vis dataset is shared, if “all” we modify in place and return the input container.
- Type:
{“all”, “none”, “vis”, “map”}
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(data: TimeStream, mask: BaselineMask) → TimeStream [source]
- process(data: SiderealStream, mask: SiderealBaselineMask) → SiderealStream
Flag data by zeroing the weights.
- Parameters:
data – Data to apply mask to. Must have a stack axis
mask – A baseline-dependent mask
- Returns:
The masked data. Masking is done in place.
- Return type:
data
- class draco.analysis.flagging.ApplyGenericMask[source]
Bases:
SingleTask
Apply a mask to a dataset with arbitrary axes.
All of the mask axes must be present in the dataset, but the dataset can have additional axes.
Assumes that a sample marked True in the mask dataset should be flagged.
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(data: ContainerBase, mask: ContainerBase)[source]
Apply the mask to the dataset weights.
Reorder the mask axes and add broadcasting axes if necessary.
- Parameters:
data – Any container with a frequency axis.
mask – Any container whose axes are a subset of the axes in data
- Returns:
The input container with the weight dataset set to zero for masked samples.
- Return type:
data
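The axis reordering and broadcasting described for this task can be sketched in NumPy as follows. The axis names, shapes and variable names are made up for illustration, and this is not the task's actual implementation.

```python
import numpy as np

# Hypothetical axis names; the data has one more axis than the mask.
data_axes = ("freq", "stack", "time")
mask_axes = ("time", "freq")

weight = np.ones((2, 3, 4))            # axes (freq, stack, time)
mask = np.zeros((4, 2), dtype=bool)    # axes (time, freq)
mask[1, 0] = True                      # flag time index 1 at freq index 0

# Reorder the mask axes to follow their order in the data.
order = sorted(range(len(mask_axes)), key=lambda i: data_axes.index(mask_axes[i]))
mask_sorted = mask.transpose(order)

# Insert a length-1 axis wherever the data has an axis the mask lacks.
index = tuple(slice(None) if ax in mask_axes else np.newaxis for ax in data_axes)
mask_bcast = mask_sorted[index]

# Zero the weights of flagged samples across the extra axes.
weight[np.broadcast_to(mask_bcast, weight.shape)] = 0.0
```

Here the single flagged (time, freq) sample zeroes the weight for every entry along the stack axis.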
- class draco.analysis.flagging.ApplyLocalizedRFIMask[source]
Bases:
SingleTask
Apply a localised (el-sensitive) RFI mask to the data by zeroing the weights.
This class extends the class ApplyTimeFreqMask to include el in addition to freq and ra, and can be further extended for a new RingMap class (freq,el,time). Note that while the ra and el axes of the tstream and mask datasets do not need to be identical, they must have overlapping regions. However, their freq axes must be identical.
Which datasets should we share with the input. If “none” we create a full copy of the data, if “map” we create a copy only of the modified weight dataset and the unmodified vis dataset is shared, if “all” we modify in place and return the input container.
- Type:
{“all”, “none”, “map”}
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(tstream, rfimask)[source]
Apply the mask by zeroing the weights.
- Parameters:
tstream (containers.RingMap) – A data container with axes (pol, freq, ra, el).
rfimask (containers.LocalizedSiderealRFIMask(freq, ra, el)) – An RFI mask whose freq, ra and el regions overlap with those of tstream.
- Returns:
tstream – The masked RingMap with weights modified in overlapping regions. Note that the masking is done in place.
- Return type:
- draco.analysis.flagging.ApplyRFIMask
alias of
ApplyTimeFreqMask
- class draco.analysis.flagging.ApplyTimeFreqMask[source]
Bases:
SingleTask
Apply a time-frequency mask to the data.
Typically this is used to mask out all inputs at times and frequencies contaminated by RFI.
This task may produce output with shared datasets. Be warned that this can produce unexpected outputs if not properly taken into account.
Which datasets should we share with the input. If “none” we create a full copy of the data, if “vis” or “map” we create a copy only of the modified weight dataset and the unmodified vis dataset is shared, if “all” we modify in place and return the input container.
- Type:
{“all”, “none”, “vis”, “map”}
- collapse_pol
Take the logical OR of the mask along the polarisation axis prior to applying it to the data. In other words, mask a frequency and time in all polarisations if it was identified as contaminated in any polarisation.
- Type:
bool
- match_axes
If True (default), the rfimask and tstream must have identical time-like axes. Otherwise, the mask is applied only to the overlapping region of the time-like axis. Non-overlapping regions remain unchanged. Samples must still have the same RA or timestamp values in overlapping regions.
- Type:
bool, optional
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(tstream, rfimask)[source]
Apply the mask by zeroing the weights.
- Parameters:
tstream (timestream or sidereal stream) – A timestream or sidereal stream like container. For example, containers.TimeStream, andata.CorrData or containers.SiderealStream.
rfimask (containers.RFIMask, containers.RFIMaskByPol, containers.SiderealRFIMask, or containers.SiderealRFIMaskByPol) – An RFI mask for the same period of time.
- Returns:
tstream – The masked timestream. Note that the masking is done in place.
- Return type:
timestream or sidereal stream
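The effect of the collapse_pol option can be pictured with a small NumPy sketch. The shapes and variable names are assumptions for illustration only.

```python
import numpy as np

# Hypothetical mask with axes (pol, freq, time).
mask_by_pol = np.zeros((2, 3, 4), dtype=bool)
mask_by_pol[0, 1, 2] = True   # flagged in the first polarisation only

# Logical OR along the polarisation axis gives a (freq, time) mask.
mask_collapsed = np.any(mask_by_pol, axis=0)

# Broadcasting the collapsed mask back flags the sample in every polarisation.
mask_all_pol = np.broadcast_to(mask_collapsed, mask_by_pol.shape)
```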
- class draco.analysis.flagging.BlendStack[source]
Bases:
SingleTask
Mix a small amount of a stack into data to regularise RFI gaps.
This is designed to mix in a small amount of a stack into a day of data (which will have RFI masked gaps) to attempt to regularise operations which struggle to deal with time variable masks, e.g. DelaySpectrumEstimator.
- frac
The relative weight to give the stack in the average. This multiplies the weights already in the stack, and so it should be remembered that these may already be significantly higher than the single day weights.
- Type:
float, optional
- match_median
Estimate the median in the time/RA direction from the common samples and use this to match any quasi time-independent bias of the data (e.g. cross talk).
- Type:
bool, optional
- subtract
Rather than taking an average, instead subtract out the blending stack from the input data in the common samples to calculate the difference between them. The interpretation of frac is a scaling of the inverse variance of the stack to an inverse variance of a prior on the difference, e.g. a frac = 1e-4 means that we expect the standard deviation of the difference between the data and the stacked data to be 100x larger than the noise of the stacked data.
- Type:
bool, optional
- mask_freq
Maintain masking if a frequency is entirely flagged - i.e., even if blending data exists in those bands, do not blend.
- Type:
bool, optional
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(data)[source]
Blend a small amount of the stack into the incoming data.
- Parameters:
data (SiderealStream, RingMap, or HybridVisStream) – The data to be blended into. This is modified in place.
- Returns:
data_blend – The modified data. This is the same object as the input, and it has been modified in place.
- Return type:
- setup(data_stack)[source]
Set the stacked data.
- Parameters:
data_stack (SiderealStream, RingMap, or HybridVisStream) – Data stack to blend
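One way to read the frac parameter is as a scaling of the stack weights inside an inverse-variance weighted average. The sketch below follows that reading; the function name, the example value of frac and the handling of zero-weight samples are assumptions, not the task's actual code.

```python
import numpy as np


def blend(data, w_data, stack, w_stack, frac=1e-4):
    """Weighted average of data and stack, with the stack weights scaled by frac."""
    w_blend = w_data + frac * w_stack
    # Avoid dividing by zero where neither input carries any weight.
    safe = np.where(w_blend > 0, w_blend, 1.0)
    blended = (w_data * data + frac * w_stack * stack) / safe
    return np.where(w_blend > 0, blended, 0.0), w_blend
```

In a masked gap (zero data weight) the blended value is taken entirely from the stack, while well-measured samples are only slightly perturbed when frac is small.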
- class draco.analysis.flagging.CollapseBaselineMask[source]
Bases:
SingleTask
Collapse a baseline-dependent mask along the baseline axis.
The output is a frequency/time mask that is True for any freq/time sample for which any baseline is masked in the input mask.
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(baseline_mask: BaselineMask | SiderealBaselineMask) → RFIMask | SiderealRFIMask [source]
Collapse input mask over baseline axis.
- Parameters:
baseline_mask (BaselineMask or SiderealBaselineMask) – Input baseline-dependent mask
- Returns:
mask_cont – Output baseline-independent mask.
- Return type:
RFIMask or SiderealRFIMask
- class draco.analysis.flagging.CombineMasks[source]
Bases:
SingleTask
Combine an arbitrary number of masks (conservatively).
All of the given masks must be of the same type and that type must have a mask dataset. Any flagged value in any of the provided masks will be flagged in the output mask.
Assumes that a sample marked True is flagged.
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(masks: list[ContainerBase])[source]
Combine the given list of masks into a single mask.
- Parameters:
masks – A list of containers that all have the same type. The type must have a mask dataset.
- Returns:
A combined mask such that any flagged value in any of the input masks is flagged in the output mask.
- Return type:
combined_mask
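The conservative combination amounts to a logical OR over all input masks. A short NumPy sketch with made-up boolean arrays standing in for the mask datasets:

```python
from functools import reduce

import numpy as np

# Hypothetical masks of identical shape; True means flagged.
masks = [
    np.array([True, False, False, False]),
    np.array([False, False, True, False]),
    np.array([False, False, True, False]),
]

# A sample is flagged in the output if it is flagged in any input.
combined = reduce(np.logical_or, masks)
```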
- class draco.analysis.flagging.DayMask[source]
Bases:
SingleTask
Crudely simulate a masking out of the daytime data.
- start, end
Start and end of masked out region.
- Type:
float
- width
Use a smooth transition of given width between the fully masked and unmasked data. This is interior to the region marked by start and end.
- Type:
float
- zero_data
Zero the data in addition to modifying the noise weights (default is True).
- Type:
bool, optional
- remove_average
Estimate and remove the mean level from each visibility. This estimate does not use data from the masked region.
- Type:
bool, optional
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(sstream)[source]
Apply a daytime mask.
- Parameters:
sstream (containers.SiderealStream) – Unmasked sidereal stack.
- Returns:
mstream – Masked sidereal stream.
- Return type:
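The role of start, end and width can be illustrated with a taper that is 1 outside the masked region, 0 deep inside it, and ramps smoothly over a distance width just inside each edge. The cosine ramp and the function name are assumptions; the task may use a different transition shape.

```python
import math


def day_weight(x, start, end, width):
    """Multiplicative weight: 1 outside [start, end], 0 in the fully masked core."""
    if x <= start or x >= end:
        return 1.0
    # Distance into the masked region from the nearest edge.
    depth = min(x - start, end - x)
    if depth >= width:
        return 0.0
    # Cosine ramp from 1 at the edge down to 0 at `width` inside it.
    return 0.5 * (1.0 + math.cos(math.pi * depth / width))
```

Multiplying the weights by this taper keeps the transition interior to the region marked by start and end, as described above. The sketch assumes width is greater than zero.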
- class draco.analysis.flagging.FindBeamformedOutliers[source]
Bases:
SingleTask
Identify beamformed visibilities that deviate from our expectation for noise.
- nsigma
Beamformed visibilities whose magnitude is greater than nsigma times the expected standard deviation of the noise, given by sqrt(1 / weight), will be masked.
- Type:
float
- window
If provided, the outlier mask will be extended to cover neighboring pixels. This list provides the number of pixels in each dimension that a single outlier will mask. Only supported for RingMap containers, where the list should be length 2 with [nra, nel], and FormedBeamHA containers, where the list should be length 1 with [nha,].
- Type:
list of int
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(data)[source]
Create a mask that indicates outlier beamformed visibilities.
- Parameters:
data (FormedBeam, FormedBeamHA, or RingMap) – Beamformed visibilities.
- Returns:
out – Container with a boolean mask where True indicates outlier beamformed visibilities.
- Return type:
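The nsigma criterion can be written down directly. In this sketch the function name and the example threshold are hypothetical, and zero-weight samples are assumed to be left unflagged because they carry no noise estimate.

```python
import numpy as np


def outlier_mask(vis, weight, nsigma=3.0):
    """Flag samples whose magnitude exceeds nsigma times sqrt(1 / weight)."""
    valid = weight > 0
    sigma = np.zeros_like(weight, dtype=float)
    # Expected noise standard deviation, defined only where the weight is non-zero.
    sigma[valid] = 1.0 / np.sqrt(weight[valid])
    return valid & (np.abs(vis) > nsigma * sigma)
```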
- class draco.analysis.flagging.MaskBadGains[source]
Bases:
SingleTask
Get a mask of regions with bad gain.
Assumes that bad gains are set to 1.
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- class draco.analysis.flagging.MaskBaselines[source]
Bases:
SingleTask
Mask out baselines from a dataset.
This task may produce output with shared datasets. Be warned that this can produce unexpected outputs if not properly taken into account.
- mask_long_ns
Mask out baselines longer than a given distance in the N/S direction.
- Type:
float, optional
- mask_short
Mask out baselines shorter than a given distance.
- Type:
float, optional
- mask_short_ew
Mask out baselines shorter than a given distance in the East-West direction. Useful for masking out intra-cylinder baselines for North-South oriented cylindrical telescopes.
- Type:
float, optional
- mask_short_ns
Mask out baselines shorter than a given distance in the North-South direction.
- Type:
float, optional
- missing_threshold
Mask any baseline that is missing more than this fraction of samples. This is measured relative to other baselines.
- Type:
float, optional
- zero_data
Zero the data in addition to modifying the noise weights (default is False).
- Type:
bool, optional
Which datasets should we share with the input. If “none” we create a full copy of the data, if “vis” we create a copy only of the modified weight dataset and the unmodified vis dataset is shared, if “all” we modify in place and return the input container.
- Type:
{“all”, “none”, “vis”}
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(ss)[source]
Apply the mask to data.
- Parameters:
ss (SiderealStream or TimeStream) – Data to mask. Applied in place.
- draco.analysis.flagging.MaskBeamformedOutliers
alias of
ApplyGenericMask
- class draco.analysis.flagging.MaskBeamformedWeights[source]
Bases:
SingleTask
Mask beamformed visibilities with anomalously large weights before stacking.
- nmed
Any weight that is more than nmed times the median weight over all objects and frequencies will be set to zero. Default is 8.0.
- Type:
float
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(data)[source]
Mask large weights.
- Parameters:
data (FormedBeam) – Beamformed visibilities.
- Returns:
data – The input container with the weight dataset set to zero if the weights exceed the threshold.
- Return type:
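A minimal sketch of the nmed threshold, using a made-up one-dimensional weight array. Whether the task excludes zero weights from the median is not stated here, so this sketch simply takes the median of all values.

```python
import numpy as np

# Hypothetical beamformed weights; one value is anomalously large.
weight = np.array([1.0, 1.2, 0.9, 1.1, 20.0])
nmed = 8.0

# Any weight above nmed times the median is set to zero.
threshold = nmed * np.median(weight)
masked_weight = np.where(weight > threshold, 0.0, weight)
```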
- draco.analysis.flagging.MaskData
alias of
MaskMModeData
- class draco.analysis.flagging.MaskFreq[source]
Bases:
SingleTask
Make a mask for certain frequencies.
- bad_freq_ind
A list containing frequencies to flag out. Each entry can be either an integer giving an individual frequency index to remove, or a 2-tuple giving the start and end indices of a range to flag (as with a standard slice, the end is not included).
- Type:
list, optional
- factorize
Find the smallest factorizable mask of the time-frequency axis that covers all samples already flagged in the data.
- Type:
bool, optional
- all_time
Only include frequencies where all time samples are present.
- Type:
bool, optional
- mask_missing_data
Mask time-freq samples where some baselines (for visibility data) or polarisations/elevations (for ring map data) are missing.
- Type:
bool, optional
- freq_frac
Fully mask any frequency where the fraction of unflagged samples is less than this value. Default is None.
- Type:
float, optional
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(data: VisContainer | RingMap) → RFIMask | SiderealRFIMask [source]
Make the mask.
- Parameters:
data – The data to mask.
- Returns:
Frequency mask container
- Return type:
mask_cont
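The bad_freq_ind format (single indices mixed with half-open ranges) can be expanded into a boolean frequency mask as sketched below. The helper name is hypothetical and this is not the task's own code.

```python
def freq_mask(nfreq, bad_freq_ind):
    """Expand a list of indices and (start, end) ranges into a boolean mask."""
    mask = [False] * nfreq
    for entry in bad_freq_ind:
        if isinstance(entry, int):
            # A single frequency index.
            mask[entry] = True
        else:
            # A (start, end) pair; like a slice, the end is excluded.
            start, end = entry
            for i in range(start, end):
                mask[i] = True
    return mask
```

For example, freq_mask(8, [1, (4, 6)]) flags indices 1, 4 and 5.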
- class draco.analysis.flagging.MaskMModeData[source]
Bases:
SingleTask
Mask out mmode data ahead of map making.
- auto_correlations
Exclude auto correlations if set (default=False).
- Type:
bool
- m_zero
Ignore the m=0 mode (default=False).
- Type:
bool
- positive_m
Include positive m-modes (default=True).
- Type:
bool
- negative_m
Include negative m-modes (default=True).
- Type:
bool
- mask_low_m
If set, mask out m’s lower than this threshold.
- Type:
int, optional
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(mmodes)[source]
Mask out unwanted data in the m-modes.
- Parameters:
mmodes (containers.MModes) – Mmode container to mask
- Returns:
mmodes – Same object as input with masking applied
- Return type:
- class draco.analysis.flagging.RFIMask[source]
Bases:
SingleTask
Crappy RFI masking.
- sigma
The false positive rate of the flagger given as sigma value assuming the non-RFI samples are Gaussian.
- Type:
float, optional
- tv_fraction
Fraction of bad samples in a digital TV channel that cause the whole channel to be flagged.
- Type:
float, optional
- stack_ind
Which stack to process to derive flags for the whole dataset.
- Type:
int
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(sstream: SiderealStream) → SiderealRFIMask [source]
- process(sstream: TimeStream) → RFIMask
Derive an RFI mask from the data.
- Parameters:
sstream – Unmasked sidereal or time stream visibility data.
- Returns:
The derived RFI mask.
- Return type:
mask
- class draco.analysis.flagging.RFIMaskChisqHighDelay[source]
Bases:
SingleTask
Mask frequencies and times with anomalous chi-squared test statistic.
- flag_ew
If the input container has an east-west baseline axis, then this flag will be applied to the weights before collapsing over that axis.
- Type:
array
- reg_arpls
Smoothness regularisation used when estimating the baseline for flagging bad frequencies. Default is 1e5.
- Type:
float
- nsigma_1d
Mask any frequency where the median over unmasked time samples deviates from the baseline by more than this number of median absolute deviations. Default is 5.0.
- Type:
float
- win_t
Size of the window (in number of time samples) used to compute a median filtered version of the test statistic.
- Type:
float
- win_f
Size of the window (in number of frequency channels) used to compute a median filtered version of the test statistic.
- Type:
float
- nsigma_2d
Mask any frequency and time where the absolute deviation from the median filtered version is greater than this number of expected standard deviations given the number of degrees of freedom (i.e., number of baselines).
- Type:
float
- estimate_var
Estimate the variance in the test statistic using the median absolute deviation over a region defined by the win_t and win_f parameters.
- Type:
bool
- only_positive
Only mask large positive excursions in the test statistic, leaving large negative excursions unmasked.
- Type:
bool
- separate_pol
If true, construct a mask for each pol separately. If false, sum the chi-squared values over all polarisations and construct a single mask.
- Type:
bool
- mask_type
Algorithm to use to generate the mask.
- Type:
{“mad”, “sumthreshold”}
- niter
Number of iterations. At each iteration the baseline and standard deviation are re-estimated using the mask from the previous iteration.
- Type:
int, optional
- rho
Reduce the threshold by this factor at each iteration. A value of 1 will keep the threshold constant for all iterations.
- Type:
float, optional
- max_m
Maximum size of the SumThreshold window to use.
- Type:
int, optional
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- mask_1d(y, m)[source]
Mask frequency channels where median chi-squared deviates from neighbors.
- Parameters:
y (np.ndarray[nfreq, ntime]) – Chi-squared per degree of freedom.
m (np.ndarray[nfreq, ntime]) – Boolean mask that indicates which samples to ignore when calculating the median over time.
- Returns:
mask – Boolean mask that indicates frequency channels where the median chi-squared over time deviates significantly from that of the neighboring channels.
- Return type:
np.ndarray[nfreq]
- mask_2d(y, w)[source]
Mask frequencies and times where the chi-squared deviates from local median.
- Parameters:
y (np.ndarray[nfreq, ntime]) – Chi-squared per degree of freedom.
w (np.ndarray[nfreq, ntime]) – Inverse variance of the chi-squared per degree of freedom, with zero indicating previously masked samples.
- Returns:
mask – Boolean mask that indicates frequencies and times where chi-squared deviates significantly from the local median.
- Return type:
np.ndarray[nfreq, ntime]
- mask_2d_sumthreshold(y, w)[source]
Iterative application of sumthreshold algorithm to mask large chi-squared.
- Parameters:
y (np.ndarray[nfreq, ntime]) – Chi-squared per degree of freedom.
w (np.ndarray[nfreq, ntime]) – Inverse variance of the chi-squared per degree of freedom, with zero indicating previously masked samples.
- Returns:
mask – Boolean mask that indicates frequencies and times where chi-squared deviates significantly from the local median.
- Return type:
np.ndarray[nfreq, ntime]
- process(stream)[source]
Generate a mask from the data.
- Parameters:
stream (dcontainers.TimeStream | dcontainers.SiderealStream | dcontainers.HybridVisStream | dcontainers.RingMap) – Container holding a chi-squared test statistic in the visibility dataset. A weighted average will be taken over any axis that is not time/ra or frequency.
- Returns:
mask – Time-frequency mask, where values marked True are flagged.
- Return type:
dcontainers.RFIMask | dcontainers.SiderealRFIMask | dcontainers.RFIMaskByPol | dcontainers.SiderealRFIMaskByPol
- class draco.analysis.flagging.RFISensitivityMask[source]
Bases:
SingleTask
Identify RFI as deviations in system sensitivity from expected radiometer noise.
- mask_type
One of ‘mad’, ‘sumthreshold’ or ‘combine’. Default is combine, which uses the sumthreshold everywhere except around the transits of the sun and bright point sources, where it applies the MAD mask to avoid masking out the transits.
- Type:
string, optional
- include_pol
The list of polarisations to include. Default is to use all polarisations.
- Type:
list of strings, optional
- nsigma_1d
Construct a static mask by identifying any frequency channel whose quantile over time deviates from the median over frequency by more than this number of median absolute deviations. Default: 5.0
- Type:
float, optional
- quantile_1d
The quantile to use along time to construct the static mask. Default: 0.15
- Type:
float, optional
- win_f_1d
Number of frequency channels used to calculate a rolling median and median absolute deviation for the static mask. Default: 191
- Type:
int, optional
- nsigma
The final threshold for the MAD, TV, and SumThreshold algorithms given as number of standard deviations. Default: 5.0
- Type:
float, optional
- niter
Number of iterations. At each iteration the baseline and standard deviation are re-estimated using the mask from the previous iteration. Default: 5
- Type:
int, optional
- rho
Reduce the threshold by this factor at each iteration. A value of 1 will keep the threshold constant for all iterations. Default: 1.5
- Type:
float, optional
- base_size
The size of the region used to estimate the baseline, provided as (number of frequency channels, number of time samples). Default: (37, 181)
- Type:
[int, int]
- mad_size
The size of the region used to estimate the standard deviation, provided as (number of frequency channels, number of time samples). Default: (101, 31)
- Type:
[int, int]
- tv_fraction
Fraction of bad samples in a digital TV channel that cause the whole channel to be flagged. Default: 0.5
- Type:
float, optional
- max_m
Maximum size of the SumThreshold window to use. Default: 64
- Type:
int, optional
- sir
Apply scale invariant rank (SIR) operator on top of final mask. Default: False
- Type:
bool, optional
- eta
Aggressiveness of the SIR operator. With eta=0, no additional samples are flagged and with eta=1, all samples will be flagged. Default: 0.2
- Type:
float, optional
- only_time
Only apply the SIR operator along the time axis. Default: False
- Type:
bool, optional
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(sensitivity)[source]
Derive an RFI mask from sensitivity data.
- Parameters:
sensitivity (containers.SystemSensitivity) – Sensitivity data.
- Returns:
rfimask – RFI mask derived from sensitivity.
- Return type:
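The interplay of nsigma, niter and rho can be read as a threshold schedule that starts high and is divided by rho at each iteration until it reaches nsigma on the last one. The helper below is a hypothetical sketch of that reading, not the task's actual implementation.

```python
def threshold_schedule(nsigma=5.0, niter=5, rho=1.5):
    """Thresholds per iteration, ending at the final value nsigma."""
    # Iteration i uses nsigma scaled up by rho for each remaining iteration.
    return [nsigma * rho ** (niter - 1 - i) for i in range(niter)]
```

With the documented defaults this gives 25.3125, 16.875, 11.25, 7.5 and finally 5.0, and with rho=1 the threshold stays at nsigma throughout.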
- class draco.analysis.flagging.RFIStokesIMask[source]
Bases:
ReduceVar
Two-stage RFI filter based on Stokes I visibilities.
Tries to independently target transient and persistent RFI.
Stage 1 is applied to each frequency independently. A high-pass filter is applied in RA to isolate transient RFI. The high-pass filtered visibilities are beamformed, and a MAD filter is applied to the resulting map. A time/RA sample is then flagged if some fraction of beams exceed the MAD threshold for that sample.
Stage 2 is applied across frequencies. A low-pass filter is applied in RA to reduce transient sky sources. The average visibility power is taken over 2+ cylinder separation baselines to obtain a single 1D array per frequency. These powers are gathered across all frequencies and a basic background subtraction is applied. The SumThreshold algorithm is then used for flagging, with a variance estimate used to boost the expected noise during the daytime and bright point source transits.
- mad_base_size
Median absolute deviations base window. Default is [1, 101].
- Type:
list of int, optional
- mad_dev_size
Median absolute deviation median deviation window. Default is [1, 51].
- Type:
list of int, optional
- sigma_high
Median absolute deviations sigma threshold. Default is 8.0.
- Type:
float, optional
- sigma_low
Median absolute deviations low sigma threshold. A value above this threshold is masked only if it is either larger than sigma_high or it is larger than sigma_low AND connected to a region larger than sigma_high. Default is 2.0.
- Type:
float, optional
- frac_samples
Fraction of flagged samples in map space above which the entire time sample will be flagged. Default is 0.01.
- Type:
float, optional
- max_m
Maximum size of the SumThreshold window. Default is 64.
- Type:
int, optional
- nsigma
Initial threshold for SumThreshold. Default is 5.0.
- Type:
float, optional
- solar_var_boost
Variance boost during solar transit. Default is 1e4.
- Type:
float, optional
- bg_win_size
The size of the window used to estimate the background sky, provided as (number of frequency channels, number of time samples). Default is [11, 3].
- Type:
list, optional
- var_win_size
The size of the window used when estimating the variance, provided as (number of frequency channels, number of time samples). Default is [3, 31].
- Type:
list, optional
- lowpass_cutoff
Angular cutoff of the ra lowpass filter. Default is 7.5, which corresponds to about 30 minutes of observation time.
- Type:
float, optional
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- static apply_filter(vis, weight, samples, fcut, type_='high')[source]
Apply a high-pass or low-pass mmode filter.
- process(stream)[source]
Make a mask from the data.
- Parameters:
stream (dcontainers.TimeStream | dcontainers.SiderealStream) – Data to use when masking. Axes should be frequency, stack, and time-like.
- Returns:
mask (dcontainers.RFIMask | dcontainers.SiderealRFIMask) – Time-frequency mask, where values marked True are flagged.
power (dcontainers.TimeStream | dcontainers.SiderealStream) – Time-frequency power metric used in second-stage flagging.
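The sigma_low / sigma_high rule is a form of hysteresis thresholding. The sketch below applies it to a one-dimensional sequence of deviations already expressed in units of sigma. The function name is hypothetical, and the task itself may define connectivity in more than one dimension.

```python
def hysteresis_mask(z, sigma_low=2.0, sigma_high=8.0):
    """Flag runs above sigma_low that contain at least one value above sigma_high."""
    n = len(z)
    mask = [False] * n
    i = 0
    while i < n:
        if z[i] > sigma_low:
            # Find the end of this contiguous run above the low threshold.
            j = i
            while j < n and z[j] > sigma_low:
                j += 1
            # Keep the run only if it touches the high threshold.
            if any(v > sigma_high for v in z[i:j]):
                for k in range(i, j):
                    mask[k] = True
            i = j
        else:
            i += 1
    return mask
```

A value between the two thresholds is therefore flagged only when it is connected to a stronger detection, which extends flags around bright events without flagging isolated weak excursions.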
- class draco.analysis.flagging.RadiometerWeight[source]
Bases:
SingleTask
Update vis_weight according to the radiometer equation.
\[\text{weight}_{ij} = \frac{N_\text{samp}}{V_{ii} V_{jj}}\]
- replace
Replace any existing weights (default). If False then we multiply the existing weights by the radiometer values.
- Type:
bool, optional
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(stream)[source]
Change the vis weight.
- Parameters:
stream (SiderealStream or TimeStream) – Data to be weighted. This is done in place.
- Returns:
stream
- Return type:
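The radiometer equation above can be evaluated for a single frequency and time sample as sketched here. The function and argument names are hypothetical: auto holds the real autocorrelation values V_ii per input, and pairs lists the input indices (i, j) of each baseline.

```python
def radiometer_weight(auto, pairs, nsamp):
    """Inverse-variance weight N_samp / (V_ii * V_jj) for each baseline (i, j)."""
    return [nsamp / (auto[i] * auto[j]) for i, j in pairs]
```

For example, with auto = [2.0, 4.0], nsamp = 8 and pairs [(0, 0), (0, 1), (1, 1)], the weights are 2.0, 1.0 and 0.5.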
- class draco.analysis.flagging.ReduceMaskEl[source]
Bases:
SingleTask
Reduce the ‘el’ axis from input classes and produce corresponding reduced output classes.
Reduction algorithm: If the number of True values in the mask along the el axis is higher than a given threshold, set the mask to True.
- threshold
The minimum number of detected RFI events along the el axis required for a data point to be included in the reduced mask. Default is 1.
- Type:
int
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(rfimask)[source]
Produce an RFI mask.
- Parameters:
rfimask (containers.LocalizedRFIMask(freq, el, time) or containers.SiderealLocalizedRFIMask(freq, ra, el)) – El-specific RFI mask indicating channels that are free from RFI events.
- Returns:
out – Non el-specific RFI mask indicating channels that are free from RFI events.
- Return type:
containers.RFIMask(freq, time) or containers.SiderealRFIMask(freq, ra)
- class draco.analysis.flagging.SanitizeWeights[source]
Bases:
SingleTask
Flags weights outside of a valid range.
Flags any weights above a max threshold and below a minimum threshold. Baseline dependent, so only some baselines may be flagged.
- max_thresh
Largest value to keep.
- Type:
float
- min_thresh
smallest value to keep
- Type:
float
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
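As a rough illustration of the range test, the following sketch builds a boolean flag from an array of weights. The function name is hypothetical, and what the task does with flagged samples afterwards is not covered here.

```python
import numpy as np


def flag_out_of_range(weight, min_thresh, max_thresh):
    """Return True where a weight lies outside [min_thresh, max_thresh]."""
    w = np.asarray(weight)
    return (w < min_thresh) | (w > max_thresh)
```

Because the comparison is elementwise, a weight array with a baseline axis naturally yields a baseline-dependent flag.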
- class draco.analysis.flagging.SiderealMaskConversion[source]
Bases:
SingleTask
Convert the axis of an RFI mask from time to ra.
The conversion is performed by mapping values between Unix time and LSA using the geographic location of the telescope, as provided by the Observer object.
- spread_size
The number of cells to flag before and after a detected true value. This ensures conservative flagging, preventing missed detections due to axis alignment issues. Default is 1.
- Type:
int
- npix
The number of pixels used to cover the full RA range from 0 to 360 degrees. Default is 4096.
- Type:
int
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
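The regridding step can be sketched as below, assuming the RA (in degrees) of every time sample has already been computed from the Unix times. The function name and the `ra_deg` argument are hypothetical; applying `spread_size` in RA pixels and wrapping at 360 degrees are also assumptions of this sketch, not confirmed behaviour of the task.

```python
import numpy as np


def time_mask_to_ra(mask_time, ra_deg, npix=4096, spread_size=1):
    """Map a mask with a trailing time axis onto an RA grid of npix cells.

    mask_time : (..., ntime) boolean mask
    ra_deg    : (ntime,) RA in degrees of each time sample (assumed input)
    """
    mask_time = np.asarray(mask_time, dtype=bool)
    ra = np.asarray(ra_deg, dtype=float) % 360.0
    pix = np.floor(ra / 360.0 * npix).astype(int) % npix

    out = np.zeros(mask_time.shape[:-1] + (npix,), dtype=bool)
    for t, p in enumerate(pix):
        # Flag the target pixel and its neighbours (wrapping in RA).
        for shift in range(-spread_size, spread_size + 1):
            out[..., (p + shift) % npix] |= mask_time[..., t]
    return out
```

Spreading each detection to neighbouring cells keeps the output conservative when the time and RA grids do not line up exactly.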
- process(rfimask)[source]
Produce an RFI mask.
- Parameters:
rfimask (containers.LocalizedRFIMask) – Container for holding a mask indicating channels that are free from RFI events. Its axes are freq, el, and time.
- Returns:
out – Boolean mask that can be applied to a ringmap with the task ApplyLocalizedRFIMask to mask contaminated samples. Its axes are freq, ra, and el.
- Return type:
- class draco.analysis.flagging.SmoothVisWeight[source]
Bases:
SingleTask
Smooth the visibility weights with a median filter.
This is done in-place.
- kernel_size
Size of the kernel for the median filter in time points. Default is 31, corresponding to a window of roughly 5 minutes for data with a 10 s cadence.
- Type:
int, optional
- mask_zeros
Mask out zero-weight entries when taking the moving weighted median.
- Type:
bool, optional
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(data: TimeStream) → TimeStream [source]
Smooth the weights with a median filter.
- Parameters:
data – Data containing the weights to be smoothed.
- Returns:
Data object containing the same data as the input, but with the weights substituted by the smoothed ones.
- Return type:
data
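A moving median over the time axis can be sketched with NumPy alone. This is not the task's implementation (which refers to a moving weighted median); the function name is hypothetical, the window is truncated at the edges, and fully masked windows are returned as zero, all of which are assumptions of this sketch.

```python
import warnings

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def smooth_weights(weight, kernel_size=31, mask_zeros=False):
    """Moving median along the last (time) axis.

    If mask_zeros is True, zero-weight entries are excluded from each window.
    """
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")

    w = np.asarray(weight, dtype=float)
    vals = np.where(w == 0.0, np.nan, w) if mask_zeros else w

    # Pad the time axis with NaN so edge windows simply use fewer samples.
    half = kernel_size // 2
    pad = [(0, 0)] * (w.ndim - 1) + [(half, half)]
    padded = np.pad(vals, pad, mode="constant", constant_values=np.nan)

    windows = sliding_window_view(padded, kernel_size, axis=-1)
    with warnings.catch_warnings():
        # Windows with no valid samples produce an all-NaN warning.
        warnings.simplefilter("ignore", category=RuntimeWarning)
        smoothed = np.nanmedian(windows, axis=-1)

    return np.nan_to_num(smoothed, nan=0.0)
```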
- class draco.analysis.flagging.ThresholdVisWeightBaseline[source]
Bases:
SingleTask
Form a mask corresponding to weights that are below some threshold.
The threshold is determined as maximum(absolute_threshold, relative_threshold * average(weight)) and is evaluated per product/stack entry. The user can specify whether to use a mean or median as the average, but note that the mean is much more likely to be biased by anomalously high- or low-weight samples (both of which are present in raw CHIME data). The user can also specify that weights below some threshold should not be considered when taking the average and constructing the mask (the default is to only ignore zero-weight samples).
The task outputs a BaselineMask or SiderealBaselineMask depending on the input container.
- Parameters:
average_type (string, optional) – Type of average to use (“median” or “mean”). Default: “median”.
absolute_threshold (float, optional) – Any weights with values less than this number will be set to zero. Default: 1e-7.
relative_threshold (float, optional) – Any weights with values less than this number times the average weight will be set to zero. Default: 1e-6.
ignore_absolute_threshold (float, optional) – Any weights with values less than this number will be ignored when taking averages and constructing the mask. Default: 0.0.
pols_to_flag (string, optional) – Which polarizations to flag. “copol” only flags XX and YY baselines, while “all” flags everything. Default: “all”.
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- process(stream) → BaselineMask | SiderealBaselineMask [source]
Construct baseline-dependent mask.
- Parameters:
stream (.core.container with weight attribute) – Input container whose weights are used to construct the mask.
- Returns:
out – The output baseline-dependent mask.
- Return type:
BaselineMask or SiderealBaselineMask
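The threshold rule maximum(absolute_threshold, relative_threshold * average(weight)) can be sketched for a two-dimensional weight array as follows. Several choices here are assumptions made for illustration: the function name, the (product, time) layout, averaging over the last axis only, treating weights at or below `ignore_absolute_threshold` as ignored, and leaving ignored samples unflagged.

```python
import warnings

import numpy as np


def baseline_threshold_mask(
    weight,
    average_type="median",
    absolute_threshold=1e-7,
    relative_threshold=1e-6,
    ignore_absolute_threshold=0.0,
):
    """Flag weights below a per-product threshold.

    weight : (nprod, ntime) array of weights (assumed layout)
    """
    w = np.asarray(weight, dtype=float)

    # ASSUMPTION: samples at or below this value are left out entirely.
    considered = w > ignore_absolute_threshold

    avg_fn = {"median": np.nanmedian, "mean": np.nanmean}[average_type]
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=RuntimeWarning)
        avg = avg_fn(np.where(considered, w, np.nan), axis=-1, keepdims=True)
    avg = np.nan_to_num(avg, nan=0.0)

    threshold = np.maximum(absolute_threshold, relative_threshold * avg)
    return considered & (w < threshold)
```

Using the median keeps the threshold stable when a few samples have anomalously high or low weights, which is the reason given above for preferring it over the mean.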
- class draco.analysis.flagging.ThresholdVisWeightFrequency[source]
Bases:
SingleTask
Create a mask to remove all weights below a per-frequency threshold.
A single relative threshold is set for each frequency along with an absolute minimum weight threshold. Masking is done relative to the mean baseline.
- Parameters:
absolute_threshold (float) – Any weights with values less than this number will be set to zero.
relative_threshold (float) – Any weights with values less than this number times the average weight will be set to zero.
Initialize pipeline task.
May be overridden with no arguments. Will be called after any config.Property attributes are set and after ‘input’ and ‘requires’ keys are set up.
- draco.analysis.flagging.complex_med(x, *args, **kwargs)[source]
Complex median, done by applying to the real/imag parts individually.
- Parameters:
x (np.ndarray) – Array to apply to.
*args (list, dict) – Passed straight through to np.nanmedian
**kwargs (list, dict) – Passed straight through to np.nanmedian
- Returns:
m – Median.
- Return type:
np.ndarray
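The idea of taking the median of the real and imaginary parts separately can be sketched as below. The function name is illustrative; extra arguments are forwarded to `np.nanmedian` as described above.

```python
import numpy as np


def complex_median(x, *args, **kwargs):
    """Median of a complex array, taken separately on real and imaginary parts."""
    x = np.asarray(x)
    real = np.nanmedian(x.real, *args, **kwargs)
    imag = np.nanmedian(x.imag, *args, **kwargs)
    return real + 1j * imag
```

Note that the result need not coincide with any single element of the input, since the two parts are ranked independently.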
- draco.analysis.flagging.destripe(x, w, axis=1)[source]
Subtract the median along a specified axis.
- Parameters:
x (np.ndarray) – Array to destripe.
w (np.ndarray) – Mask array for points to include (True) or ignore (False).
axis (int, optional) – Axis to apply destriping along.
- Returns:
y – Destriped array.
- Return type:
np.ndarray
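A minimal sketch of destriping for real-valued data follows. The function name is hypothetical, and restricting to real input is an assumption made to keep the example short.

```python
import numpy as np


def destripe_sketch(x, w, axis=1):
    """Subtract the median along `axis`, using only points where w is True."""
    x = np.asarray(x, dtype=float)
    included = np.asarray(w, dtype=bool)

    # Excluded points become NaN so that nanmedian skips them.
    masked = np.where(included, x, np.nan)
    med = np.nanmedian(masked, axis=axis, keepdims=True)
    return x - med
```

In this sketch a slice with no included points yields NaN for the whole slice.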
- draco.analysis.flagging.inverse_binom_cdf_prob(k, N, F)[source]
Calculate the trial probability that gives the CDF.
This gets the trial probability that gives an overall cumulative probability for Pr(X <= k; N, p) = F
- Parameters:
k (int) – Maximum number of successes.
N (int) – Total number of trials.
F (float) – The cumulative probability for (k, N).
- Returns:
p – The trial probability.
- Return type:
float
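One way to invert the binomial CDF for the trial probability is plain bisection, since Pr(X <= k; N, p) decreases monotonically in p for k < N. The sketch below uses only the standard library; the function names and the bisection approach are illustrative choices, not necessarily how the documented function is implemented.

```python
from math import comb


def binom_cdf(k, N, p):
    """Pr(X <= k) for X ~ Binomial(N, p)."""
    return sum(comb(N, i) * p**i * (1.0 - p) ** (N - i) for i in range(k + 1))


def solve_trial_prob(k, N, F, tol=1e-12):
    """Find p such that Pr(X <= k; N, p) = F, assuming 0 <= k < N and 0 < F < 1."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binom_cdf(k, N, mid) > F:
            lo = mid  # CDF still too large, so p must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For k = 0 and N = 1 the CDF is simply 1 - p, so the solution is p = 1 - F, which gives a quick sanity check.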
- draco.analysis.flagging.mad(x, mask, base_size=(11, 3), mad_size=(21, 21), debug=False, sigma=True)[source]
Calculate the MAD of freq-time data.
- Parameters:
x (np.ndarray) – Data to filter.
mask (np.ndarray) – Initial mask.
base_size (tuple) – Size of the window to use in (freq, time) when estimating the baseline.
mad_size (tuple) – Size of the window to use in (freq, time) when estimating the MAD.
debug (bool, optional) – If True, return deviation and mad arrays as well
sigma (bool, optional) – Rescale the output into units of Gaussian sigmas.
- Returns:
mad – Size of deviation at each point in MAD units. This output may contain NaNs for regions of missing data.
- Return type:
np.ndarray
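To show what "deviation in MAD units" and the `sigma` rescaling mean, here is a deliberately simplified sketch that uses a single global median instead of the moving windows set by `base_size` and `mad_size`. The function name is hypothetical and returning a signed deviation is an assumption. The factor 1.4826 is the standard conversion from MAD to the standard deviation of a Gaussian.

```python
import numpy as np


def mad_deviation_global(x, mask=None, sigma=True):
    """Deviation from the median, in units of the MAD (or Gaussian sigmas).

    mask follows the module convention: True marks samples to exclude.
    """
    x = np.asarray(x, dtype=float)
    if mask is None:
        mask = np.zeros(x.shape, dtype=bool)
    vals = np.where(np.asarray(mask, dtype=bool), np.nan, x)

    baseline = np.nanmedian(vals)
    dev = vals - baseline
    mad = np.nanmedian(np.abs(dev))

    if sigma:
        mad = 1.4826 * mad  # MAD -> Gaussian standard deviation
    return dev / mad
```

Masked samples come out as NaN, consistent with the note that the output may contain NaNs where data is missing.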
- draco.analysis.flagging.medfilt(x, mask, size, *args)[source]
Apply a moving median filter to masked data.
The filter is applied by iterative filling, which works around the lack of a true moving nanmedian implementation.
- Parameters:
x (np.ndarray) – Data to filter.
mask (np.ndarray) – Mask of data to filter out.
size (tuple) – Size of the window in each dimension.
args (optional) – Additional arguments to pass to the moving weighted median
- Returns:
y – The masked data. Data within the mask is undefined.
- Return type:
np.ndarray
- draco.analysis.flagging.p_to_sigma(p)[source]
Get the sigma exceeded by the tails of a Gaussian with probability p.
- draco.analysis.flagging.sigma_to_p(sigma)[source]
Get the probability of an excursion larger than sigma for a Gaussian.
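The relationship between the two conversions can be sketched with the standard library. This sketch assumes the two-sided convention, where p counts excursions in both tails; whether the documented functions use the one-sided or two-sided convention is not stated here, and the function names below are illustrative.

```python
from math import erfc, sqrt
from statistics import NormalDist


def sigma_to_p_two_sided(sigma):
    """Probability that |x| exceeds sigma for a unit Gaussian (two-sided)."""
    return erfc(sigma / sqrt(2.0))


def p_to_sigma_two_sided(p):
    """Inverse of sigma_to_p_two_sided: the sigma whose two tails hold probability p."""
    # Each tail holds p / 2; the lower-tail quantile is the negative of sigma.
    return -NormalDist().inv_cdf(0.5 * p)
```

Under this convention the two functions are inverses of each other, so converting a sigma to a probability and back returns the original value.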
- draco.analysis.flagging.tv_channels_flag(x, freq, sigma=5, f=0.5, debug=False)[source]
Perform a higher sensitivity flagging for the TV stations.
This flags a whole TV station band if more than a fraction f of the samples within that band exceed a given threshold. The threshold is chosen to give a fixed false positive rate (as described by sigma) for a fraction f of the samples exceeding it.
- Parameters:
x (np.ndarray[freq, time]) – Deviations of data in sigma units.
freq (np.ndarray[freq]) – Frequency of samples in MHz.
sigma (float, optional) – The probability of a false positive given as a sigma of a Gaussian.
f (float, optional) – Fraction of bad samples within each channel required before the whole channel is flagged.
debug (bool, optional) – Returns (mask, fraction) instead to give extra debugging info.
- Returns:
mask – Mask of the input data.
- Return type:
np.ndarray[bool]
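The band-level decision can be sketched as follows. This is an illustration under several assumptions: the band edges and the per-sample threshold are passed in explicitly (the hypothetical `band_edges` and `threshold` arguments), whereas the documented function derives the threshold from `sigma` and `f`; and the fraction is evaluated separately for each time sample over the frequency channels in a band.

```python
import numpy as np


def flag_bands(x, freq, band_edges, threshold, f=0.5):
    """Flag a whole band at a given time if more than fraction f of its
    frequency channels exceed `threshold`.

    x          : (nfreq, ntime) deviations in sigma units
    freq       : (nfreq,) frequencies in MHz
    band_edges : iterable of (low, high) band limits in MHz (assumed input)
    """
    x = np.asarray(x, dtype=float)
    freq = np.asarray(freq, dtype=float)
    mask = np.zeros(x.shape, dtype=bool)

    for low, high in band_edges:
        in_band = (freq >= low) & (freq < high)
        if not in_band.any():
            continue
        # Fraction of channels in this band above threshold, per time sample.
        fraction = np.mean(x[in_band] > threshold, axis=0)
        mask[np.ix_(in_band, fraction > f)] = True
    return mask
```

Flagging on the fraction of moderately high samples, instead of on individual outliers, is what gives the extra sensitivity to broad, faint emission across a station band.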