Tutorial
This tutorial is going to go through the process of generating BAO constraints from the Pathfinder data. Just kidding! We’re actually just going to generate some simulated data and then turn it into maps.
Setting up the Pipeline
Before you start, make sure you have access to the CHIME Bitbucket organisation, and have set up your ssh keys for access to the Bitbucket repositories from the machine you want to run the pipeline on. Unless you are working on SciNet, you’ll also want to ensure you have an account on niedermayer and that your ssh keys are set up to allow password-less login, so that the database connection can be established.
There are a few software prerequisites to ensure you have installed. Obviously python is one of them, with numpy and scipy installed, but you also need to have virtualenv, allowing us to install the pipeline and its dependencies without messing up the base python installation. To check you have it installed, try running:
$ virtualenv --help
If you get an error, virtualenv isn’t installed properly and you’ll need to fix that before continuing.
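If it turns out to be missing, it can usually be fetched with pip (this assumes pip is available on your system; otherwise use your system package manager):
$ pip install virtualenv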
With that all sorted, we’re ready to start. First step, download the pipeline repository to wherever you want it installed:
$ git clone git@github.com:radiocosmology/draco.git
Then change into that directory, and run the script mkvenv.sh:
$ cd draco
$ ./mkvenv.sh
The script will do three things. First, it will create a python virtual environment to isolate the CHIME pipeline installation. Second, it will fetch the python prerequisites for the pipeline and install them into the virtualenv. Finally, it will install draco itself into the new virtualenv. Look carefully through the output messages for errors to make sure it completed successfully. You’ll need to activate the environment whenever you want to use the pipeline. To do that, simply run:
$ source <path to pipeline>/venv/bin/activate
You can check that it’s installed correctly by firing up python, and attempting to import some of the packages. For example:
>>> from drift.core import telescope
>>> print(telescope.__file__)
/Users/richard/code/draco/venv/src/driftscan/drift/core/telescope.pyc
>>> from draco import containers
>>> print(containers.__file__)
/Users/richard/code/draco/draco/containers.pyc
External Products
If you are here, you’ve got the pipeline successfully installed. Congratulations.
There are a few data products we’ll need to run the pipeline that must be generated externally. Fortunately, installing the pipeline has already set up all the tools we need to do this.
We’ll start with the beam transfer matrices, which describe how the sky gets mapped into our measured visibilities. These are used both for simulating observations given a sky map, and for making maps from visibilities (real or simulated). To generate them we use the driftscan package, telling it exactly what to generate with a YAML configuration file such as the one below.
config:
  # Only generate Beam Transfers.
  beamtransfers: Yes
  kltransform: No
  psfisher: No

  output_directory: beams

telescope:
  type:
    # Specify a custom class
    class: PolarisedCylinderTelescope
    module: drift.telescope.cylinder

  freq_lower: 400.0
  freq_upper: 410.0
  num_freq: 5

  num_cylinders: 2
  num_feeds: 4
  feed_spacing: 0.3
  cylinder_width: 10.0
This file is run with the command:
$ drift-makeproducts run product_params.yaml
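Once that completes, a quick way to sanity-check the products is to load them back in python. This is a sketch assuming the ProductManager interface in driftscan and the configuration file above:

>>> from drift.core import manager
>>> # Point the manager at the same config used by drift-makeproducts
>>> m = manager.ProductManager.from_config("product_params.yaml")
>>> m.telescope.nfreq
5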
To simulate the timestreams we also need a sky map to base them on. The cora package contains several different sky models we can use to produce one. The easiest method is to use the cora-makesky command, e.g.:
$ cora-makesky foreground 64 401.0 411.0 5 foreground_map.h5
which will generate an HDF5 file containing simulated foreground maps in each polarisation (Stokes I, Q, U and V) with five frequency channels between 401.0 and 411.0 MHz. Each map is in Healpix format with NSIDE=64. There are options to produce 21cm signal simulations as well as point-source-only and galactic synchrotron maps.
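If you want to check what was produced, the file can be inspected with h5py. A quick sketch (the "map" dataset name and its (freq, pol, pixel) axis ordering are assumptions about the cora output; use h5ls to check your file if in doubt):

>>> import h5py
>>> f = h5py.File("foreground_map.h5", "r")
>>> f["map"].shape   # assumed (freq, pol, pixel); 49152 = 12 * 64**2 pixels
(5, 4, 49152)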
Map-making with the Pipeline
The CHIME pipeline is built using the infrastructure developed by Kiyo in the caput.pipeline module. Python classes are written to perform tasks on the data, and a YAML configuration file describes how these should be configured and connected together. Below I’ve put the configuration file we are going to use to make maps from simulated data:
pipeline:
  tasks:
    - type: draco.core.io.LoadBeamTransfer
      out: tel_and_bt
      params:
        product_directory: "beams/bt/"

    - type: draco.synthesis.stream.SimulateSidereal
      requires: tel_and_bt
      out: sstream
      params:
        maps: [ "foreground_map.h5" ]
        save: Yes
        output_root: teststream_

    - type: draco.analysis.transform.MModeTransform
      in: sstream
      out: mmodes

    - type: draco.analysis.mapmaker.DirtyMapMaker
      requires: tel_and_bt
      in: mmodes
      out: dirtymap
      params:
        nside: 128
        save: Yes
        output_root: map_dirty2_

    - type: draco.analysis.mapmaker.WienerMapMaker
      requires: tel_and_bt
      in: mmodes
      out: wienermap
      params:
        nside: 128
        save: Yes
        output_root: map_wiener2_
Before we jump into making the maps, let’s briefly go over what this all means. For further details you can consult the caput documentation on the pipeline.

The bulk of this configuration file is a list of tasks being configured. There is a type field where the class is specified by its fully qualified python name (for example, the first task draco.core.io.LoadBeamTransfer). To connect one task to another, you simply specify a label for the output (out) of one task, and give the same label to the in or requires of the other task. The labels themselves are dummy variables; any string will do, provided it does not clash with the name of another label. The distinction between in and requires is that the first is for an input which is passed on every cycle of the pipeline, and the second is for something required only at initialisation of the task.
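To make the distinction concrete, here is a minimal sketch of a hypothetical task (the class and its behaviour are illustrative, not part of draco) showing where each connection arrives, assuming the SingleTask interface used by the tasks above:

from caput import config
from draco.core import task


class ScaleVisibilities(task.SingleTask):
    """Hypothetical task: scale incoming data by a constant factor."""

    # Configurable from the `params` section of the YAML file.
    factor = config.Property(proptype=float, default=1.0)

    def setup(self, bt):
        # Anything labelled under `requires` arrives here, once, when
        # the task is initialised.
        self.beamtransfer = bt

    def process(self, stream):
        # Anything labelled under `in` arrives here on every cycle of
        # the pipeline; the return value is passed on under `out`.
        stream.vis[:] *= self.factor
        return stream

In the YAML file this task would then be wired up with requires, in and out labels exactly like the tasks in the configuration above.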
Often we might want to configure a task from the YAML file itself. This is done with the params section of each task. The named items within this section are passed to the pipeline class when it is created. Each entry corresponds to a config.Property attribute on the class. For example, the SimulateSidereal class has parameters that can be specified:
class SimulateSidereal(task.SingleTask):
    """Create a simulated timestream.

    Attributes
    ----------
    maps : list
        List of map filenames. The sum of these form the simulated sky.
    ndays : float, optional
        Number of days of observation. Setting `ndays = None` uses the
        default stored in the telescope object; `ndays = 0` (the default)
        assumes the observation time is infinite so that the noise is
        zero. This allows a fractional number to account for higher noise.
    seed : integer, optional
        Set the random seed used for the noise simulations. Default (None)
        is to choose a random seed.
    """

    maps = config.Property(proptype=list)
    ndays = config.Property(proptype=float, default=0.0)
    seed = config.Property(proptype=int, default=None)

    ...
In the YAML file we configured the task as follows:
- type: draco.synthesis.stream.SimulateSidereal
  requires: tel_and_bt
  out: sstream
  params:
    maps: [ "foreground_map.h5" ]
    save: Yes
    output_root: teststream_
Of the three properties available in the definition of SimulateSidereal we have only configured one of them, the list of maps to process. The remaining two entries of the params section are inherited from the pipeline base task. These simply tell the pipeline to save the output of the task, with a base name given by output_root.
The pipeline is run with the caput-pipeline script:
$ caput-pipeline run pipeline_params.yaml
What has it actually done? Let’s just quickly go through the tasks in order:

1. Load the beam transfer manager from disk. This just gives the pipeline access to all the beam transfer matrices produced by the driftscan code.
2. Load a map from disk, and use the beam transfers to transform it into a sidereal timestream.
3. Select the products from the timestream that are understood by the given beam transfer manager. In this case it won’t change anything, but this task can subset frequencies and products as well as average over redundant baselines.
4. Perform the m-mode transform on the sidereal timestream.
5. Apply the map maker to the m-modes to produce a dirty map.
6. Apply the map maker to generate a Wiener-filtered map.
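Once the run completes, you can take a quick look at the maps it wrote. Below is a rough sketch using h5py and healpy; it assumes the output container stores its data in a "map" dataset with (freq, pol, pixel) axes, and you’ll need to substitute the actual file name written by the WienerMapMaker task (it will start with the map_wiener2_ prefix):

>>> import h5py
>>> import healpy as hp
>>> f = h5py.File("map_wiener2_<tag>.h5", "r")  # substitute the real file name
>>> m = f["map"][:]       # assumed axes: (freq, pol, pixel)
>>> hp.mollview(m[0, 0])  # Stokes I at the first frequency channel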
Ninja Techniques
Running on a cluster. Coming soon….