Starting a pipeline

Note

These instructions are outdated and only valid for prefactor 3.2 or older. Please check the recent instructions page.

Once you have the data and the parsets ready, you can run the pipeline using the genericpipeline script, e.g.:

$ genericpipeline.py -d -c pipeline.cfg My_prefactor_calibrator.parset

Note

The -d option is recommended: it does make the log-files extremely large (many megabytes), but without it, often the important information as to why a pipeline run fails is not included.

While the pipeline runs, in the specified runtime_directory (see previous section) new files are generated in a directory named after the parset name (e.g., if you are running My_prefactor_calibrator.parset a directory named My_prefactor_calibrator will appear in your runtime_directory):

$ ls My_prefactor_calibrator/
logs mapfiles parsets statefile

The logs dir contains all the logs of the pipeline runs, identified by the date and time of execution, e.g.:

$ ls My_prefactor_calibrator/logs/2016-06-30T15:07:21/pipeline.log

These contain all the output from the processes the pipeline called and diagnostic information about the pipeline. So they are useful to follow the status of the process and possibly identify reasons why a process crashed.

While running the pipeline writes a statefile in the runtime_directory, with all the step which were successfully executed. If the pipeline stops for whatever reason, you can re-run the same command and it will skip all the steps that are already done and only work on those which are still missing.

The intermediate data files of the pipeline are written in the working_directory specified in the pipeline.cfg.

Stopping and restarting the pipeline

You can stop a pipeline run anytime by terminating the genericpipeline process (typically by pressing CRTL-C in the terminal where you started it). Sometimes some of the processes that the pipeline started don’t get properly terminated, so if the genericpipeline process doesn’t terminate you should look for its child processes and terminate them too.

Note

If you stop and re-start pipelines a number of time then you should also check occasionally if there are orphaned children that are eating up resources on you computer.

As mentioned earlier, you can re-start the pipeline by running the same command with which you started it.

Pipeline crashes

It can happen that the pipeline stops with a message like this:

ERROR   genericpipeline: LOFAR Pipeline finished unsuccesfully.
WARNING genericpipeline: recipe genericpipeline completed with errors

You need to read the log of that run to identify the reason why it stopped, e.g.:

> less My_prefactor_calibrator/logs/2016-06-30T15:07:21/pipeline.log

It is usually best to first check at the end of the file for what ended the pipeline and then search from the beginning of the file for error or diagnostic messages that tell you what exactly went wrong. See Getting help for tips on interpreting the error messages.

If you identify the problem and it does not affect the products that have been already produced, you can launch the pipeline again, after correcting the issue causing the process to stop.

Rerunning parts of the pipeline

You can fully rerun a pipeline by deleting the runtime and working directories and restarting the pipeline.

To rerun parts of the pipeline that were (allegedly) already executed successfully, you need to modify the statefile of the pipeline. To do this there is a statefile_manipulation.py script as part of prefactor:

python prefactor/bin/statefile_manipulation.py My_Workdir/My_calibrator_job/statefile

If you then run the pipeline again, it will start at the step that you removed with the statefile manipulation tool.