ZARP-cli

Welcome to the ZARP-cli documentation pages!

ZARP-cli is a simple command-line interface for the ZARP workflow for RNA-Seq analysis.

Sounds boring? Well, ZARP-cli doesn't just trigger ZARP, but rather supercharges it by providing the following features:

Automatically download samples from the Sequence Read Archive (SRA)

Automatically download genome annotations with genomepy

Automatically infer metadata with HTSinfer (experimental!)

Manage ZARP run data and resources in one central, configurable location

Once ZARP-cli is installed and configured, you may be able to ZARP an RNA-Seq library with a command like this:

zarp SRA1234567

Does it get easier than that? 🤓

Where to go from here?

Use the menu on the left or the search bar in the page header to navigate through the documentation.

How does it work?

Briefly, when a ZARP-cli run is triggered, a ZARP-cli configuration object is constructed by parsing the default configuration settings and the command-line options. A user-specified list of sample references, which may be of any of the supported types, is then attached to the configuration object and dereferenced to construct a (potentially) sparse data frame of sample metadata. If necessary, this data frame is then completed step by step by applying sample processor plugins that are built on tools such as genomepy and HTSinfer.
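The first two steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not ZARP-cli's actual code: the names `DEFAULTS`, `build_config`, `classify_reference` and `dereference`, the settings keys, and the rules for recognizing reference types are all hypothetical, and plain dictionaries stand in for the data frame.

```python
import re
from pathlib import Path

# Hypothetical default settings; the real ZARP-cli keys may differ.
DEFAULTS = {"working_directory": "~/.zarp", "cores": 1, "dry_run": False}


def build_config(defaults, cli_options):
    """Overlay explicitly set command-line options on top of the defaults."""
    config = dict(defaults)
    # Options left unset on the command line (None) do not override defaults.
    config.update({k: v for k, v in cli_options.items() if v is not None})
    return config


def classify_reference(ref):
    """Guess the type of a sample reference (illustrative rules only)."""
    if re.fullmatch(r"[SED]R[A-Z]\d+", ref):
        return "remote_identifier"
    if Path(ref).suffix in {".tsv", ".csv"}:
        return "sample_table"
    return "local_file"


def dereference(refs):
    """Turn references into sparse sample records; unknown fields stay None."""
    return [
        {
            "reference": ref,
            "type": classify_reference(ref),
            "organism": None,
            "orientation": None,
        }
        for ref in refs
    ]


config = build_config(DEFAULTS, {"cores": 4, "dry_run": None})
samples = dereference(["SRA1234567", "samples.tsv", "reads.fastq.gz"])
```

The records start out sparse on purpose: only what the reference itself reveals is filled in, and everything else is left for later processing steps.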

For example, if only a remote sample identifier is provided for a given sample, the sample will first be fetched from the remote database via a custom Snakemake workflow based on the SRA Toolkit. Via another custom Snakemake workflow, HTSinfer will then try to infer required metadata, such as the source organism and the read orientation, from the sample itself. If successful, genomepy then uses the source organism information to fetch the corresponding genome and gene annotations and amends the sample data frame with this information. At this point, if any metadata is still missing, defaults from the user configuration are applied or dummy data is appended, where possible and sensible. If enough information is available at the end of this process to start a ZARP run, the sample will be analyzed.
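The example above amounts to a chain of processors, each filling in only what is still missing. The sketch below mimics that flow with stand-in functions. It does not call the SRA Toolkit, HTSinfer or genomepy, and every function name, field name and value in it (including the paths, the inferred organism and the `"SF"` orientation code) is a made-up placeholder rather than ZARP-cli's real interface.

```python
REQUIRED = ("path", "organism", "orientation", "genome")


def fetch_remote(sample):
    """Stand-in for the SRA Toolkit-based download workflow."""
    if sample.get("path") is None and sample.get("identifier"):
        sample["path"] = f"downloads/{sample['identifier']}.fastq.gz"
    return sample


def infer_metadata(sample):
    """Stand-in for HTSinfer; pretends only the organism could be inferred."""
    if sample.get("path") and sample.get("organism") is None:
        sample["organism"] = "homo_sapiens"  # made-up inference result
    return sample


def fetch_genome(sample):
    """Stand-in for genomepy; needs the organism to locate resources."""
    if sample.get("organism") and sample.get("genome") is None:
        sample["genome"] = f"genomes/{sample['organism']}.fa"
    return sample


def apply_defaults(sample, defaults):
    """Fill any fields that are still missing from the user defaults."""
    for key, value in defaults.items():
        if sample.get(key) is None:
            sample[key] = value
    return sample


def complete(sample, defaults):
    """Run all processors in order and report whether the sample is ready."""
    sample = dict(sample)  # work on a copy, leave the input untouched
    for processor in (fetch_remote, infer_metadata, fetch_genome):
        sample = processor(sample)
    sample = apply_defaults(sample, defaults)
    ready = all(sample.get(key) is not None for key in REQUIRED)
    return sample, ready


sample, ready = complete({"identifier": "SRA1234567"}, {"orientation": "SF"})
incomplete, not_ready = complete({"identifier": "SRA1234567"}, {})
```

In the first call the missing orientation is supplied by the defaults, so the sample is ready for analysis. In the second call nothing fills that gap, so the sample is not started.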

How to cite

If you use ZARP in your work (with or without ZARP-cli), please kindly cite the following article:

ZARP: An automated workflow for processing of RNA-seq data
Maria Katsantoni, Foivos Gypas, Christina J. Herrmann, Dominik Burri, Maciej Bak, Paula Iborra, Krish Agarwal, Meric Ataman, Anastasiya Börsch, Mihaela Zavolan, Alexander Kanitz
bioRxiv 2021.11.18.469017
https://doi.org/10.1101/2021.11.18.469017

Training materials

Coming soon...

Info materials

Poster

ZARP-cli poster

Reach out

There are several ways to get in touch with us:

Contributors welcome!

Open source contributors are always welcome, whether for ZARP, ZARP-cli or any of the other Zavolab projects. Simply reach out by email to schedule an onboarding call.

Acknowledgements

Zavolab

Biozentrum, University of Basel

Swiss Institute of Bioinformatics