ZARP-cli¶

Welcome to the ZARP-cli documentation pages!

ZARP-cli is a simple command-line interface for the ZARP workflow for RNA-Seq analysis.

Sounds boring? Well, ZARP-cli doesn't just trigger ZARP, but rather supercharges it by providing the following features:

Automatically download samples from the Sequence Read Archive (SRA)

Automatically download genome annotations with genomepy

Automatically infer metadata with HTSinfer (experimental!)

Manage ZARP run data and resources in one central, configurable location

Once ZARP-cli is installed and configured, you may be able to ZARP an RNA-Seq library with a command like this:

zarp SRA1234567

Does it get easier than that?

Where to go from here?

Use the menu on the left or the search bar in the page header to navigate through the documentation.

How does it work?¶

"Any sufficiently advanced technology is indistinguishable from magic."
— Arthur C. Clarke

At the risk of demystifying the magic, let's take a look at how ZARP-cli works:

Briefly, when the program is triggered, a ZARP-cli configuration object is constructed by parsing the default configuration settings and the command-line options. A user-specified list of sample references, which may be of any of the supported types, is then attached to the configuration object and de-referenced to construct a (potentially) sparse data frame of sample metadata. If necessary, this data frame is then completed step by step by applying sample processor plugins that are built on tools such as genomepy and HTSinfer.
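The first two steps can be pictured with a short Python sketch. It is an illustration only, not ZARP-cli's actual code: the function names `build_config` and `dereference`, the configuration keys, the column names and the rules for telling reference types apart are all assumptions made for this example, and a plain list of dictionaries stands in for the data frame.

```python
# Hypothetical defaults; the real ZARP-cli settings and key names differ.
DEFAULTS = {"working_directory": "~/.zarp", "cores": 1}

# Hypothetical columns of the sample table; None marks missing metadata.
COLUMNS = ("sample", "remote_id", "path", "organism", "orientation")


def build_config(defaults: dict, cli_options: dict) -> dict:
    """Merge defaults with command-line options; options that were set win."""
    config = dict(defaults)
    config.update({k: v for k, v in cli_options.items() if v is not None})
    return config


def dereference(ref: str) -> dict:
    """Turn one sample reference into a (possibly sparse) table row."""
    row = dict.fromkeys(COLUMNS)  # every column starts out as None
    if ref.endswith((".fastq", ".fastq.gz")):
        # Illustrative rule: treat the reference as a local file path.
        row["path"] = ref
        row["sample"] = ref.rsplit("/", 1)[-1].split(".")[0]
    else:
        # Illustrative rule: treat anything else as a remote identifier.
        row["remote_id"] = ref
        row["sample"] = ref
    return row


# An option left unset on the command line (None) keeps its default.
config = build_config(DEFAULTS, {"cores": 4, "working_directory": None})
config["sample_references"] = ["SRA1234567", "data/lib1.fastq.gz"]
samples = [dereference(ref) for ref in config["sample_references"]]
```

After this step, each row holds only what the reference itself reveals: the remote sample has an identifier but no file path, the local sample has a path but no identifier, and neither has an organism or read orientation yet.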

For example, if only a remote sample identifier is provided for a given sample, the sample will first be fetched from the remote database via a custom Snakemake workflow based on the SRA Toolkit. Via another custom Snakemake workflow, HTSinfer will then try to infer required metadata, such as the source organism and the read orientation, from the sample itself. If successful, genomepy then uses the source organism information to fetch the corresponding genome and gene annotations and amends the sample data frame with this information. At this point, if any metadata is still missing, defaults from the user configuration are applied, or dummy data is appended where that is possible and sensible. If enough information is available at the end of this process to start a ZARP run, the sample will be analyzed.
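The successive completion described above can be sketched as a chain of small functions, each filling in only what is still missing. This is a minimal sketch under stated assumptions, not the real plugin interface: all function names, column names, paths and returned values are hypothetical, and the three processors merely stand in for the SRA Toolkit workflow, HTSinfer and genomepy without calling any of them.

```python
# Hypothetical set of columns that must be filled before a run can start.
REQUIRED = ("path", "organism", "orientation", "genome")


def fetch_remote(row: dict) -> dict:
    """Stand-in for the SRA Toolkit-based download workflow."""
    if row.get("path") is None and row.get("remote_id"):
        row["path"] = f"downloads/{row['remote_id']}.fastq.gz"  # made-up path
    return row


def infer_metadata(row: dict) -> dict:
    """Stand-in for HTSinfer; the 'inferred' organism is hard-coded."""
    if row.get("path") and row.get("organism") is None:
        row["organism"] = "Homo sapiens"  # placeholder, not a real inference
    return row


def fetch_genome(row: dict) -> dict:
    """Stand-in for genomepy; it can only act once the organism is known."""
    if row.get("organism") and row.get("genome") is None:
        row["genome"] = f"genomes/{row['organism'].replace(' ', '_')}.fa"
    return row


PROCESSORS = (fetch_remote, infer_metadata, fetch_genome)


def complete(row: dict, user_defaults: dict) -> dict:
    """Apply each processor in turn, then fall back to user defaults."""
    for processor in PROCESSORS:
        row = processor(row)
    for key, value in user_defaults.items():
        if row.get(key) is None:
            row[key] = value
    return row


def is_runnable(row: dict) -> bool:
    """A run can start only if no required column is still missing."""
    return all(row.get(column) is not None for column in REQUIRED)


remote_only = complete({"remote_id": "SRA1234567"}, {"orientation": "forward"})
no_source = complete({"sample": "orphan"}, {"orientation": "forward"})
```

In this sketch, `remote_only` ends up runnable because each processor unlocks the next one, with the orientation coming from the user defaults. `no_source` does not, because without a file path or a remote identifier none of the processors can act. The order of the processors matters for the same reason.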

How to cite¶

If you use ZARP in your work (with or without ZARP-cli), please kindly cite the following article:

ZARP: A user-friendly and versatile RNA-seq analysis workflow
Maria Katsantoni, Foivos Gypas, Christina J. Herrmann, Dominik Burri, Maciej Bak, Paula Iborra, Krish Agarwal, Meric Ataman, Máté Balajti, Noè Pozzan, Niels Schlusser, Youngbin Moon, Aleksei Mironov, Anastasiya Börsch, Mihaela Zavolan, Alexander Kanitz
F1000Research 2024, 13:533
https://doi.org/10.12688/f1000research.149237.1

Info materials¶

Posters¶

Reach out¶

There are several ways to get in touch with us:

For ZARP usage questions, please use the ZARP Q&A forum (requires GitHub registration).
For feature suggestions and bug reports, please use either the ZARP-cli or the ZARP issue tracker (both require GitHub registration).
For any other requests, please reach out to us via email.

Contributing¶

We always welcome and duly acknowledge open source contributors, whether for ZARP-cli, ZARP or any of our other projects. Simply follow our onboarding instructions and please mind our Code of Conduct. If you have any questions, do not hesitate to shoot us an email.
