API Overview

Bash

blocksort.sh: Sort oligomap alignments based on their numerical names.
get_lines_w_pattern.sh: Retrieve all lines that match the pattern (--pattern) within the requested column (--column).
trim_id_fasta.sh: Trim the sequence ID of a FASTA file to the first white-space.
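
As an illustration of what trim_id_fasta.sh is described as doing, here is a minimal Python sketch that cuts each FASTA header at its first white-space character. The function name `trim_fasta_ids` and the sample records are hypothetical and not taken from the script itself.

```python
import io
import re


def trim_fasta_ids(lines):
    """Yield FASTA lines with each header cut at its first white-space."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Keep only the text before the first white-space character.
            yield ">" + re.split(r"\s", line[1:], maxsplit=1)[0]
        else:
            # Sequence lines pass through unchanged.
            yield line


# Hypothetical input: one header with spaces, one with a tab.
example = io.StringIO(">seq1 Homo sapiens chr1\nACGT\n>seq2\tdescription\nTTGA\n")
trimmed = list(trim_fasta_ids(example))
print("\n".join(trimmed))
```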

Perl

map_chromosomes.pl: Map/rename chromosome identifiers in a delimited text file using a tab-delimited mapping table.
sam_remove_duplicates_inferior_alignments_multimappers.pl: Removes duplicate records, then all QNAME duplicates except for the one(s) with the shortest edit distance. Optionally, multimappers (alignments of queries with the same edit distance, but different coordinates) are discarded.
sam_trx_to_sam_gen.pl: Re-maps a SAM file resulting from aligning a library of sequencing reads against a transcriptome to genomic coordinates.
sam_uncollapse.pl: Reverses the collapsing of reads with identical sequences as done with fastx_collapser (FASTX Toolkit) or similar.
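
To make the idea behind sam_uncollapse.pl concrete, the following Python sketch restores collapsed reads in a SAM stream. It assumes the query names follow the fastx_collapser layout `<id>-<count>`. The name `uncollapse_sam` and the `<id>_<copy>` naming of the restored records are hypothetical, and the Perl script may name its output differently.

```python
def uncollapse_sam(lines):
    """Yield each alignment once per collapsed read copy."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("@"):
            # Header lines pass through unchanged.
            yield line
            continue
        fields = line.split("\t")
        # Assumed QNAME layout: "<id>-<count>", e.g. "7-3" for 3 identical reads.
        read_id, _, count = fields[0].rpartition("-")
        for copy in range(1, int(count) + 1):
            # Hypothetical naming scheme for the restored copies.
            yield "\t".join([f"{read_id}_{copy}"] + fields[1:])


# Hypothetical input: one header line and one alignment collapsed from 3 reads.
sam = [
    "@HD\tVN:1.6",
    "7-3\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\t*",
]
restored = list(uncollapse_sam(sam))
```

In this sketch a query name without a `-<count>` suffix raises a `ValueError`, which makes malformed input visible instead of silently passing it through.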

Python

Modules

annotate_sam_with_intersecting_features.py: Annotate SAM alignments with their intersecting feature(s).
filter_multimappers.py: Filter miRNA reads mapped to multiple locations by indel count.
mirna_extension.py: Extend miRNA start and end coordinates and ensure name uniqueness.
mirna_quantification.py: Quantify miRNAs and corresponding isomiRs.
nh_filter.py: Filter alignments in a SAM file by NH tag.
oligomap_output_to_sam_nh_filtered.py: Transform oligomap output FASTA file to SAM keeping the best alignments.
primir_quantification.py: Tabulate bedtools intersect -wo -s output file.
validate_bedtools_intersect.py: Validation utilities for bedtools intersect -wo -s output.
validation_fasta.py: Filter FASTA files.
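
As a rough illustration of the kind of filtering nh_filter.py describes, the sketch below keeps SAM alignments whose `NH` tag does not exceed a threshold. It parses plain text rather than using a SAM library. The function name `filter_by_nh`, the `max_nh` parameter, and the choice to drop alignments that lack an `NH` tag are assumptions made for this example only.

```python
def filter_by_nh(lines, max_nh):
    """Yield header lines and alignments whose NH value is at most max_nh."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("@"):
            # Header lines pass through unchanged.
            yield line
            continue
        nh = None
        # Optional tags start at the 12th tab-separated column.
        for tag in line.split("\t")[11:]:
            if tag.startswith("NH:i:"):
                nh = int(tag[len("NH:i:"):])
                break
        # Assumption of this sketch: alignments without an NH tag are dropped.
        if nh is not None and nh <= max_nh:
            yield line


# Hypothetical records: NH of 1, NH of 7, and no NH tag at all.
records = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t10\t255\t4M\t*\t0\t0\tACGT\t*\tNH:i:1",
    "r2\t0\tchr1\t50\t255\t4M\t*\t0\t0\tACGT\t*\tNM:i:0\tNH:i:7",
    "r3\t0\tchr1\t90\t255\t4M\t*\t0\t0\tACGT\t*",
]
kept = list(filter_by_nh(records, max_nh=5))
```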

Classes

py.AnnotationException: A custom exception class for MirnaExtension class.
py.MirnaExtension: Class for updating miRNA annotated coordinates and names.
py.Fields: Class to store an alignment in its different SAM fields.
py.FileFormatError: Raised when a line or field does not match the expected format.
py.Record: A single validated bedtools intersect -wo -s record.
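
For orientation, a class that stores an alignment in its separate SAM fields could look roughly like the sketch below, which follows the eleven mandatory columns of the SAM format. The names `SamFields` and `parse_sam_line` are hypothetical and do not reproduce the `Fields` class listed above. The sketch also raises a plain `ValueError` rather than the package's own exceptions.

```python
from dataclasses import dataclass


@dataclass
class SamFields:
    """The eleven mandatory SAM columns plus any optional tags."""

    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str
    rnext: str
    pnext: int
    tlen: int
    seq: str
    qual: str
    tags: tuple = ()


def parse_sam_line(line):
    """Split one alignment line into typed SAM fields."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 11:
        raise ValueError(f"expected at least 11 SAM columns, got {len(cols)}")
    return SamFields(
        cols[0], int(cols[1]), cols[2], int(cols[3]), int(cols[4]),
        cols[5], cols[6], int(cols[7]), int(cols[8]), cols[9], cols[10],
        tuple(cols[11:]),
    )


# Hypothetical alignment line with a single optional tag.
aln = parse_sam_line("read1\t16\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1")
```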

Functions

py.get_tags: Construct a custom tag for an alignment based on intersecting features.
py.main: Annotate alignments in a SAM file with intersecting feature tags.
py.parse_arguments: Command-line arguments parser.
py.parse_intersect_output: Parse bedtools intersect -wo -s output file.
py.count_indels: Count the number of indels in an alignment based on its CIGAR string.
py.find_best_alignments: Find the alignments with the most indels.
py.main: Filter multimappers by indel count.
py.parse_arguments: Command-line arguments parser.
py.write_output: Write the output to the standard output (STDOUT).
py.MirnaExtension.__init__: Initialize class.
py.MirnaExtension.adjust_names: Adjust miRNA attributes for uniqueness and consistency.
py.MirnaExtension.process_precursor: Extend miRNA start and end coordinates and ensure name uniqueness.
py.MirnaExtension.set_db: Load GFF3 file into gffutils.FeatureDB.
py.MirnaExtension.set_seq_lengths: Set the reference sequence lengths.
py.MirnaExtension.update_db: Update miRNA annotations in the local database.
py.MirnaExtension.write_gff: Write features to a GFF3 file.
py.main: Extend miRNA start/end coordinates.
py.parse_arguments: Parse command-line arguments.
py.collapsed_contribution: Get the contribution of the alignment to the overall count.
py.collapsed_nh_contribution: Get the contribution of the alignment to the overall count.
py.contribution: Get the contribution of the alignment to the overall count.
py.get_name: Get the final name for the species.
py.main: Quantify miRNAs and corresponding isomiRs.
py.nh_contribution: Get the contribution of the alignment to the overall count.
py.parse_arguments: Command-line arguments parser.
py.write_output: Write to the output the correct miRNA type.
py.main: Filter alignments by their NH tag value.
py.parse_arguments: Parse command-line arguments.
py.eval_aln: Evaluate an alignment to add, discard or write it to the STDOUT.
py.get_cigar_md: Get the CIGAR and MD strings.
py.get_sam_fields: Create the read's alignment in SAM format.
py.main: Convert the alignments in the oligomap output file to SAM format.
py.parse_arguments: Command-line arguments parser.
py.get_contribution: Return the contribution of a single alignment.
py.get_initial_data: Get the feature name and its extension.
py.main: Tabulate bedtools intersect -wo -s output file.
py.parse_arguments: Command-line arguments parser.
py.Record.__init__: Initialize a validated record.
py.Record.cross_checks: Validate relationships across fields.
py.parse_all: Stream a file, yielding (line_number, Record) for each data line.
py.validate_first_n: Validate the first n lines of a bedtools intersect -wo -s output file.
py.compile_trim_pattern: Get a compiled regex pattern to trim at a character's first occurrence.
py.main: Filter and process a FASTA file.
py.open_fasta: Open a FASTA or FASTA.GZ file for text-mode reading.
py.parse_and_validate_arguments: Parse and validate command-line arguments.
py.trim_id: Trim a FASTA ID at the first occurrence of any character in _pattern.
py.write_id_file: Write the final sequence IDs, one per line.
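
Counting indels from a CIGAR string, as count_indels is described as doing, can be sketched with a regular expression over the CIGAR operations. The version below sums the lengths of the insertion (`I`) and deletion (`D`) operations. Whether the listed function counts bases or operations is not stated here, so treat that choice, and the name `count_indels_from_cigar`, as assumptions.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM format.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def count_indels_from_cigar(cigar):
    """Return the total number of inserted and deleted bases in a CIGAR string."""
    return sum(
        int(length)
        for length, op in CIGAR_OP.findall(cigar)
        if op in ("I", "D")  # assumption: bases are summed, not operations
    )


# Hypothetical CIGAR strings: no indel, one insertion, a deletion plus an insertion.
for cigar in ("22M", "10M1I11M", "5M2D3M1I4M"):
    print(cigar, count_indels_from_cigar(cigar))
```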

R

ascii_alignment_pileup.R: Generates an ASCII-style pileup of read alignments in one or more BAM files against one or more regions specified in a BED file.
gtf_exons_bed.1.1.2.R: Converts the exon entries of a GTF file to a BED file with one line per exon.
merge_tables.R: Merge miRNA quantification tables.