
script sam_uncollapse.pl

Reverses the collapsing of reads with identical sequences as performed by fastx_collapser (FASTX Toolkit) or similar tools.

Reads and writes files in SAM format. Each alignment line is printed n times, where n is the number appended to the read/query name (QNAME) after a dash (-).

CAUTION

Only marginal validation of the input file type/format is performed!

Usage

perl sam_uncollapse.pl [OPTIONS] --in [FILE|SAM] --out [FILE|SAM]

Arguments

  • --in [FILE|SAM] (required): Path to the input SAM file.
  • --out [FILE|SAM] (required): Path to the output SAM file.

Options

  • --suffix: Add a serial number suffix to each QNAME during uncollapsing (separated by a ".") so that multimappers can be distinguished by QNAME.
  • -h | --help: Show this information and die.
  • -u | --usage: Show this information and die.
  • --quiet: Shut up!
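
The arguments and options above map naturally onto Getopt::Long, which the script lists as a requirement. The following is only a minimal sketch of how such a command line could be parsed; the helper name parse_args, the hash keys, and the error messages are assumptions for illustration and are not taken from the script itself.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper (not part of sam_uncollapse.pl): parse an argument
# list into a hash reference of settings.
sub parse_args {
    my @args = @_;
    my %opt  = (suffix => 0, quiet => 0, help => 0, usage => 0);

    GetOptionsFromArray(
        \@args,
        'in=s'    => \$opt{in},       # required: input SAM file
        'out=s'   => \$opt{out},      # required: output SAM file
        'suffix'  => \$opt{suffix},   # switch: add serial number suffix
        'quiet'   => \$opt{quiet},    # switch: suppress messages
        'h|help'  => \$opt{help},
        'u|usage' => \$opt{usage},
    ) or die "Invalid options\n";

    # Assumption: --in and --out are only enforced when no help is requested.
    unless ($opt{help} || $opt{usage}) {
        defined $opt{$_} or die "Missing required argument --$_\n" for qw(in out);
    }
    return \%opt;
}

# Example call with hypothetical file names.
my $opt = parse_args(qw(--in collapsed.sam --out uncollapsed.sam --suffix));
print "$opt->{in} -> $opt->{out} (suffix: $opt->{suffix})\n";
```

Parsing from an array rather than directly from @ARGV keeps the sketch easy to try out with different argument lists.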

Requirements

  • Perl version: >= 5.40.2
  • Modules:
    • Getopt::Long: >= 2.58

subroutine usage

Returns usage information for the current script.

Accepts

N/A

Returns

String with usage information

Type

Specialized


subroutine sam_uncollapse

For each line of a SAM file, parses the identifier QNAME for the presence of a number n appended to its end via a dash ('-') and re-writes the line n times.

Header lines are reproduced as they are.

Accepts

  1. Input file [FILE|SAM]
  2. Output file [FILE|SAM]
  3. Suffix switch: 0 = Do not add serial number suffix, 1 = Add serial number suffix

Type

Generic
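
The per-line behaviour described for sam_uncollapse can be sketched as follows. This is an illustration under stated assumptions, not the script's actual code: the helper name uncollapse_line is hypothetical, lines whose QNAME lacks a "-n" ending are assumed to be passed through once, and the serial number is assumed to be appended to the unchanged QNAME (giving, for example, "1-3.1").

```perl
use strict;
use warnings;

# Hypothetical helper: expand one SAM line into the list of lines to write.
#   $line       - a single line of a SAM file
#   $add_suffix - 0 = do not add serial number suffix, 1 = add it
sub uncollapse_line {
    my ($line, $add_suffix) = @_;
    chomp $line;

    # Header lines (starting with '@') and empty lines are reproduced as they are.
    return ($line) if $line eq '' || $line =~ /^@/;

    # Keep trailing empty fields by passing a negative limit to split.
    my @fields = split /\t/, $line, -1;
    my $qname  = $fields[0];

    # Assumption: a QNAME without a trailing "-<n>" is written once, unchanged.
    my ($n) = $qname =~ /-(\d+)$/ or return ($line);

    my @copies;
    for my $i (1 .. $n) {
        # Assumption: the serial number follows the original QNAME after a ".".
        $fields[0] = $add_suffix ? "$qname.$i" : $qname;
        push @copies, join("\t", @fields);
    }
    return @copies;
}

# Made-up example input: one header line and one collapsed alignment.
my @sam = (
    "\@HD\tVN:1.6\tSO:unsorted",
    "1-3\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\t*",
);
print "$_\n" for map { uncollapse_line($_, 1) } @sam;
```

With the suffix switch set to 1, the alignment above is written three times with the QNAMEs 1-3.1, 1-3.2 and 1-3.3; with the switch set to 0, the same line is repeated three times unchanged. In this sketch a count of 0 produces no output lines.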