Commands

phantombuster

Bioinformatics tool to remove phantoms from barcode-based NGS data.

The tool consists of four stages, represented by the four main commands:

  1. demultiplex - demultiplex into samples and error correct barcodes with known sequences

  2. error-correct - error correct barcodes with random sequences

  3. hopping-removal - remove barcode combinations that likely originate from index hopping

  4. threshold - remove barcode combinations with low read count

phantombuster [OPTIONS] COMMAND [ARGS]...
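As a rough illustration of how the four stages chain together, the following sketch builds one command line per stage using only the options documented on this page. All file names (regex.csv, barcodes.csv, input_files.csv, thresholds.csv) and the helper name pipeline_commands are placeholders chosen for this example. The sketch only prints the commands; it does not run them, and it does not start the worker processes that demultiplex and error-correct require.

```python
import shlex
from typing import List


def pipeline_commands(outdir: str) -> List[List[str]]:
    """Build argv lists for the four stages, in order.

    File names are hypothetical placeholders, not defaults of the tool.
    """
    base = ["phantombuster"]
    return [
        # Stage 1: needs the regex file, the barcode file and the input CSV.
        base + ["demultiplex", "--outdir", outdir,
                "--regex-file", "regex.csv",
                "--barcode-file", "barcodes.csv",
                "input_files.csv"],
        # Stage 2: error correction of random barcodes.
        base + ["error-correct", "--outdir", outdir,
                "--barcode-file", "barcodes.csv"],
        # Stage 3: index hopping removal (optional arguments omitted).
        base + ["hopping-removal", "--outdir", outdir],
        # Stage 4: read count thresholding.
        base + ["threshold", "--outdir", outdir,
                "--threshold-file", "thresholds.csv"],
    ]


if __name__ == "__main__":
    for argv in pipeline_commands("results"):
        print(shlex.join(argv))
```

Every stage is given the same --outdir in this sketch, on the assumption that each stage reads the results of the previous one from that directory.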

Options

--version

Show the version and exit.

--verbose

Enable verbose debugging

-o, --outputlog <outputlog>

Output file for logs

demultiplex

Demultiplex BAM/FASTA files into parquet files

INPUT_FILE is a path to a CSV file that lists all input files

Requires additional worker processes; see ‘phantombuster worker’.

phantombuster demultiplex [OPTIONS] INPUT_FILE

Options

--outdir <outdir>

Required. Directory to save all results and temp files

--regex-file <regex_file>

Required. CSV file that specifies, via regular expressions, where barcodes are located

--barcode-file <barcode_file>

Required. CSV file that specifies all barcodes and their type

Arguments

INPUT_FILE

Required argument

error-correct

Correct single-nucleotide sequencing errors in random barcode sequences

Requires additional worker processes; see ‘phantombuster worker’.

phantombuster error-correct [OPTIONS]

Options

--outdir <outdir>

Required. Directory to save all results and temp files

--error-threshold <error_threshold>

Maximal Hamming distance to consider two barcode sequences related

--barcode-file <barcode_file>

Required. CSV file that specifies all barcodes and their type

--remove-ambigious, --keep-ambigious

Remove combinations with ambiguous characters (NRYSWKMBDHV) after error correction
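To make the two criteria above concrete, here is a minimal sketch of a Hamming distance check and an ambiguous-character check. The function names are chosen for this example and are not part of phantombuster; the sketch assumes that "related" means a distance less than or equal to the error threshold, and it says nothing about how the tool actually groups or merges related sequences.

```python
# IUPAC ambiguity codes, as listed in the option description above.
AMBIGUOUS = set("NRYSWKMBDHV")


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def is_related(a: str, b: str, error_threshold: int) -> bool:
    """Assumption: two barcodes are related if their distance is at most the threshold."""
    return hamming_distance(a, b) <= error_threshold


def has_ambiguous(seq: str) -> bool:
    """True if the sequence contains any ambiguity code."""
    return any(base in AMBIGUOUS for base in seq.upper())
```

For example, is_related("ACGT", "ACGA", 1) is true because the sequences differ at one position, and has_ambiguous("ACNT") is true because of the N.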

hopping-removal

Remove phantom combinations originating from index hopping

The read count of each barcode combination is compared to the expected read count under index hopping.

phantombuster hopping-removal [OPTIONS] [HOPPING_BARCODES]...

Options

--outdir <outdir>

Required. Directory to save all results and temp files

--threshold <threshold>

p-value threshold. Lower is more strict.

Arguments

HOPPING_BARCODES

Optional argument(s)
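The idea of comparing an observed read count with an expected one can be sketched as follows. This is not phantombuster's statistical model, which is not described on this page. As a stand-in, the sketch assumes combinations of exactly two barcodes (a sample index and one other barcode), takes the expected count from random pairing of the two barcodes, and uses a Poisson upper-tail probability as the p-value. All names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Tuple

Combo = Tuple[str, str]  # (sample index, barcode): an assumed two-barcode layout


def expected_counts(counts: Dict[Combo, int]) -> Dict[Combo, float]:
    """Expected reads per combination if the two barcodes paired at random."""
    total = sum(counts.values())
    by_sample: Counter = Counter()
    by_barcode: Counter = Counter()
    for (sample, barcode), n in counts.items():
        by_sample[sample] += n
        by_barcode[barcode] += n
    return {(s, b): by_sample[s] * by_barcode[b] / total for (s, b) in counts}


def poisson_upper_tail(observed: int, mean: float) -> float:
    """P(X >= observed) for X ~ Poisson(mean); adequate for small counts only."""
    term = math.exp(-mean)  # P(X = 0)
    below = 0.0
    for i in range(observed):
        below += term            # accumulate P(X = i)
        term *= mean / (i + 1)   # advance to P(X = i + 1)
    return max(0.0, 1.0 - below)


def keep_combinations(counts: Dict[Combo, int], threshold: float) -> Dict[Combo, int]:
    """Keep combinations whose count is unlikely under the random-pairing model."""
    expected = expected_counts(counts)
    return {c: n for c, n in counts.items()
            if poisson_upper_tail(n, expected[c]) < threshold}
```

Under this stand-in model a lower threshold keeps fewer combinations, which matches the "lower is more strict" behaviour described for --threshold.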

merge

Merge multiple prefixes under one prefix

phantombuster merge [OPTIONS] [PREFIXES]...

Options

--outdir <outdir>

Required

--prefix <prefix>

--barcode-hierarchy-file <barcode_hierarchy_file>

Required

Arguments

PREFIXES

Optional argument(s)

threshold

Remove combinations with a read count below a threshold

phantombuster threshold [OPTIONS]

Options

--outdir <outdir>

Required. Directory to save all results and temp files

--threshold-file <threshold_file>

Required. CSV file that specifies the read threshold under which combinations are removed
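The filtering rule itself is simple and can be sketched as below. The sketch assumes a single minimum read count applied to an in-memory mapping from barcode combination to read count; the layout of the threshold file and the way the tool stores its data are not covered here, and the name apply_threshold is hypothetical.

```python
from typing import Dict, Tuple

Combo = Tuple[str, ...]  # one entry per barcode in the combination


def apply_threshold(counts: Dict[Combo, int], min_reads: int) -> Dict[Combo, int]:
    """Keep only combinations whose read count is at least min_reads."""
    return {combo: n for combo, n in counts.items() if n >= min_reads}
```

With min_reads set to 3, a combination with 10 reads is kept and one with 2 reads is removed; a combination with exactly 3 reads is kept, since only counts below the threshold are removed.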

to-csv

Convert a parquet file to a CSV file.

phantombuster to-csv [OPTIONS] PARQUETFILE [OUTFILE]

Arguments

PARQUETFILE

Required argument

OUTFILE

Optional argument

to-parquet

Convert a CSV file to a parquet file.

phantombuster to-parquet [OPTIONS] CSVFILE [OUTFILE]

Arguments

CSVFILE

Required argument

OUTFILE

Optional argument

worker

Start a worker process

The worker process uses IPC to connect to the server. It needs to be run on the same node and in the same working directory as the server.

phantombuster worker [OPTIONS]

Options

--outdir <outdir>

Required

--name <name>