Commands
phantombuster
Bioinformatics tool to remove phantoms from barcode-based NGS data.
The tool consists of four stages, represented by the four main commands:
demultiplex - demultiplex into samples and error correct barcodes with known sequences
error-correct - error correct barcodes with random sequences
hopping-removal - remove barcode combinations that likely originate from index hopping
threshold - remove barcode combinations with low read count
phantombuster [OPTIONS] COMMAND [ARGS]...
Options
- --version
Show the version and exit.
- --verbose
Enable verbose debugging
- -o, --outputlog <outputlog>
Output file for logs
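The four main stages run in the order listed above. As a minimal sketch, the Python helper below assembles the corresponding command lines using only the options documented on this page. The helper `pipeline_commands` and all file and directory names (`results`, `input_files.csv`, and so on) are hypothetical placeholders, not part of phantombuster.

```python
import shlex


def pipeline_commands(outdir, input_file, regex_file, barcode_file, threshold_file):
    """Return one argv list per main stage, in pipeline order (hypothetical helper)."""
    return [
        ["phantombuster", "demultiplex", "--outdir", outdir,
         "--regex-file", regex_file, "--barcode-file", barcode_file, input_file],
        ["phantombuster", "error-correct", "--outdir", outdir,
         "--barcode-file", barcode_file],
        ["phantombuster", "hopping-removal", "--outdir", outdir],
        ["phantombuster", "threshold", "--outdir", outdir,
         "--threshold-file", threshold_file],
    ]


# Placeholder paths; substitute your own files.
commands = pipeline_commands(
    "results", "input_files.csv", "regexes.csv", "barcodes.csv", "thresholds.csv"
)
for argv in commands:
    # Print each stage as a shell-quoted command line.
    print(shlex.join(argv))
```

Passing each list to `subprocess.run(argv, check=True)` would execute the stages one after another. Stages that require worker processes additionally need `phantombuster worker` to be running.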
demultiplex
Demultiplex BAM/FASTA files into parquet files
INPUT_FILE is the path to a CSV file that lists all input files.
Requires additional worker processes; see ‘phantombuster worker’.
phantombuster demultiplex [OPTIONS] INPUT_FILE
Options
- --outdir <outdir>
Required. Directory to save all results and temp files
- --regex-file <regex_file>
Required. CSV file which specifies via regular expressions where barcodes are located
- --barcode-file <barcode_file>
Required. CSV file that specifies all barcodes and their type
Arguments
- INPUT_FILE
Required argument
error-correct
Correct errors in random barcode sequences that originate from single-nucleotide sequencing errors
Requires additional worker processes; see ‘phantombuster worker’.
phantombuster error-correct [OPTIONS]
Options
- --outdir <outdir>
Required. Directory to save all results and temp files
- --error-threshold <error_threshold>
Maximum Hamming distance at which two barcode sequences are considered related
- --barcode-file <barcode_file>
Required. CSV file that specifies all barcodes and their type
- --remove-ambigious, --keep-ambigious
Remove combinations with ambiguous characters (NRYSWKMBDHV) after error correction
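To make the two criteria above concrete, here is a small Python illustration of a Hamming-distance check and an ambiguous-character check. It only illustrates the concepts; the function names are hypothetical and this is not phantombuster's implementation.

```python
# Ambiguity codes listed in the option description above.
AMBIGUOUS = set("NRYSWKMBDHV")


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def is_related(a, b, error_threshold):
    """True if the sequences differ in at most `error_threshold` positions."""
    return hamming_distance(a, b) <= error_threshold


def has_ambiguous(seq):
    """True if the sequence contains any ambiguity code."""
    return any(ch in AMBIGUOUS for ch in seq.upper())


print(hamming_distance("ACGTAC", "ACGTTC"))  # the sequences differ at one position
print(is_related("ACGTAC", "ACGTTC", 1))
print(has_ambiguous("ACGNAC"))
```

With an error threshold of 1, `ACGTAC` and `ACGTTC` count as related because they differ at a single position, while `ACGNAC` would be flagged as ambiguous because it contains `N`.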
hopping-removal
Remove phantom combinations originating from index hopping
The read count of each barcode combination is compared to the expected read count under index hopping.
phantombuster hopping-removal [OPTIONS] [HOPPING_BARCODES]...
Options
- --outdir <outdir>
Required. Directory to save all results and temp files
- --threshold <threshold>
P-value threshold; lower values are stricter.
Arguments
- HOPPING_BARCODES
Optional argument(s)
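The comparison of an observed read count against the count expected under index hopping can be pictured as a one-sided significance test. The sketch below assumes a Poisson model for hopped reads, which is purely an illustrative assumption; this page does not state which statistical model phantombuster uses, and the function names are hypothetical.

```python
import math


def poisson_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected). The Poisson model is an assumption."""
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)  # P(X = 0)
    cdf = 0.0
    for i in range(observed):
        cdf += term                 # add P(X = i)
        term *= expected / (i + 1)  # advance to P(X = i + 1)
    return max(0.0, 1.0 - cdf)


def is_kept(observed, expected_under_hopping, threshold):
    """Keep a combination only if hopping alone is unlikely to explain its read count."""
    return poisson_tail(observed, expected_under_hopping) < threshold


print(is_kept(50, 2.0, 0.05))  # far more reads than hopping would explain
print(is_kept(2, 2.0, 0.05))   # read count consistent with hopping
```

Under this reading, a lower threshold demands stronger evidence before a combination is kept, which matches the "lower is stricter" behaviour described for `--threshold`.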
merge
Merge multiple prefixes under one prefix
phantombuster merge [OPTIONS] [PREFIXES]...
Options
- --outdir <outdir>
Required
- --prefix <prefix>
- --barcode-hierarchy-file <barcode_hierarchy_file>
Required
Arguments
- PREFIXES
Optional argument(s)
threshold
Remove combinations with a read count below a threshold
phantombuster threshold [OPTIONS]
Options
- --outdir <outdir>
Required. Directory to save all results and temp files
- --threshold-file <threshold_file>
Required. CSV file that specifies the read threshold under which combinations are removed
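The effect of this stage can be sketched in a few lines of Python. The in-memory dictionary and the function name are hypothetical stand-ins for illustration; the sketch does not reproduce phantombuster's parquet-based data or the layout of the threshold file.

```python
def apply_threshold(read_counts, threshold):
    """Keep only combinations whose read count is at least `threshold`."""
    return {combo: n for combo, n in read_counts.items() if n >= threshold}


# Made-up example data: (sample, barcode) -> read count.
counts = {
    ("sampleA", "ACGT"): 120,
    ("sampleA", "TTGA"): 3,
    ("sampleB", "ACGT"): 10,
}
print(apply_threshold(counts, 10))
```

Combinations with a read count below the threshold are dropped, while a count equal to the threshold is kept.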
to-csv
Convert a parquet file to a CSV file.
phantombuster to-csv [OPTIONS] PARQUETFILE [OUTFILE]
Arguments
- PARQUETFILE
Required argument
- OUTFILE
Optional argument
to-parquet
Convert a CSV file to a parquet file.
phantombuster to-parquet [OPTIONS] CSVFILE [OUTFILE]
Arguments
- CSVFILE
Required argument
- OUTFILE
Optional argument
worker
Start a worker process
The worker process uses IPC to connect to the server. It needs to be run on the same node and in the same working directory as the server.
phantombuster worker [OPTIONS]
Options
- --outdir <outdir>
Required
- --name <name>
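Because workers must run on the same node and in the same working directory as the server, one way to start several of them is from a small Python launcher. This is a sketch under assumptions: the helper names are hypothetical, the `worker0`, `worker1`, ... values passed to `--name` are made up, and reusing the server's `--outdir` for the workers is assumed rather than documented here.

```python
import os
import subprocess


def worker_command(outdir, name=None):
    """Build the argv for one worker, using only the options documented above."""
    argv = ["phantombuster", "worker", "--outdir", outdir]
    if name is not None:
        argv += ["--name", name]
    return argv


def start_workers(outdir, count, workdir=None):
    """Start `count` workers on this node in the given working directory."""
    workdir = workdir or os.getcwd()  # default: the launcher's own working directory
    return [
        subprocess.Popen(worker_command(outdir, f"worker{i}"), cwd=workdir)
        for i in range(count)
    ]


print(worker_command("results", "worker0"))
```

Calling `start_workers("results", 4)` from the directory in which the server was started would launch four workers there; each returned `Popen` handle can later be waited on or terminated.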