PLAST – command line use

Using PLAST on the command-line

PLAST is a standalone application that runs on the command line on macOS, Linux or Windows. You can read more about the PLAST algorithm in the BMC Bioinformatics paper.

As a general-purpose sequence comparison tool, PLAST provides five comparison methods to match a query against a reference (subject) databank: plastp, plastn, plastx, tplastx and tplastn.
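
As a rough guide to choosing among these methods, the sketch below maps the sequence types of the query and the subject to a program name. It assumes that the PLAST program names follow the same conventions as their BLAST counterparts (for example, plastx comparing a translated nucleotide query against a protein databank). The function name pick_program and the "prot"/"nucl" labels are made up for this illustration and are not part of PLAST.

```shell
#!/usr/bin/env bash
# Hypothetical helper: choose a PLAST program from the sequence types.
# Assumption: names mirror the BLAST conventions (blastp, blastn, blastx, ...).
# Usage: pick_program <query_type> <subject_type> [translated]
pick_program() {
  local query="$1" subject="$2" translated="${3:-}"
  case "$query:$subject" in
    prot:prot) echo plastp ;;    # protein query, protein subject
    nucl:prot) echo plastx ;;    # translated nucleotide query, protein subject
    prot:nucl) echo tplastn ;;   # protein query, translated nucleotide subject
    nucl:nucl)
      # Nucleotide on both sides: compare directly, or translate both.
      if [ "$translated" = "translated" ]; then
        echo tplastx
      else
        echo plastn
      fi
      ;;
    *)
      echo "unknown sequence types: $query $subject" >&2
      return 1
      ;;
  esac
}

pick_program nucl prot   # prints plastx
```

The printed name can then be passed to the -p argument.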

The supported file formats are FASTA (query and subject files) and the BLAST database format (subject only).

Usage examples

Basic case

The PLAST software comes with a set of FASTA files that illustrate how to use the tool. The most basic command looks like:

./Plast -i ../../db/query.fa -d ../../db/tursiops.fa -p plastp -o output

Usage with progress bar

./Plast -i ../../db/query.fa -d ../../db/tursiops.fa -p plastp -o output -bargraph

Usage with e-value threshold

./Plast -i ../../db/query.fa -d ../../db/tursiops.fa -p plastp -o output -e 1e-2
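
These options can be combined in a single call. The following sketch builds the command in a Bash array and adds a processor count (-a) and a cap on hits per query (-max-hit-per-query). The values chosen here (4 processors, 10 hits) and the PLAST_BIN variable are arbitrary assumptions for illustration. If no executable is found at that path, the script only prints the command it would run.

```shell
#!/usr/bin/env bash
# Sketch: combine several PLAST options in one call.
# PLAST_BIN and the option values are illustrative assumptions.
PLAST_BIN="${PLAST_BIN:-./Plast}"

cmd=(
  "$PLAST_BIN"
  -i ../../db/query.fa        # query FASTA file
  -d ../../db/tursiops.fa     # subject FASTA file
  -p plastp                   # comparison method
  -o output                   # report file
  -e 1e-2                     # e-value threshold
  -a 4                        # number of processors to use
  -max-hit-per-query 10       # keep at most 10 hits per query
  -bargraph                   # display a progress bar
)

if [ -x "$PLAST_BIN" ]; then
  "${cmd[@]}"
else
  # Dry run: show the command instead of executing it.
  echo "would run: ${cmd[*]}"
fi
```

Keeping the arguments in an array makes it easy to add or comment out one option at a time without quoting problems.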

Supported arguments

PLAST supports the following arguments:

  • -p: Program Name [plastp, tplastn, plastx, tplastx or plastn]
  • -d: Subject database file (fasta file or *.pal file for BLAST DB files)
  • -i: Query database file (fasta file)
  • -o: PLAST report Output File (if not provided, the output is written to a
    file called stdout in the current directory)
  • -e: Expectation value (real number, scientific notation is supported)
  • -n: Size of neighbourhood for performing ungapped extension
  • -s: Ungapped threshold that triggers a small gapped extension
  • -g: Threshold for small gapped extension
  • -b: Bandwidth for small gapped extension
  • -a: Number of processors to use
  • -G: Cost to open a gap
  • -E: Cost to extend a gap
  • -xdrop-ungap: X dropoff value for Ungapped alignment (in bits) (zero
    invokes default behavior 20 bits)
  • -X: X dropoff value for gapped alignment (in bits) (zero invokes default
    behavior)
  • -Z: X dropoff value for final gapped alignment in bits (0.0 invokes default
    behavior)
  • -index-threshold: Index threshold to calculate the similarity between
    neighbour
  • -F: Filter query sequence
  • -M: Score matrix (BLOSUM62 or BLOSUM50)
  • -strand: strands for plastn: ‘plus’, ‘minus’ or ‘both’ (default)
  • -r: reward for a nucleotide match (plastn)
  • -q: penalty for a nucleotide mismatch (plastn)
  • -force-query-order: Force queries ordering in output file.
  • -max-database-size: Maximum allowed size (in bytes) for a database. If
    greater, database is segmented.
  • -max-hit-per-query: Maximum hits per query. 0 value will dump all hits
    (default)
  • -max-hsp-per-hit: Maximum alignments per hit. 0 value will dump all hits
    (default)
  • -outfmt: Output format: 1 for tabulated (default).
  • -strands-list: List of the strands (e.g. “1,2,6”) to use with algorithms
    that work on nucleotide databases.
  • -optim-codon-stop: size of the allowed range between the last invalid
    character and the next stop codon
  • -bargraph: Display a progress bar during execution.
  • -bargraph-size: Number of characters of the bargraph.
  • -progression-file: Dump in a file the current execution percentage.
  • -verbose: Display information during algorithm execution.
  • -full-stats: Dump algorithm statistics.
  • -stats: Dump generic statistics.
  • -stats-fmt: Format of statistics: ‘raw’ (default) or ‘xml’
  • -stats-auto: Automatic stats file creation
  • -alignment-progress: Dump in a file the growing number of ungap/ungap
    alignments during algorithm.
  • -resources-progress: Dump in a file information about resources during
    algorithm.
  • -plastrc: Pathname of the plast config file.
  • -xmlfilter: URI of an XML filter file.
  • -seeds-use-ratio: Ratio of seeds to be used.
  • -seeds-index-filter: seeds length to be used for the indexation filter.
  • -complete-subject-database-stats-file: File path to the stats of the
    complete subject database
  • -W: size of the seeds
  • -h: help
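
Because -e and -o are set per run, the same comparison can be repeated at several e-value thresholds to see how the threshold affects the report. The sketch below is one way to script this. The run_plast function, the PLAST_BIN variable, the list of thresholds and the output file names are all assumptions made for this example. If no executable is found at PLAST_BIN, each command is printed instead of executed.

```shell
#!/usr/bin/env bash
# Sketch: repeat one comparison at several e-value thresholds,
# writing each report to its own file.
# PLAST_BIN, run_plast and the file names are illustrative assumptions.
PLAST_BIN="${PLAST_BIN:-./Plast}"

run_plast() {
  local evalue="$1"
  local args=(
    -i ../../db/query.fa
    -d ../../db/tursiops.fa
    -p plastp
    -e "$evalue"
    -o "output_${evalue}"     # one report per threshold
  )
  if [ -x "$PLAST_BIN" ]; then
    "$PLAST_BIN" "${args[@]}"
  else
    # Dry run: show the command instead of executing it.
    echo "would run: $PLAST_BIN ${args[*]}"
  fi
}

for evalue in 1e-2 1e-5 1e-10; do
  run_plast "$evalue"
done
```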

Permanent link to this article: https://team.inria.fr/genscale/plast-command-line-use/