GATB Library

  • GATB Library. The Genome Analysis Toolbox with de-Bruijn graph. Many of the tools developed by the GenScale team are based on this library.
    These methods enable the analysis of data sets of any size on multi-core desktop computers, including very large volumes of read data from any kind of organism, such as bacteria, plants, animals, and even complex samples (e.g. metagenomes). Among them are (the full list is available here:

Tools for sequencing data analyses using GATB

  • findere: simple strategy for speeding up queries and for reducing false positive calls from any Approximate Membership Query data structure.
  • kmtricks: modular tool suite for counting kmers, and constructing Bloom filters or kmer matrices, for large collections of sequencing data.
  • Simka and SimkaMin: Comparative metagenomics for large-scale datasets
  • DiscoSNP++ and discoSnpRAD: Reference-free small variant discovery (SNPs and indels)
  • MindTheGap: Detection and assembly of large insertion variants
  • MinYS: reference-guided genome assembly in metagenomics data
  • MTG-link: gap-filling in draft genome assemblies with linked-read data
  • short read connector: Detect similar reads in potentially large read sets
  • Minia: De novo short read assembler
  • DSK: Count k-mers in sequences
  • Leon: short read compressor (now included in GATB-core)
  • Bloocoo: short read corrector
  • BCALM: Construct compacted de Bruijn graphs (unitigs)
  • de-novo pipeline: de-novo assembly pipeline (error correction / contigs / scaffolding) for genomes and meta-genomes
  • Mapsembler2: Targeted assembly (not maintained)
  • TakeABreak: reference-free inversion discovery tool

Tools for sequencing data analyses not using GATB

  • SVJedi: Structural Variant genotyper with long read data
  • ORI: software using long nanopore reads to identify bacteria present in a sample at the strain level
  • StrainFLAIR: STRAIN-level proFiLing using vArIation gRaph
  • Comparead & Commet: comparison of metagenomic datasets
  • GASSST: long read mapper
  • PLAST: intensive bank-to-bank sequence comparison


Protein Structure

  • A_Purva: Contact Map Overlap solver
  • MD-Jeep: Distance Geometry solver
  • CSA: Comparative Structural Alignment


  • SLICEE: parallel execution of bioinformatics workflows

Comparative Genomics

  • CASSIS: detection of rearrangement breakpoints
