A 2-year postdoctoral fellowship is available in the Inria Genscale team in Rennes (France) to work with Dr Claire Lemaitre and colleagues (http://people.rennes.inria.fr/Claire.Lemaitre/) on the development of novel computational methods for the detection and analysis of structural variation from linked-read sequencing data.
Linked-read sequencing, such as the 10X Genomics technology, is very promising for many genomic studies. It combines the high throughput and accuracy of short-read Illumina sequencing with long-range information: reads that have been sequenced from the same long (around 30-50 kb) DNA molecule can be identified thanks to a small barcode sequence. Taking advantage of such long-range information to improve de novo genome assembly and structural variation analyses is currently a hot topic in the bioinformatics community. A few methods have been developed so far, but all of them target human genetics studies only. Here, we will address these issues in the context of non-model organism data, high levels of heterozygosity, and challenging structural variant (SV) types such as inversions.
The postdoctoral researcher will be in charge of developing such novel methods dedicated to linked-read data. This will include developing novel indexing data structures to store and rapidly access barcode information, as well as novel algorithms to detect structural variants of various types and sizes and to assemble their breakpoints for genotyping purposes. The postdoctoral researcher will apply and validate the methods on simulated and real datasets. In particular, they will have access to real re-sequencing data from butterfly genomes in which structural variants are well characterized and of high interest for ecological studies.
Inria Genscale is a major bioinformatics team in Rennes, dedicated to the development of efficient and scalable methods for processing and analysing sequencing data. The team has developed several well-known and widely used software tools, such as the low-memory assembler minia and the variant discovery tools MindTheGap and discoSnp++. The team also maintains the Genome Assembly Tool Box (http://gatb.inria.fr), a C++ library and full environment for rapidly and efficiently developing new software based on assembly data structures.
This position is part of a broader research project studying the genomics and demography of inversion polymorphism in tropical butterflies, in collaboration with ecologists from the CNRS Centre for Evolutionary and Functional Ecology (CEFE, Dr Mathieu Joron) and population genetic modellers from the Natural History Museum in Paris (Dr Violaine Llaurens). In the tropical butterfly Heliconius numata, chromosomal structural polymorphism plays a major role in adaptive evolution and population biology. The project is based on a large sequencing dataset, including 12 butterfly individuals with distinct SV patterns that have already been sequenced with the 10X Genomics linked-read technology. These butterfly data are a gold mine for SV method development: the already known inversions can be used for validation, and the results of the developed methods will be directly analysed and put into perspective, in terms of their evolutionary and ecological implications, by the ecologist collaborators of the project.
Requirements and application
Candidates should have a PhD or equivalent in computational biology, and experience with high-throughput sequencing data, sequence and graph algorithmics, and software development. Development experience in C++ will be appreciated.
The starting date is flexible but should be around the summer of 2020.
Informal enquiries are encouraged. Please contact Claire Lemaitre: claire [dot] lemaitre [at] inria.fr
Applicants should send a full CV, a cover letter explaining their motivation, experience, and achievements, at least 2 references with contact information, and their date of availability.