MAGNOME is an interdisciplinary project that addresses these challenges through a systems approach that draws its strength from close collaboration between computer scientists and biologists. The core skills of the team are comparative genomics, data-mining, and formal methods.
- In comparative genomics we identify and analyse differences between genomes, in order to understand their past history and current function, and the processes that shape them.
- Our focus in data-mining and data integration is on both efficient algorithms for identifying pertinent groupings in complex data sets and multi-scale representations of those data that admit complex queries and reasoning.
- Our long-standing work in formal methods applied to complex systems combines efficient representations of state spaces with model checking to analyze the range of possible system behaviors.
Comparative genomics
The research developed in MAGNOME involves three axes in comparative genomics: genome annotation, the process of associating biological knowledge with sequences; sequence analysis using probabilistic models, notably hidden Markov models, for the syntactic analysis of macromolecular sequences; and combinatorial methods for studying genome rearrangements, in particular for reconstructing ancestral genomes.
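As an illustration of how a hidden Markov model can segment a sequence, the following is a minimal sketch of Viterbi decoding in Python. It is not MAGNOME code: the two states (`GC_rich`, `AT_rich`), the probability tables, and the function name `viterbi` are all hypothetical and chosen only to keep the example small.

```python
import math

# Hypothetical two-state model: every number below is an illustrative assumption.
STATES = ("GC_rich", "AT_rich")
START_P = {"GC_rich": 0.5, "AT_rich": 0.5}
TRANS_P = {
    "GC_rich": {"GC_rich": 0.9, "AT_rich": 0.1},
    "AT_rich": {"GC_rich": 0.1, "AT_rich": 0.9},
}
EMIT_P = {
    "GC_rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "AT_rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def viterbi(sequence, states=STATES, start_p=START_P,
            trans_p=TRANS_P, emit_p=EMIT_P):
    """Return the most probable state path for a DNA sequence."""
    if not sequence:
        return []
    # Log probabilities avoid numerical underflow on long sequences.
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][sequence[0]])
               for s in states}]
    backpointers = []
    for symbol in sequence[1:]:
        previous = scores[-1]
        current, pointers = {}, {}
        for s in states:
            # Best predecessor state for reaching s at this position.
            best = max(states,
                       key=lambda p: previous[p] + math.log(trans_p[p][s]))
            current[s] = (previous[best] + math.log(trans_p[best][s])
                          + math.log(emit_p[s][symbol]))
            pointers[s] = best
        scores.append(current)
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    path = [max(states, key=lambda s: scores[-1][s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

Calling `viterbi("GGCCAATT")` labels each base with one of the two states, so that runs of identical labels delimit candidate regions. Real annotation models have many more states and parameters estimated from data.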
Data-mining and data integration
Finding meaningful patterns in biological data is the main challenge that we address. We have developed novel methods for consensus ensemble clustering, applied specifically to the automatic identification of protein families as a means of identifying gene homology, and for gene set enrichment analysis using guilt-by-association methods.
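The general idea behind consensus clustering can be sketched with a co-association count: items that several independent clusterings place together are grouped in the consensus. The sketch below is a generic textbook-style illustration in Python, not the method developed in MAGNOME. The function name `consensus_clusters`, the `threshold` parameter, and the protein identifiers are hypothetical.

```python
from itertools import combinations


def consensus_clusters(partitions, threshold=0.5):
    """Merge items that share a cluster in at least `threshold` of the runs.

    `partitions` is a list of dicts mapping each item to a cluster label;
    every dict is assumed to cover the same set of items.
    """
    items = sorted(partitions[0])
    n_runs = len(partitions)

    # Co-association counts: how often each pair is clustered together.
    together = {pair: 0 for pair in combinations(items, 2)}
    for labels in partitions:
        for a, b in together:
            if labels[a] == labels[b]:
                together[(a, b)] += 1

    # Union-find: link every pair whose agreement reaches the threshold.
    parent = {x: x for x in items}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), count in together.items():
        if count / n_runs >= threshold:
            parent[find(a)] = find(b)

    # Connected components of the agreement graph are the consensus groups.
    groups = {}
    for x in items:
        groups.setdefault(find(x), []).append(x)
    return sorted(groups.values())


# Three hypothetical clustering runs over five made-up protein identifiers.
RUNS = [
    {"p1": 0, "p2": 0, "p3": 0, "p4": 1, "p5": 1},
    {"p1": 0, "p2": 0, "p3": 1, "p4": 1, "p5": 1},
    {"p1": 1, "p2": 1, "p3": 1, "p4": 0, "p5": 0},
]
```

With the default threshold, `consensus_clusters(RUNS)` groups `p3` with `p1` and `p2`, because two of the three runs agree. Raising the threshold to `0.9` leaves `p3` on its own. Linking pairs by connected components is a deliberately simple choice; other consensus functions could be substituted.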
Modeling and formal methods
Constructing mathematical models of cell behavior is a key step in industrial applications mapping genotype to phenotype. MAGNOME develops the BioRica high-level modeling framework, which integrates discrete and continuous multi-scale dynamics within the same semantic domain while offering an easy-to-use and computationally efficient numerical simulator. BioRica programs are hierarchical and are based on a generic formalism that captures a range of discrete and continuous formalisms and admits a precise operational semantics. BioRica models have a corresponding compositional semantics in terms of an extension of Generalized Markov Decision Processes.
For more information about MAGNOME research activity, see also: