Work done in 2018
In the first year, we started addressing two questions: modelling and understanding the systems biology of communities on the one hand, and modelling and understanding the co-evolutionary aspects present in such communities on the other.
On the first issue, we first needed to improve the method (see here) that we had previously developed, already as a collaboration between France and Portugal, to identify a consortium of organisms that is best suited to producing metabolic compounds of interest (this method had also led to a software tool, MultiPus, available here). More precisely, we needed to extend the model so that it takes stoichiometry into account and allows more than one objective to be optimised.
We decided to do this initially in the simpler context of a single species, while still taking into account stoichiometry and multiple objectives. This has led to a paper that is currently submitted (Ricardo Andrade, Mahdi Doostmohammadi, João L. Santos, Marie-France Sagot, Nuno P. Mira, Susana Vinga, Momo – Multi-Objective Metabolic mixed integer Optimization: application to yeast strain engineering), and to a software tool, called Momo, that is available here.
In short, the paper explores the concept of multi-objective optimisation when the model involves both continuous and integer decision variables. In particular, we propose a multi-objective model that may be used to suggest reaction deletions that maximise and/or minimise several functions simultaneously. Applications include, among others, the concurrent maximisation of a bioproduct and of biomass, or the maximisation of a bioproduct while minimising the formation of a given by-product, two common requirements in microbial metabolic engineering. The production of ethanol by the widely used cell factory Saccharomyces cerevisiae was adopted as a case study to demonstrate the usefulness of the proposed approach in identifying genetic manipulations that improve the productivity and yield of this economically important bioproduct.
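To make the idea of trading off several objectives over a set of reaction deletions more concrete, here is a minimal sketch in Python. It is not the Momo model: instead of a mixed-integer program over a stoichiometric network, it enumerates every knockout set of a tiny invented network by brute force and keeps the non-dominated (Pareto-optimal) designs. The pathway names, the weights, the uptake value and the proportional flux split are all assumptions made purely for illustration.

```python
from itertools import product

# Hypothetical toy network: one substrate feeds three competing pathways.
# The weights are invented illustrative numbers, not measured data.
PATHWAYS = {"biomass": 5.0, "ethanol": 3.0, "glycerol": 2.0}
UPTAKE = 10.0  # total substrate flux available (arbitrary units)


def fluxes(knockouts):
    """Split the uptake flux among the pathways that are not knocked out.

    This proportional split is a crude stand-in for a stoichiometric model.
    """
    active = {name: w for name, w in PATHWAYS.items() if name not in knockouts}
    total = sum(active.values())
    return {name: (UPTAKE * active[name] / total if name in active else 0.0)
            for name in PATHWAYS}


def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives are maximised)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_knockouts():
    """Enumerate all knockout sets and return the non-dominated ones,
    scored on the pair (biomass flux, ethanol flux)."""
    names = list(PATHWAYS)
    designs = []
    for mask in product([0, 1], repeat=len(names)):
        knockouts = frozenset(n for n, bit in zip(names, mask) if bit)
        if len(knockouts) == len(names):
            continue  # deleting every pathway leaves no flux at all
        f = fluxes(knockouts)
        designs.append((knockouts, (f["biomass"], f["ethanol"])))
    return [(k, obj) for k, obj in designs
            if not any(dominates(other, obj) for _, other in designs)]


if __name__ == "__main__":
    for knockouts, (biomass, ethanol) in pareto_knockouts():
        print(sorted(knockouts), round(biomass, 2), round(ethanol, 2))
```

With these toy numbers, three designs remain on the Pareto front: one that favours biomass only, one that favours ethanol only, and a compromise that deletes only the by-product pathway. Exhaustive enumeration is feasible here because there are only three candidate deletions; for a genome-scale network the number of knockout sets grows exponentially, which is why an optimisation formulation is needed in practice.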
In parallel, once the initial model for Momo had been established, we started considering an extension of the above work to the case where more than one species is involved. This is part of the work done by Irene Ziska in her PhD, co-supervised by Marie-France Sagot and Susana Vinga (the PhD grant is funded by Inria). Besides extending the initial Momo model to a community of species, Irene has also been working on a case that again involves just one species, but where the objectives to be reached involve dealing with toxicity. This arises in particular when some of the compounds of interest that a micro-organism has been genetically manipulated to produce are toxic to that micro-organism.
On the topic of co-evolution, work has progressed on improving the approach presented in a previous paper of the French team (see here). The computational method associated with that publication (called Coala and available here) makes it possible to estimate the costs to be associated with different co-evolutionary events, given a host tree and a symbiont tree as input. The method is based on an Approximate Bayesian Computation approach that generates a large number of simulated symbiont trees in order to estimate these costs, many of which have to be filtered out. This makes the method less efficient, and thus unable to deal with bigger trees, and at the same time potentially more prone to a wrong estimation of the costs. Together with Mário Figueiredo, we thus started working on improving the estimation of the costs of the events considered, and also on improving the co-evolution model itself by taking into account events that were not considered before and for which no good model currently exists in the literature. The main such event already being considered is the association of a symbiont with more than one host.
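The general principle of Approximate Bayesian Computation by rejection, and why discarding many simulations is costly, can be illustrated with a short Python sketch. This is a generic toy version and not the Coala algorithm: no trees are simulated, and each "simulation" simply draws a fixed number of event types from a candidate probability vector. The event names, the uniform prior on the simplex, the L1 distance on event frequencies and all numerical settings are assumptions chosen for illustration.

```python
import random

# Hypothetical event types used only for this illustration.
EVENTS = ("cospeciation", "duplication", "host_switch", "loss")


def sample_prior(rng):
    """Draw an event-probability vector uniformly from the simplex
    (equivalent to a Dirichlet(1, ..., 1) prior)."""
    draws = [rng.expovariate(1.0) for _ in EVENTS]
    total = sum(draws)
    return [d / total for d in draws]


def simulate(probs, n_events, rng):
    """Stand-in for simulating a symbiont tree: draw n_events event types
    and return the observed frequency of each type."""
    drawn = rng.choices(range(len(EVENTS)), weights=probs, k=n_events)
    return [drawn.count(i) / n_events for i in range(len(EVENTS))]


def abc_rejection(observed, n_events, n_samples, tolerance, seed=0):
    """Keep the prior draws whose simulated frequencies fall within
    `tolerance` (L1 distance) of the observed ones; return the mean of
    the accepted vectors and how many were accepted."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_samples):
        probs = sample_prior(rng)
        simulated = simulate(probs, n_events, rng)
        distance = sum(abs(s - o) for s, o in zip(simulated, observed))
        if distance <= tolerance:
            accepted.append(probs)
    if not accepted:
        return None, 0
    mean = [sum(p[i] for p in accepted) / len(accepted)
            for i in range(len(EVENTS))]
    return mean, len(accepted)


if __name__ == "__main__":
    observed = [0.7, 0.1, 0.1, 0.1]  # invented target frequencies
    estimate, kept = abc_rejection(observed, n_events=50,
                                   n_samples=20000, tolerance=0.3)
    print("accepted", kept, "of 20000 simulations")
    print("posterior mean:", [round(x, 3) for x in estimate])
```

Only a small fraction of the simulations passes the tolerance test, so most of the computation is spent on draws that are then thrown away. A tighter tolerance gives a sharper estimate but rejects even more simulations, which is the trade-off between efficiency and accuracy described above.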
In parallel, and again in work involving Mário Figueiredo, we have started working on the problem of clustering the many optimal mappings of the symbiont tree to the host tree that are generally found for a given cost vector.