Work done in 2018
In the first year, we started addressing two questions: modelling and understanding the systems biology of communities on the one hand, and modelling and understanding the co-evolutionary aspects present in such communities on the other. A number of topics related to these two main questions were covered.
On the first question, we first needed to improve the method (see here) that we had previously developed, already as a collaboration between France and Portugal, to identify the consortium of organisms best suited to producing metabolic compounds of interest (this method had also led to a software tool, MultiPus, available here). More precisely, we needed to improve the model so that it takes into account both stoichiometry and the possibility of having more than one objective to optimise.
We decided to do this first in the simpler context of a single species, while still taking into account stoichiometry and multiple objectives.
In parallel, we also started addressing the problem where the objectives to be reached involve dealing with toxicity. This is the case in particular when some of the compounds of interest that a micro-organism was genetically manipulated to produce are toxic for that micro-organism. This was done in the context of the PhD of Irene Ziska.
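To make the combination of stoichiometry and multiple objectives concrete, here is a minimal Python sketch. It is not the method described above: the toy network, the reaction names, the bound `MAX_UPTAKE` and the brute-force enumeration are all assumptions chosen for illustration. It keeps only the flux vectors that satisfy the steady-state condition (stoichiometric matrix times flux vector equals zero) and then extracts those that are not dominated on two objectives.

```python
from itertools import product

# Hypothetical toy network: an uptake reaction produces metabolite A,
# which is consumed either by biomass formation or by product formation.
REACTIONS = ["v_uptake", "v_biomass", "v_product"]
S = [[1, -1, -1]]  # stoichiometric matrix, one row per internal metabolite (only A)
MAX_UPTAKE = 4     # assumed upper bound on every flux, in arbitrary integer units


def is_steady_state(v):
    """True if S . v = 0, i.e. no internal metabolite accumulates."""
    return all(sum(c * x for c, x in zip(row, v)) == 0 for row in S)


def objectives(v):
    """The two objectives to maximise: biomass flux and product flux."""
    flux = dict(zip(REACTIONS, v))
    return (flux["v_biomass"], flux["v_product"])


def dominates(a, b):
    """a dominates b if it is at least as good everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Keep the candidates whose objective values are not dominated."""
    scored = [(v, objectives(v)) for v in candidates]
    return [v for v, f in scored if not any(dominates(g, f) for _, g in scored)]


# Enumerate all integer flux vectors within the bound, keep the feasible ones.
candidates = [
    v for v in product(range(MAX_UPTAKE + 1), repeat=len(REACTIONS))
    if is_steady_state(v)
]
front = pareto_front(candidates)

for v in front:
    print(dict(zip(REACTIONS, v)))
```

In this toy setting the non-dominated flux vectors are exactly those that use the full uptake and split it between biomass and product, which illustrates the trade-off a multi-objective formulation has to expose. A realistic model would replace the enumeration by an optimisation solver.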
On the topic of co-evolution, we started improving the work presented in a previous paper of the French team (see here). The computational method associated with that publication (called Coala and available here) makes it possible to estimate the costs associated with different co-evolutionary events, given a host tree and a symbiont tree as input. The method is based on an Approximate Bayesian Computation approach that generates a large number of simulated symbiont trees, many of which have to be filtered out, in order to estimate these costs. This makes the method less efficient, and thus unable to deal with bigger trees, and potentially more prone to a wrong estimation of the costs. Together with Mário Figueiredo, we thus started working on improving the estimation of the costs of the events considered, and also on improving the co-evolution model itself by taking into account events that were not considered before and for which no good model currently exists in the literature. The main such event considered so far is that a symbiont may be associated with more than one host.
In parallel, and again in work involving Mário Figueiredo, we started working on the problem of clustering the many optimal mappings of the symbiont tree to the host tree that are in general found for a given cost vector.
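The general idea of Approximate Bayesian Computation by rejection can be sketched as follows. This is a simplified illustration and not Coala's actual algorithm: instead of simulating symbiont trees, the stand-in simulator below only draws counts of event types, and the event names, the observed counts, the distance and the tolerance are all assumptions made for the example.

```python
import random
from collections import Counter

# Event types used for illustration (assumed labels).
EVENTS = ["cospeciation", "duplication", "host_switch", "loss"]


def sample_prior(rng):
    """Draw a vector of event probabilities from a flat Dirichlet prior."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in EVENTS]
    total = sum(draws)
    return [d / total for d in draws]


def simulate_counts(probs, n_events, rng):
    """Stand-in simulator: draw n_events event types independently."""
    counts = Counter(rng.choices(EVENTS, weights=probs, k=n_events))
    return [counts[e] for e in EVENTS]


def distance(simulated, observed):
    """Normalised L1 distance between two vectors of event counts."""
    return sum(abs(x - y) for x, y in zip(simulated, observed)) / sum(observed)


def abc_rejection(observed, n_draws, tolerance, seed=0):
    """Keep the prior draws whose simulation is close enough to the data."""
    rng = random.Random(seed)
    n_events = sum(observed)
    accepted = []
    for _ in range(n_draws):
        probs = sample_prior(rng)
        simulated = simulate_counts(probs, n_events, rng)
        if distance(simulated, observed) <= tolerance:
            accepted.append(probs)
    return accepted


observed = [30, 5, 5, 10]  # hypothetical observed event counts
accepted = abc_rejection(observed, n_draws=20000, tolerance=0.2)
posterior_mean = [
    sum(p[i] for p in accepted) / len(accepted) for i in range(len(EVENTS))
]

print(f"accepted {len(accepted)} of 20000 draws")
for event, mean in zip(EVENTS, posterior_mean):
    print(f"{event}: {mean:.2f}")
```

Most draws are rejected, which is the source of inefficiency mentioned above: the simulation cost is paid for every draw, while only the accepted ones contribute to the estimate.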
Work done in 2019
In the second year, we continued addressing the same two questions: modelling and understanding the systems biology of communities on the one hand, and modelling and understanding the co-evolutionary aspects present in such communities on the other. We also worked on preparing a project to be submitted to one of the H2020 programme calls.
As concerns the scientific activities, the work on Topic 1 above (multi-objective optimisation with stoichiometry for a single species) has led to a paper that is currently under revision (Ricardo Andrade, Mahdi Doostmohammadi, João L. Santos, Marie-France Sagot, Nuno P. Mira, Susana Vinga, Momo – Multi-Objective Metabolic mixed integer Optimization: application to yeast strain engineering), and to a software tool, called Momo, that is available here.
Three more papers are in preparation: one on Topic 2 above (toxicity); a second, which also involves Irene Ziska as PhD student, on inferring quantitative changes of the reactions using measurements of the metabolite concentrations in two steady states; and a third, involving Marianne Borderès as PhD student, on an evaluation of binning methods to recover human-gut microbial pan-genomes from a non-redundant reference gene catalog.
Meanwhile, we continue working on Topic 3 above (the estimation of the costs of co-evolutionary events). On Topic 4 (clustering the optimal mappings), we decided in 2019 to adopt a completely combinatorial approach to the problem: we define different types of equivalence classes of the optimal mappings and then enumerate the classes directly, without first having to enumerate all the solutions. This work is part of the PhD of Yishu Wang and a paper is also in preparation.
Work done in 2020
Except for a trip by Alexandra Carvalho to Lyon from February 18 to 21, during which we discussed binning with Susana, who had to remain in Portugal because of classes she had to teach, and with Marianne Borderes, all the work we did together in 2020 had to be carried out remotely.
First, the paper on Momo that was submitted in 2019 was accepted. We then advanced with the work related to the PhD of Marianne Borderes, who is co-supervised by Marie-France Sagot and Susana Vinga. We submitted a first paper on the binning method that is currently under revision (Marianne Borderes, Cyrielle Gasc, Emmanuel Prestat, Mariana Galvão Ferrarini, Susana Vinga, Lilia Boucinha, Marie-France Sagot. A comprehensive evaluation of binning methods to recover human gut microbial pan-genomes from a non-redundant reference gene catalog). A second piece of work, concerning metabolism and gut microbiota, is well advanced.
We also had two main results concerning our work on cophylogeny, one that was accepted this year (Yishu Wang, Arnaud Mary, Marie-France Sagot, Blerina Sinaimeri. CAPYBARA: equivalence ClAss enumeration of coPhylogenY event-BAsed ReconciliAtions. Bioinformatics, 36(14):4197-4199) and one that has been submitted (Yishu Wang, Arnaud Mary, Marie-France Sagot, Blerina Sinaimeri. Lazy listing of equivalence classes — A paper on dynamic programming and tropical circuits.). Both are part of the PhD work of Yishu Wang, co-supervised by Mário Figueiredo, Marie-France Sagot and Blerina Sinaimeri, and both fed our discussion with Mário. This discussion will continue after the end of Compasso thanks to the H2020 Twinning project Olissipo, of which Mário is also a member. We submitted this project in 2019 with Susana Vinga and with one partner at ETH Zürich and another at EMBL in Heidelberg; it was accepted in 2020 and will start in January 2021.
Concerning metabolism again, we submitted a paper (Irene Ziska, Ricardo Andrade, Mariana Galvão Ferrarini, Alice Julien-Laferrière, Louis Duchemin, Roberto Marcondes César Jr., Arnaud Mary, Susana Vinga, Marie-France Sagot. TOTORO: Identifying active reactions during the transient state for metabolic perturbations) that is part of the PhD of Irene Ziska who was co-supervised by Marie-France Sagot and Susana Vinga, and who will defend her PhD in Lyon on November 24, 2020. The PhD is entitled “Models and algorithms for investigating and exploiting the metabolism of microorganisms”.
Another PhD co-supervised by Susana Vinga and Marie-France Sagot, with André Veríssimo as the student, should also be defended before the end of 2020. It is entitled “Network-based sparse regularization for the identification of disease signatures”. With André, we also have a paper in preparation (André Veríssimo, Eunice Carrasquinha, Marta B. Lopes, Arlindo L. Oliveira, Marie-France Sagot, Susana Vinga. Sparse network-based regularization for the analysis of patientomics high-dimensional survival data) that should be submitted soon.
A number of software tools are associated with the above papers.