The main focus of MODAL is to design generative models dealing with
complex multivariate and/or heterogeneous data. Typical instances of such data are
- nominal covariates for the multivariate setting,
- a combination of continuous and nominal variables for
the heterogeneous setting.
Other widespread complex data types are also of interest, such as ordinal, rank, and interval data.
Building on these generative models, a convenient and efficient statistical analysis must
then be carried out, leading to data analysis (visualization, clustering) and statistical learning
(supervised and semi-supervised classification, density estimation).
MODAL focuses on generative models, that is, models describing the process that generates
the data, as opposed to predictive models.
Generative models are of great interest. On the one hand, they are required for several
statistical objectives such as clustering, semi-supervised classification, and density
estimation, where predictive models are useless.
On the other hand, these models enable data visualization. Indeed, they provide a full
description of the data distribution, which gives access to several aspects of the data,
such as high-density areas.
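As a minimal sketch of what a full description of the data distribution offers, the example below evaluates the density of a two-component univariate Gaussian mixture and derives a cluster assignment from it. The parameter values and the function names (mixture_density, posterior, cluster) are illustrative assumptions, not models or code from MODAL.

```python
import math

# Hypothetical parameters of a two-component univariate Gaussian mixture
# (illustrative values only, not taken from any MODAL model).
WEIGHTS = [0.6, 0.4]
MEANS = [0.0, 5.0]
STDS = [1.0, 1.5]


def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_density(x):
    """Mixture density: weighted sum of the component densities."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(WEIGHTS, MEANS, STDS))


def posterior(x):
    """Posterior probability of each component given x (Bayes' rule)."""
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(WEIGHTS, MEANS, STDS)]
    total = sum(joint)
    return [j / total for j in joint]


def cluster(x):
    """Index of the component with the highest posterior probability."""
    probs = posterior(x)
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    # Density (for locating high-density areas) and cluster label at a few points.
    for x in (0.0, 2.5, 5.0):
        print(x, mixture_density(x), cluster(x))
```

The same fitted object thus serves density estimation (mixture_density) and clustering (cluster), which is the sense in which a generative model covers several objectives at once.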
In supervised classification, generative and predictive models directly compete with one
another. However, the lack of flexibility of the generative approach, compared with the
predictive one, can be offset by the use of model selection.
In addition, among generative approaches, parametric ones such as mixture models are
preferred. Provided their parameters are meaningful and parsimonious, mixture models allow
valuable interpretation of the data.
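To illustrate how the parameters of a mixture model can be read directly, here is a minimal sketch of the EM algorithm for a mixture of independent categorical distributions (a latent class model) on nominal data. It assumes integer-coded levels and a fixed number of classes. The function name fit_latent_class, the smoothing constant, and the toy data are hypothetical choices made for this example only.

```python
import random


def fit_latent_class(data, n_levels, n_classes, n_iter=50, seed=0):
    """EM sketch for a mixture of independent categorical distributions.

    data: list of rows, each a list of integer-coded levels.
    n_levels: number of levels of each variable.
    Returns (weights, probs, resp), where probs[k][j][h] estimates
    P(variable j = level h | class k) and resp[i][k] is the posterior
    probability that row i belongs to class k.
    """
    rng = random.Random(seed)
    n, d = len(data), len(n_levels)
    # Random soft initialisation of the class memberships.
    resp = []
    for _ in range(n):
        row = [rng.random() + 1e-3 for _ in range(n_classes)]
        total = sum(row)
        resp.append([r / total for r in row])
    for _ in range(n_iter):
        # M-step: class proportions and membership-weighted level frequencies.
        weights = [sum(r[k] for r in resp) / n for k in range(n_classes)]
        probs = []
        for k in range(n_classes):
            class_probs = []
            for j in range(d):
                # Small smoothing constant (an assumption) avoids zero probabilities.
                counts = [1e-6] * n_levels[j]
                for i in range(n):
                    counts[data[i][j]] += resp[i][k]
                total = sum(counts)
                class_probs.append([c / total for c in counts])
            probs.append(class_probs)
        # E-step: posterior class probabilities under conditional independence.
        for i in range(n):
            joint = []
            for k in range(n_classes):
                p = weights[k]
                for j in range(d):
                    p *= probs[k][j][data[i][j]]
                joint.append(p)
            total = sum(joint)
            resp[i] = [p / total for p in joint]
    return weights, probs, resp


if __name__ == "__main__":
    # Toy data: three binary nominal variables, two clearly separated groups.
    toy = [[0, 0, 0]] * 20 + [[1, 1, 1]] * 20
    weights, probs, resp = fit_latent_class(toy, [2, 2, 2], n_classes=2)
    print("class proportions:", weights)
    print("level probabilities of variable 0 per class:", [p[0] for p in probs])
```

Each fitted quantity has a direct reading: weights gives the class proportions, and probs[k][j] gives the distribution of variable j within class k. The conditional independence assumption keeps the parameter count small.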
International and industrial relations
- PGXIS UK, PharmacoGenomic Innovative Solutions
- Institut Pasteur de Paris
- IBL, Institut de Biologie de Lille