Software – COMPO

nlml_onco
- Functional description:
  
  This software analyzes multiple types of data arising from clinical oncology (routine care and clinical trials). These data can be of two types:
  
  \begin{itemize}
  \item Static (e.g., baseline features): clinical, biological or molecular data (e.g., transcriptomic or mutation data)
  \item Longitudinal (multiple time points per individual): tumor kinetics, biomarkers
  \end{itemize}
  The longitudinal data are modeled using the framework of nonlinear mixed-effects modeling. All features are then analyzed using data science techniques (preprocessing, feature selection, machine learning algorithms) in order to predict survival outcomes.
- Scientific description:
  
  \begin{itemize}
  \item Exploratory data analysis
  \item Automated mixed-effects modeling analysis
  \item Preprocessing
  \item Analysis of high dimensional transcriptomic data
  \item Dimension reduction
  \item Feature selection
  \item Survival machine learning algorithms
  \item Applications (over 3,000 patients)
  \begin{itemize}
  \item individual predictions of survival
  \item predictions of hazard ratios for applications in drug development, for both monotherapy and combinatorial treatments
  \end{itemize}
  \end{itemize}
- Privileged contact:
  
  Sebastien Benzekry (Sebastien.Benzekry@inria.fr)
- Participants:
  
  Sebastien Benzekry (Sebastien.Benzekry@inria.fr)
- Structures:
  
  COMPO
- Website:
  
  https://gitlab.inria.fr/benzekry/nlml_onco
compo.EDA
- Functional description:
  
  The package compoEDA aims to provide a comprehensive exploratory analysis of data from clinical studies in oncology. These studies commonly investigate biological markers that can reveal and distinguish different tumor profiles, so that the therapeutic strategy can be adapted early for each patient.
  
  The objective of this software is to give both computational scientists and clinical researchers a simple tool to generate graphical results and automatic reports containing the following analyses:
  \begin{itemize}
  \item Overview and visualization of clinical data and biological markers
  \item Univariate and multivariate classification analysis (logistic regression)
  \item Univariate and multivariate survival analysis (Cox regression, Kaplan-Meier analysis)
  \item Correlation analysis
  \item Statistical tests
  \item Visualization of markers (boxplots, barplots, volcano plots, forest plots …).
  \end{itemize}
- Scientific description:
  
  This library implements the following as an R package:
  \begin{itemize}
  \item Exploratory analysis:
  \begin{itemize}
  \item Clinical characteristics table
  \item Kaplan-Meier estimation of the progression-free and overall survival
  \item Clinical and biological features distribution
  \end{itemize}
  
  \item Classification analysis:
  \begin{itemize}
  \item Univariate and multivariate logistic regression
  \item Odds ratio
  \item Area under ROC curve
  \item t-test / chi-squared test
  \end{itemize}
  
  \item Survival analysis:
  \begin{itemize}
  \item Univariate and multivariate Cox regression
  \item Hazard ratio
  \item Area under ROC curve
  \item log-rank test
  \end{itemize}
  
  \item Data visualization:
  \begin{itemize}
  \item Correlation plots (Pearson correlation)
  \item Volcano plots (p-value and adjusted p-value)
  \item Boxplots (quantitative features) and barplots (qualitative features)
  \item Kaplan-Meier curves
  \item Automatic comprehensive and customizable statistical reports
  \end{itemize}
  \end{itemize}
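
  As an illustration of the Kaplan-Meier estimation listed above, here is a minimal Python sketch of the product-limit estimator. It is not the package's R code: the function name `kaplan_meier` and the toy follow-up data are hypothetical and only show the computation.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up duration of each patient
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    n_at_risk = len(times)
    # Number of observed events and of total exits (events + censorings) per time.
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)

    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Conditional probability of surviving past t, given being at risk at t.
            survival *= 1.0 - d / n_at_risk
            curve.append((t, survival))
        # Everyone who exits at t (event or censoring) leaves the risk set.
        n_at_risk -= exits[t]
    return curve


# Hypothetical toy data: six patients, two of them censored.
toy_times = [5, 8, 8, 12, 15, 20]
toy_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(toy_times, toy_events)
```

  Censored patients contribute to the risk set up to their censoring time but never trigger a drop in the curve, which is what distinguishes this estimator from a naive proportion of survivors.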
- Privileged contact:
  
  Sebastien Benzekry (Sebastien.Benzekry@inria.fr)
- Participants:
  
  Sebastien Benzekry (Sebastien.Benzekry@inria.fr), Linh Nguyen Phuong (linh.nguyen-phuong@inria.fr), Celestin Bigarre (celestin.bigarre@inria.fr), Paul Dufosse (paul.dufosse@inria.fr), Melanie Karlsen (melanie.karlsen@inria.fr)
- Structures:
  
  COMPO
- Website:
  
  https://team.inria.fr/compo/
ml.tidy
- Functional description:
  
  This package provides multiple functions to perform machine learning analysis using the `tidymodels` framework. Tasks include selecting features, plotting feature importances, training, cross-validating or applying supervised machine learning algorithms (classification or survival analysis), evaluating predictive performance metrics, and computing learning curves.
  
  Initial development was part of the `stats_pioneer` package (also called `pioneerPackage`); `ml.tidy` became a standalone package in February 2023.
- Scientific description:
  
  This software relies as much as possible on the R package `tidymodels`.
- Privileged contact:
  
  Sebastien Benzekry (Sebastien.Benzekry@inria.fr)
- Participants:
- Structures:
  
  COMPO
- Website:
  
  https://gitlab-int.inria.fr/compo/ml.tidy
stats_pioneer
- Functional description:
  
  This software was built to analyze the PIONeeR (Precision Immuno-Oncology for advanced Non-small cell lung cancer patients with PD-(L)1 ICI Resistance) data. PIONeeR is a prospective, multicenter study whose primary objective is to validate the existence of a hypothetical immune profile explaining resistance to immunotherapy in non-small cell lung cancer patients.
  
  It initially integrated preprocessing, exploratory data analysis, visualization, statistical analysis, feature selection, machine learning, and results generation and reporting. Since then, exploratory data analysis, visualization and statistical analysis have been moved to the COMPO-level `compoEDA` package, and feature selection and machine learning to the COMPO-level `ml.tidy` package.
  
  This software covers the very first step of the data analysis (preprocessing) and the very last (generation of results). Some of its functions aim at:
  \begin{itemize}
  \item preprocessing the data (creation of clinical variables, dictionary, outcome variables, data monitoring and corrections, handling of variable types)
  \item generating the tools to load the data and metadata
  \item computing statistical tests, logistic or Cox regression, or performing a correlation analysis
  \item visualising the data (boxplots, barplots, survival curves, ROC curves, volcano plots)
  \item providing detailed and interactive statistical reports on the data
  \item automating the production of these reports using GitLab CI/CD
  \end{itemize}
- Scientific description:
- Privileged contact:
  
  Sebastien Benzekry (Sebastien.Benzekry@inria.fr)
- Participants:
- Structures:
  
  COMPO
- Website:
  
  https://gitlab-int.inria.fr/pioneer/pioneer
metamats
- Functional description:
  
  This R package implements a general framework to build and use models of the metastatic process, based on the initial model of Iwata et al. (2000). The family of models that can be built describes the metastatic disease with a partial differential equation (PDE) on the size-structured distribution of the tumors. These models have three components: a function that characterizes the growth of the primary tumor, a function that characterizes the growth of the metastases, and a dissemination function that describes how new metastases are produced.
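
  For orientation, the structure described above can be written as a transport equation with a nonlocal boundary condition. The notation below is an assumption made for this summary, not the package's own: $\rho(x,t)$ is the size-structured density of metastases, $x_p(t)$ the size of the primary tumor, $g_p$ and $g$ the growth rates of the primary tumor and of the metastases, and $d$ the dissemination rate, with sizes counted from a single cell ($x = 1$).

  \[
  \begin{aligned}
  &\partial_t \rho(x,t) + \partial_x\big(g(x)\,\rho(x,t)\big) = 0, && x > 1,\ t > 0,\\
  &g(1)\,\rho(1,t) = \int_1^{+\infty} d(x)\,\rho(x,t)\,\mathrm{d}x + d\big(x_p(t)\big), && t > 0,\\
  &\rho(x,0) = 0, && x > 1,\\
  &\frac{\mathrm{d}x_p}{\mathrm{d}t} = g_p\big(x_p\big), \quad x_p(0) = 1.
  \end{aligned}
  \]

  In the original model of Iwata et al. (2000), both growth laws are Gompertzian, $g_p(x) = g(x) = a\,x\ln(b/x)$, and the dissemination law is $d(x) = m\,x^{\alpha}$. The boundary condition states that newly created metastases, shed by the primary tumor and by existing metastases, enter the population at size one.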
- Scientific description:
- Privileged contact:
  
  Celestin Bigarre (celestin.bigarre@inria.fr)
- Participants:
  
  Sebastien Benzekry (Sebastien.Benzekry@inria.fr), Celestin Bigarre (celestin.bigarre@inria.fr)
- Structures:
  
  COMPO
- Website:
  
  https://gitlab.inria.fr/cbigarre/metamats
SChISModeling
- Functional description:
  
  SChISModeling aims to analyze SChISM data (Size CfDNA Immunotherapies Signature Monitoring). SChISM is a clinical study that introduces an innovative approach to quantify circulating free DNA (cfDNA) in cancer patients treated with immunotherapy. The study's objective is the early prediction of response to immunotherapy in patients at an advanced/metastatic stage, based on these quantitative cfDNA data.
  
  This software corresponds to the very first step of the data analysis, which is the statistical analysis. Some of its functions aim at:
  \begin{itemize}
  \item preprocessing the data (creation of clinical variables, dictionary, outcome variables, clinical biomarkers, handling of variable types)
  \item computing statistical tests, logistic or Cox regression, or performing a correlation analysis
  \item visualizing the data (boxplots, barplots, survival curves, ROC curves, volcano plots)
  \item providing detailed and interactive statistical reports on the data
  \end{itemize}
- Scientific description:
  
  \begin{itemize}
  \item Preprocessing
  \item Exploratory data analysis
  \item Classification analysis (logistic regression)
  \item Survival analysis (Cox regression)
  \item Mixed-effects modeling analysis
  \end{itemize}
- Privileged contact:
  
  Sebastien Benzekry (Sebastien.Benzekry@inria.fr)
- Participants:
  
  Sebastien Benzekry (Sebastien.Benzekry@inria.fr), Sebastien Salas (sebastien.salas@inria.fr), Linh Nguyen Phuong (linh.nguyen-phuong@inria.fr)
- Structures:
  
  COMPO
- Website:
pacaomics explorer
- Functional description:
  
  The app compares gene expression levels against the PAMG (pancreatic adenocarcinoma molecular gradient), a transcriptomic signature that describes the heterogeneity of pancreatic ductal adenocarcinoma (PDAC) as a continuous gradient from pure basal-like (low PAMG) to pure classical (high PAMG) phenotypes.
- Scientific description:
- Privileged contact:
  
  Abdessamad El Kaoutari (samad.elk@gmail.com)
- Participants:
- Structures:
  
  COMPO
- Website:
q_single_cell_tools
- Functional description:
  
  qSingCTools is a web application for the preprocessing, analysis and visualization of single-cell qPCR data. qSingCTools takes as input a gene × cell table of CT values generated by qPCR experiments. Gene expression values are then calculated with the formula y = 40 - CT. Values equal to 999 (or missing values) are replaced by values drawn from a normal distribution centered on zero, with a standard deviation obtained from the dataset.
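
  The transformation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the application's code: the function name `ct_to_expression` and the toy table are hypothetical, and the standard deviation is assumed here to be that of the detected expression values, since the description does not say how it is obtained.

```python
import math
import random
import statistics

UNDETECTED = 999  # sentinel value named in the description


def is_missing(ct):
    """True for the 999 sentinel, None, or NaN."""
    return ct is None or ct == UNDETECTED or (isinstance(ct, float) and math.isnan(ct))


def ct_to_expression(ct_table, seed=0):
    """Convert a gene x cell table of CT values to expression values (y = 40 - CT).

    Missing or undetected entries are replaced by draws from N(0, sd).
    Assumption: sd is the sample standard deviation of all detected
    expression values in the table.
    """
    rng = random.Random(seed)  # fixed seed so the imputation is reproducible
    detected = [40 - ct for row in ct_table for ct in row if not is_missing(ct)]
    sd = statistics.stdev(detected)
    return [
        [rng.gauss(0.0, sd) if is_missing(ct) else 40 - ct for ct in row]
        for row in ct_table
    ]


# Hypothetical toy table: 2 genes x 3 cells.
toy_ct = [
    [22.5, 999, 30.0],
    [25.0, 28.5, None],
]
toy_expr = ct_to_expression(toy_ct)
```

  Drawing imputed values around zero places undetected genes at the low end of the expression scale while avoiding a spike of identical values.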
- Scientific description:
- Privileged contact:
  
  Abdessamad El Kaoutari (samad.elk@gmail.com)
- Participants:
- Structures:
  
  COMPO
- Website:
  
  https://shinelka.shinyapps.io/qSingCToolsApp/
compo.NLME
- Functional description:
  
  This R package implements a framework for working with nonlinear mixed-effects (NLME) models in clinical oncology, in order to predict relapse and survival from longitudinal data.
- Scientific description:
  
  Available features:
  \begin{itemize}
  \item Structural models
  \begin{itemize}
  \item constant
  \item linear
  \item double exponential
  \item double exponential with dropout
  \item hyperbolic
  \end{itemize}
  
  \item preprocessing of blood marker datasets
  \item preprocessing of tumor kinetics datasets
  \item fitting of NLME models using the Monolix API
  \item post-processing of results
  \end{itemize}
  
  Available data:
  \begin{itemize}
  \item \textbf{Tumor Kinetics with dropout data.} A simulated dataset of tumor kinetics following the double-exponential model, with parameters obtained from (Benzekry et al., PAGE 20, 2022), which deals with the RECIST-based sum of largest diameters (SLD, in mm) of lung cancer treated with immune-checkpoint blockade (anti-PDL1 drug atezolizumab). Dropout was also simulated using a Weibull survival model.
  \item \textbf{Tumor and Blood marker Kinetics with dropout data.} A simulated dataset of joint tumor and blood marker (albumin, C-reactive protein, lactate dehydrogenase, neutrophils) kinetics following the models and parameters established in (Benzekry et al., PAGE 20, 2022). These are monitoring data during immune-checkpoint blockade (anti-PDL1 drug atezolizumab) in lung cancer. Dropout was also simulated using a Weibull survival model.
  \end{itemize}
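
  To make the structure of such a simulated dataset concrete, here is a minimal Python sketch of tumor kinetics with Weibull dropout. Everything in it is an assumption made for illustration: the double-exponential form SLD(t) = SLD0 (exp(-k_shrink t) + exp(k_grow t) - 1) is one common parameterization and may differ from the package's, the parameter values are arbitrary rather than those of the cited reference, and no residual error is added.

```python
import math
import random


def double_exponential(t, sld0, k_shrink, k_grow):
    """Assumed double-exponential tumor size model (sum of largest diameters, mm)."""
    return sld0 * (math.exp(-k_shrink * t) + math.exp(k_grow * t) - 1.0)


def simulate_patient(rng, visit_times):
    """Simulate one patient: log-normal individual parameters, Weibull dropout."""
    # Illustrative population values only (time in days, size in mm).
    sld0 = rng.lognormvariate(math.log(60.0), 0.5)
    k_shrink = rng.lognormvariate(math.log(0.01), 0.8)
    k_grow = rng.lognormvariate(math.log(0.002), 0.8)
    # random.weibullvariate takes (scale, shape), in that order.
    dropout_time = rng.weibullvariate(300.0, 1.5)
    # Measurements scheduled after the dropout time are never observed.
    observations = [
        (t, double_exponential(t, sld0, k_shrink, k_grow))
        for t in visit_times
        if t <= dropout_time
    ]
    return {"dropout_time": dropout_time, "observations": observations}


rng = random.Random(42)
visits = [0, 42, 84, 126, 168, 210]  # hypothetical visit schedule, in days
toy_cohort = [simulate_patient(rng, visits) for _ in range(5)]
```

  Truncating each patient's series at the dropout time is what produces the unbalanced longitudinal data that mixed-effects models are designed to handle.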
- Privileged contact:
  
  Sebastien Benzekry (Sebastien.Benzekry@inria.fr)
- Participants:
  
  Sebastien Benzekry (Sebastien.Benzekry@inria.fr), Celestin Bigarre (celestin.bigarre@inria.fr), Ruben Taieb (ruben.taieb@inria.fr)
- Structures:
  
  COMPO
- Website: