Software

BlockCluster
- Functional description:
  
  BlockCluster is an R package for co-clustering of binary, contingency and continuous data based on mixture models.
- Website:
  
  http://cran.r-project.org/web/packages/blockcluster/index.html
cfda
- Functional description:
  
  The R package cfda performs:
  
  – descriptive statistics for categorical functional data
  
  – dimension reduction and optimal encoding of states (an extension of multiple correspondence analysis to functional data)
  
  – approximation for multivariate categorical functional data analysis.
- Website:
  
  https://github.com/modal-inria/cfda
clere
- Functional description:
  
  The clere package for R performs variable clustering in high-dimensional linear regression. It is available on CRAN and has been submitted to an international journal dedicated to software.
- Website:
  
  https://cran.r-project.org/web/packages/clere/index.html
Clustericat
- Functional description:
  
  Clustericat is an R package for model-based clustering of categorical data. In this package, the Conditional Correlated Model (CCM), published in 2014, takes into account the main conditional dependencies between variables through extreme dependence situations (independence and deterministic dependence). Clustericat performs the model selection and provides the best model according to the BIC criterion and the maximum likelihood estimates.
- Website:
  
  https://r-forge.r-project.org/R/?group_id=1803
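As a minimal sketch of how selection by BIC works in general (not Clustericat's own code), the snippet below scores a few candidate models and keeps the one with the smallest BIC. The model names, log-likelihoods and parameter counts are hypothetical values chosen for illustration, and the sign convention used here is the one where lower is better.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """BIC = -2 * log-likelihood + (number of free parameters) * log(sample size)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def select_by_bic(candidates, n_obs):
    """Return the name of the candidate with the smallest BIC, plus all scores."""
    scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
    return min(scores, key=scores.get), scores

# Hypothetical fitted models: name -> (maximised log-likelihood, free parameters).
candidates = {
    "independence": (-1250.0, 12),
    "pairwise blocks": (-1190.0, 30),
    "saturated": (-1180.0, 95),
}
best, scores = select_by_bic(candidates, n_obs=500)
print(best)
```

The penalty term grows with the number of parameters, so the richest model only wins when its gain in likelihood is large enough to pay for its extra parameters.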
CoModes
- Functional description:
  
  CoModes is another R package for model-based clustering of categorical data. In this package, the Conditional Modes Model (CMM), submitted for publication in 2014, takes into account the main conditional dependencies between variables through particular modality crossings (so-called modes). CoModes performs the model selection and provides the best model according to the exact integrated likelihood criterion and the maximum likelihood estimates.
- Website:
  
  https://r-forge.r-project.org/R/?group_id=1809
CorReg
- Functional description:
  
  The main idea of the CorReg package is to consider a system of sub-regression models in which some variables are explained by others. Some of the variables can then be temporarily removed to avoid the ill-conditioned matrices that arise in linear regression, and the removed information is reinjected afterwards using the structure that links the variables. The final model therefore takes all the variables into account without suffering from the consequences of correlations between variables or of high dimension.
- Website:
  
  https://cran.r-project.org/web/packages/CorReg/index.html
FunHDDC
- Functional description:
  
  The FunHDDC package for R provides a clustering tool for functional data. The model-based clustering algorithm assumes that functional data live in cluster-specific subspaces.
- Website:
  
  https://cran.r-project.org/web/packages/funHDDC/index.html
FunFEM
- Functional description:
  
  The FunFEM package for R provides a clustering tool for functional data. The model-based algorithm clusters the functional data in discriminative subspaces.
- Website:
  
  https://cran.r-project.org/web/packages/funFEM/index.html
Galaxy – MPAgenomics
- Functional description:
  
  Galaxy is an open, web-based platform for data-intensive biomedical research. Galaxy features a user-friendly interface, workflow management and sharing functionalities, and is widely used in the biology community. The MPAgenomics R package developed by MODAL has been integrated into Galaxy, and the Galaxy MODAL instance has been publicly deployed thanks to the IFB-cloud infrastructure.
- Website:
  
  https://cloud.france-bioinformatique.fr/accounts/login/
HDPenReg
- Functional description:
  
  HDPenReg is an R package, built on C++ code, dedicated to the estimation of regression models with l1 penalization.
- Website:
  
  https://cran.r-project.org/web/packages/HDPenReg/index.html
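To make the idea of l1 penalization concrete, here is a minimal sketch of cyclic coordinate descent for the lasso objective (1/2n)||y - Xb||^2 + lam * ||b||_1. It is a generic textbook solver written for illustration, not the algorithm implemented in HDPenReg; the function names and the toy data are assumptions.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(y), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # residual = y - X @ beta, and beta starts at zero
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            scale = sum(x * x for x in col) / n
            if scale == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes column j.
            rho = sum(x * (r + x * beta[j]) for x, r in zip(col, residual)) / n
            new_beta = soft_threshold(rho, lam) / scale
            residual = [r - x * (new_beta - beta[j]) for x, r in zip(col, residual)]
            beta[j] = new_beta
    return beta

# Toy data (hypothetical): the second variable has a weak effect and is dropped.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
beta = lasso_coordinate_descent(X, y, lam=0.2)
print(beta)
```

The soft-thresholding step is what sets weak coefficients exactly to zero, which is why an l1 penalty performs variable selection as well as shrinkage.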
MASSICCC
- Functional description:
  
  The MASSICCC web application offers a simple and dynamic interface for analysing heterogeneous data in a web browser. Several statistical analysis packages are available (Mixmod, MixtComp, BlockCluster), allowing supervised and unsupervised classification of large data sets.
- Website:
  
  https://massiccc.lille.inria.fr
MetaMA
- Functional description:
  
  MetaMA is specialised software for microarray data. It is an R package that combines either p-values or modified effect sizes from different studies to find differentially expressed genes. The main competitor of metaMA is geneMeta. Compared with geneMeta, metaMA performs better on datasets with small sample sizes because its modelling is based on shrinkage approaches.
- Website:
  
  https://cran.r-project.org/web/packages/metaMA/index.html
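The description above does not say which combination rule metaMA applies, so the sketch below uses Fisher's classical method purely as an illustration of what combining p-values across independent studies means. The function name and the example p-values are assumptions.

```python
import math

def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    Under the null hypothesis, X = -2 * sum(log p_i) follows a chi-square
    distribution with 2k degrees of freedom, where k is the number of studies.
    Every p-value must be strictly positive.
    """
    k = len(p_values)
    stat = -2.0 * sum(math.log(p) for p in p_values)
    half = stat / 2.0
    # Closed-form chi-square survival function, valid for an even number of df.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return stat, math.exp(-half) * total

# One gene, two hypothetical studies that are each borderline on their own.
stat, combined = fisher_combine([0.05, 0.05])
print(stat, combined)
```

Two borderline results reinforce each other, so the combined p-value is smaller than either individual one.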
metaRNASeq
- Functional description:
  
  MetaRNASeq is specialised software for RNA-seq experiments. It is an R package adapted from the metaMA package, which performs meta-analysis of microarray data. Both packages take advantage of empirical Bayes approaches, which are especially appropriate in high-dimensional settings. The specificities of the two technologies nevertheless require adaptations to each one, which explains the development of two separate packages. To make them accessible to a wide audience, a Galaxy web instance named SMAGEXP has been created that gathers the two packages.
- Website:
  
  https://cran.r-project.org/web/packages/metaRNASeq/index.html
MixAll
- Functional description:
  
  MixAll is a model-based clustering package for modelling mixed data sets. It has been engineered around the idea of easy and quick integration of any kind of mixture model for any kind of data, under the conditional independence assumption. Currently five models (Gaussian mixtures, categorical mixtures, Poisson mixtures, Gamma mixtures and kernel mixtures) are implemented. MixAll can natively handle completely missing values when they are assumed to be missing at random. MixAll is used as an R package, but its internals are coded in C++ as part of the STK++ library (www.stkpp.org) for faster computation.
- Website:
  
  https://cran.r-project.org/web/packages/MixAll/
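The conditional independence assumption mentioned above is what makes mixed data and missing values easy to combine: within a class, the density is a product of one factor per variable, so a missing entry is integrated out simply by leaving its factor out. The following sketch illustrates this with one Gaussian and one Poisson variable. It is not MixAll code; all names and parameter values are hypothetical.

```python
import math

def gaussian_logpdf(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)

def poisson_logpmf(x, rate):
    return x * math.log(rate) - rate - math.lgamma(x + 1)

def class_posteriors(row, proportions, components):
    """Posterior class probabilities for one observation.

    components[k] is a list of per-variable log-density functions for class k.
    A value of None marks a missing entry; its factor is skipped, which amounts
    to integrating it out when variables are conditionally independent.
    """
    log_joint = []
    for pi_k, dens_k in zip(proportions, components):
        lp = math.log(pi_k)
        for value, logdens in zip(row, dens_k):
            if value is not None:
                lp += logdens(value)
        log_joint.append(lp)
    top = max(log_joint)  # log-sum-exp shift for numerical stability
    weights = [math.exp(lp - top) for lp in log_joint]
    total = sum(weights)
    return [w / total for w in weights]

# Two classes, two variables: (continuous, count). Parameters are hypothetical.
proportions = [0.6, 0.4]
components = [
    [lambda x: gaussian_logpdf(x, 0.0, 1.0), lambda x: poisson_logpmf(x, 2.0)],
    [lambda x: gaussian_logpdf(x, 3.0, 1.0), lambda x: poisson_logpmf(x, 8.0)],
]
complete = class_posteriors([0.2, 1], proportions, components)
partial = class_posteriors([None, 9], proportions, components)
print(complete, partial)
```

When every entry of a row is missing, the posterior reduces to the mixing proportions, as expected.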
MixtComp.V4
- Functional description:
  
  MixtComp (Mixture Computation) is a model-based clustering package for mixed data originating from the Modal team (Inria Lille). It has been engineered around the idea of easy and quick integration of all new univariate models, under the conditional independence assumption. New models will eventually become available from research carried out by the Modal team or by other teams. Currently, the central architecture of MixtComp is built and its functionality has been field-tested through industry partnerships. Five basic models (Gaussian, Multinomial, Poisson, Weibull, NegativeBinomial) are implemented, as well as two advanced models (Functional and Rank). MixtComp can natively handle missing data (completely missing or known only up to an interval). MixtComp is used as an R package, but its internals are coded in C++ using state-of-the-art libraries for faster computation.
- Website:
  
  https://github.com/modal-inria/MixtComp
MixtComp
- Functional description:
  
  MixtComp (Mixture Computation) is a model-based clustering package for mixed data originating from the Modal team (Inria Lille). It has been engineered around the idea of easy and quick integration of all new univariate models, under the conditional independence assumption. New models will eventually become available from research carried out by the Modal team or by other teams. Currently, the central architecture of MixtComp is built and its functionality has been field-tested through industry partnerships. Three basic models (Gaussian, multinomial, Poisson) are implemented, as well as two advanced models (Ordinal and Rank). MixtComp can natively handle missing data (completely missing or known only up to an interval). MixtComp is used as an R package, but its internals are coded in C++ using state-of-the-art libraries for faster computation.
- Website:
  
  https://cran.r-project.org/web/packages/RMixtComp/index.html
MixCluster
- Functional description:
  
  MixCluster is an R package for model-based clustering of mixed data (continuous, binary, integer). In this package, the model, submitted for publication in 2014, takes into account the main conditional dependencies between variables through a Gaussian copula. MixCluster performs the model selection and provides the best model according to Bayesian approaches.
- Website:
  
  https://r-forge.r-project.org/R/?group_id=1939
MPAGenomics
- Functional description:
  
  MPAgenomics provides functions to preprocess and analyze genomic data. It is devoted to: (i) efficient segmentation and (ii) genomic marker selection from multi-patient copy number and SNP data profiles.
- Website:
  
  https://cran.r-project.org/web/packages/MPAgenomics/index.html
ordinalClust
- Functional description:
  
  Classification, clustering and co-clustering of ordinal data using a model-based approach with the BOS distribution for ordinal data.
- Website:
PACBayesianNMF
- Functional description:
  
  Implements NMF (non-negative matrix factorization) with a PAC-Bayesian approach relying on block gradient descent.
- Website:
  
  https://github.com/astha736/PACbayesianNMF
pycobra
- Functional description:
  
  pycobra is a Python library for ensemble learning, which serves as a toolkit for regression, classification, and visualisation. It is scikit-learn compatible and fits into the existing scikit-learn ecosystem.
  
  pycobra offers a Python implementation of the COBRA algorithm introduced by Biau et al. (2016) for regression.
  
  Another implemented algorithm is the EWA (Exponentially Weighted Aggregate) aggregation technique (among several other references, see the paper by Dalalyan and Tsybakov (2007)).
  
  Apart from these two regression aggregation algorithms, pycobra implements a version of COBRA for classification. This procedure has been introduced by Mojirsheibani (1999).
  
  pycobra also offers various visualisation and diagnostic methods built on top of matplotlib, which let the user analyse and compare different regression machines with COBRA. The Visualisation class also lets you use some of the tools (such as Voronoi tessellations) on other visualisation problems, such as clustering.
- Website:
  
  https://github.com/bhargavvader/pycobra
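The two regression aggregation rules named above can be sketched in a few lines of plain Python. This is an illustration of the ideas only and does not use or reproduce the pycobra API; the function names, argument layout and toy numbers are assumptions. The COBRA rule shown is the unanimous-consensus version, with the usual convention that the prediction is 0 when no training point is retained.

```python
import math

def cobra_predict(query_preds, train_preds, train_targets, epsilon):
    """COBRA-style aggregate for one query point.

    query_preds:   predictions of the M base machines at the query point.
    train_preds:   one list of M machine predictions per held-out training point.
    train_targets: observed responses of those training points.
    Keeps the points on which every machine's prediction is within epsilon of
    that machine's prediction at the query, then averages their responses.
    """
    retained = [
        y for preds, y in zip(train_preds, train_targets)
        if all(abs(p - q) <= epsilon for p, q in zip(preds, query_preds))
    ]
    return sum(retained) / len(retained) if retained else 0.0

def ewa_weights(losses, temperature):
    """Exponential weights: w_m is proportional to exp(-loss_m / temperature)."""
    low = min(losses)  # shifting by the minimum avoids underflow
    raw = [math.exp(-(loss - low) / temperature) for loss in losses]
    total = sum(raw)
    return [r / total for r in raw]

# Two machines, four held-out points (all values hypothetical).
train_preds = [[1.0, 1.1], [1.2, 0.9], [3.0, 3.2], [1.1, 2.5]]
train_targets = [1.0, 1.4, 3.1, 2.0]
prediction = cobra_predict([1.1, 1.0], train_preds, train_targets, epsilon=0.2)
weights = ewa_weights([0.1, 0.5, 2.0], temperature=0.5)
print(prediction, weights)
```

COBRA uses the machines only to decide which training responses are relevant, whereas EWA forms a weighted average of the machines themselves, with weights that decay exponentially in each machine's loss.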
PyRotor
- Functional description:
  
  PyRotor leverages available trajectory data to focus the search space and to estimate some properties which are then incorporated in the optimisation problem. This constrains the optimisation problem in a natural and simple way, so that its solution inherits realistic patterns from the data. In particular, PyRotor does not require any knowledge of the dynamics of the system.
- Website:
  
  https://pypi.org/project/pyrotor/
RankCluster
- Functional description:
  
  The Rankcluster package for R provides a clustering tool for ranking data. Multivariate and partial rankings can also be taken into account.
  Rankcluster now supports tied ranking data.
- Website:
  
  https://cran.r-project.org/web/packages/Rankcluster/index.html
Rmixmod
- Functional description:
  
  MIXMOD (MIXture MODelling) is an important piece of software for the Modal team since it concerns its main topics: model-based supervised, unsupervised and semi-supervised classification for various data situations. MIXMOD is now widely distributed, with over 250 downloads per month recorded for several years. MIXMOD is written in C++ (more than 10 000 lines) and distributed under the GNU General Public License. Several other institutions have participated in the development of MIXMOD for several years: CNRS, Inria Saclay-
- Website:
rtkore
- Functional description:
  
  STK++ (http://www.stkpp.org) is a collection of C++ classes for statistics, clustering, linear algebra, arrays (with an Eigen-like API), regression, dimension reduction, etc. The library is integrated into R using Rcpp. The rtkore package includes the header files from the STK++ core library. All files contain only templated classes or inlined functions. STK++ is licensed under the GNU LGPL version 2 or later. rtkore (the stkpp integration into R) is licensed under the GNU GPL version 2 or later. See file LICENSE.note for details.
- Website:
  
  https://cran.r-project.org/web/packages/rtkore/index.html
simerge
- Functional description:
  
  Performs co-clustering of binary (Bernoulli) and count (Poisson) variables using covariates.
- Website:
STK++
- Functional description:
  
  STK++ (Statistical ToolKit in C++) is a versatile, fast, reliable and elegant collection of C++ classes for statistics, clustering, linear algebra, arrays (with an Eigen-like API), regression, dimension reduction, etc. The library is interfaced with LAPACK for many common linear algebra methods. Some functionalities provided by the library are available in the R environment using rtkpp and rtkore.
  
  STK++ is suitable for projects ranging from small one-off projects to complete data mining application suites.
- Website:
  
  http://www.stkpp.org
MLGL
- Functional description:
  
  The MLGL R package, standing for Multi-Layer Group-Lasso, implements a variable selection procedure for settings where explanatory variables are redundant, which is typical of high-dimensional data.
  The MLGL approach combines variable aggregation and selection in order to improve interpretability and performance. First, a hierarchical clustering procedure provides at each level a partition of the variables into groups. Then, the set of groups of variables from the different levels of the hierarchy is given as input to group-Lasso, with weights adapted to the structure of the hierarchy. At this step, group-Lasso outputs sets of candidate groups of variables for each value of the regularization parameter.
  The freedom MLGL offers to choose groups at different levels of the hierarchy induces, a priori, a high computational complexity. MLGL however exploits the structure of the hierarchy and the weights used in group-Lasso to greatly reduce the final time cost.
  The final choice of the regularization parameter, and therefore the final choice of groups, is made by a multiple hierarchical testing procedure.
- Website:
  
  https://cran.r-project.org/web/packages/MLGL/index.html
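The step in which groups from every level of the hierarchy are pooled before group-Lasso can be sketched as follows. This is an illustrative helper, not MLGL code: the function name is hypothetical, and the merge encoding (the i-th merge creates cluster id n_vars + i) is an assumption borrowed from common hierarchical clustering outputs.

```python
def groups_from_merges(n_vars, merges):
    """Collect the distinct variable groups found across all hierarchy levels.

    merges is the ordered list of agglomeration steps, each a pair of cluster
    ids. Ids 0 .. n_vars-1 are single variables; the i-th merge creates the
    cluster with id n_vars + i.
    """
    members = {i: (i,) for i in range(n_vars)}
    groups = [members[i] for i in range(n_vars)]  # bottom level: singletons
    for step, (a, b) in enumerate(merges):
        merged = tuple(sorted(members[a] + members[b]))
        members[n_vars + step] = merged
        groups.append(merged)  # the only new group created at this level
    return groups

# Four variables: 0 and 1 merge, then 2 and 3, then the two pairs.
groups = groups_from_merges(4, [(0, 1), (2, 3), (4, 5)])
print(groups)
```

Because consecutive levels differ by a single merge, a hierarchy on p variables contributes only 2p - 1 distinct groups, far fewer than listing every partition in full would suggest.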