Tools

Hadoop_g5k (2013)

Hadoop_g5k is a tool that makes it easier to manage Hadoop and Spark clusters and to prepare reproducible experiments on the Grid'5000 platform. It offers a set of command-line scripts and a Python API for interacting with the clusters. It is actively used within the G5k community, where it facilitates the preparation and execution of experiments on the platform.

FP-Hadoop (2012-2013)

FP-Hadoop makes the reduce side of Hadoop MapReduce more parallel and deals efficiently with data skew on the reduce side. FP-Hadoop introduces a new phase, called intermediate reduce (IR), in which dynamically constructed blocks of intermediate values are processed in parallel by intermediate reduce workers. Our experiments with FP-Hadoop on synthetic and real benchmarks have shown large performance gains over native Hadoop, e.g. more than 10 times in reduce time and 5 times in total execution time.
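The idea behind the intermediate reduce phase can be illustrated with a minimal Python sketch. This is not FP-Hadoop code: the function names, the fixed block size, the thread pool, and the choice of a sum as the reduce function are all assumptions made for illustration. In particular, the sketch assumes the reduce function is associative and commutative, so that partial results over blocks can be combined in a final step, and it uses fixed-size blocks where FP-Hadoop constructs blocks dynamically.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def make_blocks(values, block_size):
    """Split the intermediate values of one key into blocks (hypothetical helper).

    Fixed-size blocks are a simplification chosen for this sketch.
    """
    it = iter(values)
    while True:
        block = list(islice(it, block_size))
        if not block:
            return
        yield block


def intermediate_reduce(block):
    """Partial reduce over a single block (assumed here to be a sum)."""
    return sum(block)


def final_reduce(partials):
    """Combine the partial results produced by the IR workers."""
    return sum(partials)


def reduce_with_ir(values, block_size=4, workers=4):
    """Reduce the values of one (possibly skewed) key via parallel IR workers."""
    blocks = list(make_blocks(values, block_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(intermediate_reduce, blocks))
    return final_reduce(partials)


# A single key with many values, standing in for a skewed key.
skewed_values = list(range(1, 101))
total = reduce_with_ir(skewed_values, block_size=10)
```

Without the intermediate phase, all values of a skewed key would be handled by a single reduce worker; here the work on that key is spread over several workers before a cheap final combination.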

SON – Shared-data Overlay Network (2012-2014)

SON is a development tool for P2P networks using web services, JXTA and OSGi. A SON application is developed by designing and implementing a set of components. Each component consists of technical code, which provides the component services, and a code component, which provides the component logic (in Java). The complex aspects of asynchronous distributed programming are kept separate from the code components: the component generator produces them automatically from an abstract description of each component's services.

Open-unmix (2019-2023)

Open-unmix implements state-of-the-art audio/music source separation with deep neural networks using the PyTorch and TensorFlow frameworks. It is intended to serve as a reference in the domain. It includes flexible code for both training and testing the separation networks. The pre- and post-processing around the neural networks includes sophisticated, dedicated multichannel filtering operations.

UMX-PRO (2021-2023)

UMX-PRO implements a full audio separation deep learning pipeline in TensorFlow v2. It provides everything needed to train and use a deep learning model for separating music signals: network architecture, data pipeline, training code, inference code, and pre-trained weights. The software comes with full documentation, detailed comments and unit tests.

VersionClimber (2018-)

VersionClimber is an automated system that helps update the package and data infrastructure of a software application according to priorities set by the user (e.g. the user cares most about having a recent version of a particular package). The system performs a systematic and heuristically efficient exploration (using bounded upward compatibility) of the version search space in a sandbox environment (virtualenv or conda environment), and finally delivers the lexicographically maximum configuration with respect to the user-specified priority order. It runs on Linux and macOS, including in the cloud.
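What "lexicographically maximum configuration" means can be shown with a small Python sketch. This is not VersionClimber's algorithm or API: it is a brute-force search with no heuristics, and the package names, version strings, and the `works` callback (a stand-in for building and testing a configuration in a sandbox) are hypothetical. Packages are listed from highest to lowest priority, and each package's versions are listed from most to least preferred.

```python
from itertools import product


def lexicographic_max_config(candidates, works):
    """Return the first working configuration in lexicographic priority order.

    candidates: list of (package_name, versions) pairs, highest priority first,
                with each versions list ordered from most to least preferred.
    works:      hypothetical callback that returns True if a configuration
                (a dict mapping package name to version) builds and passes tests.
    """
    names = [name for name, _ in candidates]
    version_lists = [versions for _, versions in candidates]
    # product() varies the last package fastest, so configurations are
    # enumerated in lexicographic order of the priority list.
    for combo in product(*version_lists):
        config = dict(zip(names, combo))
        if works(config):
            return config
    return None


# Hypothetical packages: pkg_a has higher priority than pkg_b.
candidates = [
    ("pkg_a", ["3.0", "2.0", "1.0"]),
    ("pkg_b", ["2.0", "1.0"]),
]


def works(config):
    # Toy compatibility rule assumed for illustration only:
    # pkg_a 3.0 cannot be combined with pkg_b 2.0.
    return not (config["pkg_a"] == "3.0" and config["pkg_b"] == "2.0")


best = lexicographic_max_config(candidates, works)
```

In this toy case the search keeps the newest version of the higher-priority `pkg_a` and falls back to an older `pkg_b`, rather than the other way around. An exhaustive search like this grows exponentially with the number of packages, which is why a heuristic exploration is needed in practice.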
