Calendar

Events in February–March 2018

  • Keynote

    Category: Seminars


    February 1, 2018

  • Datamove workshop

    Category: Seminars
    February 8, 2018

  • Untitled Event

    Category: Seminars
    February 15, 2018

    Parallel Sequence Alignment of Whole Chromosomes with Hundreds of GPUs and Pruning (by Alba Cristina Magalhaes Alves de Melo, University of Brasilia)

    Category: Seminars


    February 15, 2018

    Biological sequence alignment is a basic operation in Bioinformatics,
    used routinely worldwide. Smith-Waterman is the exact algorithm used to
    compare two sequences, obtaining the optimal alignment in quadratic
    time and space. In order to accelerate Smith-Waterman, many GPU-based
    strategies have been proposed in the literature. However, aligning DNA
    sequences of millions of characters, or Base Pairs (MBP), is still a
    very challenging task. In this talk, we discuss related work in the
    area of parallel biological sequence alignment and present our
    multi-GPU strategy to align DNA sequences with up to 249 million
    characters on 384 GPUs. In order to achieve this, we propose an
    innovative speculation technique, which is able to parallelize a phase
    of the Smith-Waterman algorithm that is inherently sequential. We
    combined our speculation technique with sophisticated buffer management
    and fine-grain linear-space matrix processing strategies to obtain our
    parallel algorithm. As far as we know, this is the first implementation
    of Smith-Waterman able to retrieve the optimal alignment between
    sequences with more than 50 million characters. We will also present a
    pruning technique for one GPU that is able to prune more than 50% of
    the Smith-Waterman matrix and still retrieve the optimal alignment. We
    will show the results obtained on the Keeneland cluster (USA), where we
    compared all the human × chimpanzee homologous chromosomes (ranging
    from 26 MBP to 249 MBP). The human × chimpanzee chromosome 5 comparison
    (180 MBP × 183 MBP) attained 10.35 TCUPS (Trillions of Cells Updated
    per Second) using 384 GPUs. In this case, we processed 45 petacells and
    produced the optimal alignment in 53 minutes and 7 seconds, with a
    speculation hit ratio of 98.2%.

    Short Bio: Alba Cristina Magalhaes Alves de Melo obtained her PhD
    degree in Computer Science from the Institut National Polytechnique de
    Grenoble (INPG), France, in 1996. In 2008, she did a postdoc at the
    University of Ottawa, Canada; in 2011, she was invited as a Guest
    Scientist at Université Paris-Sud, France; and in 2013 she did a
    sabbatical at the Universitat Politècnica de Catalunya, Spain. Since
    1997, she has worked at the Department of Computer Science at the
    University of Brasilia (UnB), Brazil, where she is now a Full
    Professor. She is also a CNPq Research Fellow, level 1D, in Brazil.
    She was the Coordinator of the Graduate Program in Informatics at UnB
    for several years (2000-2002, 2004-2006, 2008, 2010, 2014), and she
    coordinated international collaboration projects with the Universitat
    Politècnica de Catalunya, Spain (2012, 2014-2016) and with the
    University of Ottawa, Canada (2012-2015). In 2016, she received the
    Brazilian Capes Award for “Advisor of the Best PhD Thesis in Computer
    Science”. Her research interests are High Performance Computing,
    Bioinformatics, and Cloud Computing. She has advised 2 postdocs, 4 PhD
    theses, and 22 MSc dissertations. Currently, she advises 4 PhD
    students and 2 MSc students. She is a Senior Member of the IEEE and a
    Member of the Brazilian Computer Society. She has given invited talks
    at the Universität Karlsruhe, Germany; Université Paris-Sud, France;
    the Universitat Politècnica de Catalunya, Spain; the University of
    Ottawa, Canada; and the Universidad de Chile, Chile. She currently has
    91 papers listed at DBLP
    (www.informatik.uni-trier.de/~ley/db/indices/a-tree/m/Melo:Alba_Cristina_Magalhaes_Alves_de.html).
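
    As background, the quadratic-time Smith-Waterman recurrence that the
    talk accelerates can be sketched in a few lines. This is the plain
    sequential baseline, not the multi-GPU method; the scoring parameters
    are illustrative assumptions.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal local alignment score of strings a and b.

    Fills the (len(a)+1) x (len(b)+1) dynamic programming matrix H,
    clamping every cell at 0 (the hallmark of local alignment), and
    tracks the maximum cell value, which is the optimal score.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best
```

    The anti-diagonal wavefront visible in this double loop is what
    GPU strategies parallelize; the speculation technique described in
    the abstract targets the phase that this simple form leaves
    sequential.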

  • Seminar recess (vacation)

    Category: Seminars
    February 22, 2018

  • Keynote

    Category: Seminars
    March 1, 2018

  • Randomized Load Balancing: Asymptotic Optimality of Power-of-d-Choices with Memory by Jonatha Anselmi (Inria Bordeaux)

    Category: Seminars


    March 8, 2018

    In multi-server distributed queueing systems, the access of stochastically arriving jobs to resources is often regulated by a dispatcher. A fundamental problem consists in designing a load balancing algorithm that minimizes the delays experienced by jobs. During the last two decades, the power-of-d-choices algorithm, based on the idea of dispatching each job to the least loaded server out of $d$ servers randomly sampled at the arrival of the job itself, has emerged as a breakthrough in the foundations of this area due to its versatility and appealing asymptotic properties. We consider the power-of-d-choices algorithm with the addition of a local memory that keeps track of the latest observations collected over time on the sampled servers. Then, each job is sent to a server with the lowest observation. We show that this algorithm is asymptotically optimal in the sense that the load balancer can always assign each job to an idle server in the large-server limit. This holds true if and only if the system load $\lambda$ is less than $1-\frac{1}{d}$. If this condition is not satisfied, we show that queue lengths are bounded by $j^\star+1$, where $j^\star\in\mathbb{N}$ is given by the solution of a polynomial equation. This is in contrast with the classic version of the power-of-d-choices algorithm, where queue lengths are unbounded. Our upper bound on the size of the most loaded server, $j^\star+1$, is tight and increases slowly when $\lambda$ approaches its critical value from below. For instance, when $\lambda = 0.995$ and $d=2$ (respectively, $d=3$), we find that no server will contain more than $5$ (respectively, $3$) jobs in equilibrium. Our results quantify and highlight the importance of using memory as a means to enhance performance in randomized load balancing.
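
    For intuition, the dispatching rule described above can be sketched in
    a few lines of Python. This is a simplified illustration, not the
    paper's model: service completions are omitted, and the data
    structures and $d$ value are assumptions.

```python
import random

def dispatch_with_memory(queues, memory, d):
    """Power-of-d-choices with memory: sample d servers, refresh their
    observed queue lengths in the local memory, then send the job to the
    server with the lowest observation among all remembered servers."""
    sampled = random.sample(range(len(queues)), d)
    for s in sampled:
        memory[s] = queues[s]            # refresh observations for sampled servers
    target = min(memory, key=memory.get)  # lowest observation, possibly stale
    queues[target] += 1                   # job joins the chosen server
    memory[target] += 1                   # keep our estimate of that server current
    return target
```

    The memory is what distinguishes this from the classic scheme: old
    observations of idle servers can be exploited later, which is why the
    dispatcher can keep finding idle servers when $\lambda < 1-\frac{1}{d}$.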

  • Obtaining Dynamic Scheduling Policies with Simulation and Machine Learning (by Danilo Santos, Datamove)

    Category: Seminars


    March 15, 2018


    Abstract: Dynamic scheduling of tasks in large-scale HPC platforms is normally accomplished using ad hoc heuristics, based on task characteristics, combined with some backfilling strategy. Defining heuristics that work efficiently in different scenarios is a difficult task, especially when considering the large variety of task types and platform architectures. In this work, we present a methodology based on simulation and machine learning to obtain dynamic scheduling policies. Using simulations and a workload generation model, we can determine the characteristics of tasks that lead to a reduction in the mean slowdown of tasks in an execution queue. Modeling these characteristics using a nonlinear function and applying this function to select the next task to execute in a queue improved the mean task slowdown in synthetic workloads. When applied to real workload traces from highly different machines, these functions still resulted in performance improvements, attesting to the generalization capability of the obtained heuristics.
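
    As a toy illustration, such a dynamic policy reduces to ranking the
    queued tasks with a score function; the learned nonlinear function
    plays the role of `score` below. The task representation and field
    names are assumptions for the sketch.

```python
def slowdown(wait_time, run_time):
    """Slowdown of a task: total time in the system divided by run time."""
    return (wait_time + run_time) / run_time

def pick_next(queue, now, score):
    """Dynamic policy: run the queued task that maximizes `score`.
    Here `score` stands in for the nonlinear function obtained via
    simulation and machine learning; its exact features are unknown
    to this sketch."""
    return max(queue, key=lambda task: score(task, now))
```

    With a hand-written `score` such as `lambda t, now: -t["req"]` this
    degenerates to shortest-job-first; the point of the methodology is to
    learn a better function from simulated workloads.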

  • A Class of Stochastic Multilayer Networks: Percolation, Exact and Asymptotic Results by Philippe Nain (inria, Lyon)

    Category: Seminars


    March 22, 2018

    Abstract:
    In this talk, we will introduce a new class of stochastic multilayer networks. A stochastic multilayer network is the aggregation of M networks (one per layer) where each is a subgraph of a foundational network G. Each layer network is the result of probabilistically removing links and nodes from G. The resulting network includes any link that appears in at least K layers. This model, which is an instance of a non-standard site-bond percolation model, finds applications in wireless communication networks with multichannel radios, multiple social networks with overlapping memberships, transportation networks, and, more generally, in any scenario where a common set of nodes can be linked via co-existing means of connectivity. Percolation, exact and asymptotic results will be presented.

    Bâtiment IMAG (442)
    Saint-Martin-d'Hères, 38400
    France
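
    The layer-aggregation model can be illustrated with a small sketch.
    Only edge removal is modeled here; node removal, also part of the
    model, is omitted for brevity, and all names are illustrative.

```python
import random

def aggregate_layers(edges, M, K, p_keep, seed=42):
    """Build M layer subgraphs by independently keeping each edge of the
    foundational network G with probability p_keep, then return the set
    of edges that survive in at least K of the M layers."""
    rng = random.Random(seed)
    counts = {e: 0 for e in edges}
    for _ in range(M):                 # one Bernoulli trial per edge per layer
        for e in edges:
            if rng.random() < p_keep:
                counts[e] += 1
    return {e for e, c in counts.items() if c >= K}
```

    Percolation questions then ask whether the aggregated edge set still
    contains a giant connected component as the parameters vary.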
  • Parallel Space-Time Kernel Density Estimation by Erik Saule (University of North Carolina)

    Category: Seminars


    March 28, 2018

    The exponential growth of available data has increased the need for
    interactive exploratory analysis. Datasets can no longer be understood
    through manual crawling and simple statistics. In Geographical
    Information Systems (GIS), the dataset is often composed of events
    localized in space and time, and visualizing such a dataset involves
    building a map of where the events occurred.

    We focus in this paper on events that are localized along three
    dimensions (latitude, longitude, and time), and on computing the first
    step of the visualization pipeline, space-time kernel density
    estimation (STKDE), which is the most computationally expensive.
    Starting from a gold-standard implementation, we show how algorithm
    design and engineering, parallel decomposition, and scheduling can be
    applied to bring near real-time computing to space-time kernel density
    estimation. We validate our techniques on real-world datasets
    extracted from infectious disease, social media, and ornithology.

    Bâtiment IMAG (442)
    Saint-Martin-d'Hères, 38400
    France
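
    For reference, a naive STKDE evaluation at a single query point might
    look like the sketch below. The Gaussian kernel and its normalization
    are common choices assumed here, not necessarily the ones used in the
    talk; real pipelines evaluate this on a dense 3D grid, which is why
    the step dominates the cost.

```python
import math

def stkde(x, y, t, events, hs, ht):
    """Space-time kernel density estimate at location (x, y) and time t.

    Each event is a tuple (ex, ey, et). A product of Gaussian kernels is
    used, with spatial bandwidth hs and temporal bandwidth ht."""
    total = 0.0
    for ex, ey, et in events:
        ds = ((x - ex) ** 2 + (y - ey) ** 2) / (2 * hs ** 2)  # spatial term
        dt = (t - et) ** 2 / (2 * ht ** 2)                    # temporal term
        total += math.exp(-(ds + dt))
    norm = len(events) * (2 * math.pi) ** 1.5 * hs ** 2 * ht  # Gaussian constant
    return total / norm
```

    Evaluating this at every voxel of a latitude × longitude × time grid
    gives the quadratic-looking cost that the parallel decomposition in
    the talk attacks.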
  • Polyhedral Optimization at Runtime, by Manuel Selva.

    Category: Seminars


    March 29, 2018

    The polyhedral model has proven to be very useful for optimizing and parallelizing a particular class of compute-intensive application kernels. A polyhedral optimizer needs affine functions defining loop bounds, memory accesses, and branching conditions. Unfortunately, this information is not always available at compile time. To broaden the scope of polyhedral optimization opportunities, runtime information can be considered. This talk will highlight the challenges of integrating polyhedral optimization in runtime systems:

    - When and how to detect opportunities for polyhedral optimization?
    - How to model the observed runtime behavior in a polyhedral fashion?
    - How to deal at runtime with the complexity of polyhedral algorithms?

    These challenges will be illustrated in the context of both the APOLLO framework, targeting C and C++ applications, and the JavaScript engine from Apple.

    Bâtiment IMAG (442)
    Saint-Martin-d'Hères, 38400
    France
