Online Bayesian Moment Matching for Latent Dirichlet Allocation

when: may 20, 2014 from 11 am to noon

where: A00

speaker: Pascal Poupart

title: Online Bayesian Moment Matching for Latent Dirichlet Allocation

abstract: Latent Dirichlet Allocation (LDA) is one of the most important models for the unsupervised discovery of latent components such as topics, entities and relations based on unlabeled corpora. The estimation of LDA parameters is challenging since maximum likelihood training leads to a non-convex optimization problem and Bayesian learning yields a posterior that consists of a mixture of exponentially many Dirichlet components. As a result, practitioners typically resort to approximations based on Gibbs sampling, variational Bayes or expectation propagation. While these approximations often work well in practice, it is not clear how to detect convergence in Gibbs sampling and local optima may trap both variational Bayes and expectation propagation. Hence, there is a need for more robust techniques. In this talk, I will describe a Bayesian moment matching technique that performs exact inference with respect to a set of initial distributions (different from the prior). The approach starts with a partially defined prior where only the first, second and third order moments are specified. The remaining moments of the prior are selected gradually on a need basis as the data of the corpus is processed. This lazy approach allows us to select moments that ensure tractability of the computation. As a result, the algorithm scales linearly with the amount of data and performs exact Bayesian learning with respect to this incrementally specified prior. The approach will be demonstrated on a topic modeling task with social media data. In this talk, I will describe a Bayesian moment matching technique to estimate the parameters of the Latent Dirichlet Allocation (LDA) model. While Bayesian learning yields a posterior with exponentially large mixture of Dirichlets, the full posterior is rarely used in practice. A few moments (i.e, mean and variance) are often sufficient to estimate the parameters and some form of confidence. We propose a moment matching technique to compute the first and second order moments of the posterior in an online fashion. While moment matching leads to an approximation, we show that there exists a set of initial distributions (different from the prior) for which exact inference yields the same first and second order moments in the posterior. The approach compares favorably to existing parameter estimation techniques for LDA both in terms of accuracy and time on synthetic and real data from social media.

Spectral Bandits for Smooth Graph Functions

when: may 16, 2014 from 11 am

where: A00

Speaker: Tomáš Kocák

title: Spectral Bandits for Smooth Graph Functions

abstract: Smooth functions on graphs have wide applications in manifold and semi-supervised learning. In this paper, we study a bandit problem where the payoffs of arms are smooth on a graph. This framework is suitable for solving online learning problems that involve graphs, such as content-based recommendation. In this problem, each recommended item is a node and its expected rating is similar to its neighbors. The goal is to recommend items that have high expected ratings. We aim for the algorithms where the cumulative regret would not scale poorly with the number of nodes. In particular, we introduce the notion of an effective dimension, which is small in real-world graphs, and propose two algorithms for solving our problem that scale linearly in this dimension. Our experiments on real-world content recommendation problem show that a good estimator of user preferences for thousands of items can be learned from just tens nodes evaluations.

MLSS Summer school

when: april 25, 2014 to may 4, 2014

where: iceland

The Machine Learning Summer School will take place at Reykjavik University in Reykjavik, Iceland, from April 25 to May 4, 2014.

The field of machine learning is at the intersection of computer science, statistics, mathematics, and optimization. The Machine Learning Summer School (MLSS) is a great venue for graduate students, researchers, and professionals to learn about fundamental and advanced methods of machine learning, data analysis, and inference, from theory to practice.

The Machine Learning Summer School in Reykjavik will feature an exciting program with talks from leading experts in the field.

The total variation on hypergraphs – Learning on hypergraphs revisited

when: April 4, 2014 from 11 am to noon

where: room B21

speaker: Shyam Sundar

title: The total variation on hypergraphs – Learning on hypergraphs revisited

abstract: In this talk, I present a new learning framework on hypergraphs which, unlike the existing methods, fully uses the hypergraph structure. The key element in our learning framework (for both semi-supervised learning and clustering) is a family of regularizers based on the total variation on hypergraphs. These novel regularizers exactly preserve the hypergraph cut functional and thereby yield unbiased formulations for the learning problems on
hypergraphs. I will also show how we efficiently solve, based on a primal-dual algorithm, the convex optimization problems occurring in the resulting learning problems. Experiments are presented showing better scalability and quality of our approach over the state-of-the-art.

Network‐based Prediction of Adverse Drug Events and Interactions

when: Thursday, March 20th, 2014 3:00PM-4:00PM
where: room B21 (bât. B, ave Halley)
speaker: Aurel Cami
title: Network‐based Prediction of Adverse Drug Events and Interactions
Adverse drug events (ADEs) and drug‐drug interactions (DDIs) are a serious concern for Public Health and the pharmaceutical industry. Every year, millions of patients are affected by drug adverse effects and interactions and billions of dollars are spent to treat the affected patients. Likewise, drug toxicity and adverse effects are lead causes of the high attrition rates and the decreased productivity in the pharmaceutical industry. We have recently developed integrative, network‐based predictive approaches for ADEs and DDIs. Rather than waiting for sufficient post‐marketing evidence to accumulate for a specific ADE or DDI, these predictive approaches rely on leveraging contextual information from previously known drug‐safety relationships, and thus have the potential to predict certain candidate effects and interactions earlier than they can be detected by existing pharmacovigilance methods. We will discuss the major facets of these new approaches, including the data types used to develop the predictive models, feature computation, training, scoring, simulated prospective validation, and the impact of safety data choice on the prediction performance. We will conclude with a discussion of the connections between our work and some recently developed system pharmacology predictive approaches for ADEs.

Diffusion strategies for cooperative reinforcement learning

When: February 14, 2014 from 11 AM to noon

Where: room W01

Speaker: Sergio Valcarcel

Title: Diffusion strategies for cooperative reinforcement learning

Abstract: We will introduce diffusion strategies and apply them to develop fully-distributed cooperative reinforcement learning solutions, in which agents in a network communicate only with their immediate neighbors to improve predictions about their environment. These algorithms are especially beneficial for off-policy sampling (meaning that the agents can learn to predict the response to a behavior different from the actual policies they are following): when the individual agents can estimate the solution, cooperation increases stability and reduces bias and variance of the prediction error; but, more importantly, the network is able to approach the optimal solution even when none of the individual agents could (e.g., when the individual behavior policies restrict each agent to sample a small portion of the state space). If time permits, we will sketch a mean-square-error performance analysis that establishes convergence under constant step-size updates.

Online Clustering of Bandits (in a Social Network)

When: January 17, 2014 from 11AM to noon

Where: room B21

Speaker: Claudio Gentile

Title: Online Clustering of Bandits (in a Social Network)

Abstract: Bandit algorithms have become a standard tool for facing the the so-called exploration-exploitation
dilemma that naturally arises in learning problems with partial information. Recommendation systems
are one of the best applications where such algorithmic solutions may be of relevance. In many cases,
these applications have social components (either explicit or hidden), whose integration in the bandit
algorithms could lead to a significant performance increase. For instance, we may want to serve content
to a group of users by taking advantage of an underlying network of social relationships among them.
In this talk, I’ll review very recent research activity in the context of stochastic bandit algorithms
where the network of users is either known beforehand, or has to be inferred on the fly based on
observed data. I’ll present algorithms, associated analyses, and empirical evidence on both synthetic
and real-world data.

Mika and Fabio recaps on NIPS

When: January 10, 2014 from 11AM to noon.

Where: room B21

Semi-supervised clustering

When: November 7, 2013 from 2pm.

Where: room B21

Speaker: David Chatel

Title: Semi-supervised clustering

A Spectral framework for Un-directed Hypergraphs

When: October 4, 2013 from 11AM to noon.

Where: room B21

Speaker: Thomas Ricatte

Title: A Spectral framework for Un-directed Hypergraphs

Abstract: Undirected graphs are a natural way to formalize pairwise homophilic relationships between objects. Spectral methods based on graph Laplacians and graph Kernels allow us to leverage these relations in many efficient applications (semi-supervised learning, clustering, …). We propose a natural extension of this framework to represent interactions between groups of collaborating objects through the notion of undirected hypergraphs. We extend the notion of graph Laplacian and graph Kernel for this newly defined class and discuss the related properties (connectivity, resistance distance, links with signed graphs). We also see that the newly defined space of hypergraph Laplacians and hypergraph Kernels is a natural convex relaxation of the space of graph Kernels. Thus in many cases, combining graph kernels will lead to a direct interpretation in terms of undirected hypergraph.