Title: Mining Uncertain Data Streams.
Abstract: Dealing with uncertainty has gained increasing attention over the past few years in both static and streaming data management and mining. Uncertainty has many possible sources, such as noise introduced when data are collected, noise injected for privacy reasons, or the often ambiguous semantics of search engine results. As a result, many sensitive domains, including scientific applications, now involve massive uncertain data. The problem is even more difficult for uncertain data streams, where massive, frequent updates must be taken into account while respecting data stream constraints. In this context, discovering Probabilistic Frequent Itemsets (PFI) is very challenging, since algorithms designed for deterministic data are not applicable.
In this talk, I will present our recent work with Reza Akbarinia on this topic. We propose FMU (Fast Mining of Uncertain data streams), the first solution for exact PFI mining in data streams with sliding windows. FMU updates the frequentness probability of an itemset whenever a transaction is added to or removed from the observation window. Using these update operations, we are able to extract PFI in sliding windows with very low response times.
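To give a feel for what such update operations can look like, here is a minimal Python sketch. It is not the FMU algorithm itself, whose details are not given in this abstract. It assumes that transactions are independent, that each transaction contains the itemset with a known probability p, and that the frequentness probability is the probability that the support reaches a threshold minsup. Under these assumptions the support follows a Poisson binomial distribution, which can be updated in time linear in the window size when a transaction enters or leaves. The class and method names are hypothetical.

```python
from collections import deque


class SlidingSupport:
    """Support distribution of one itemset over a sliding window (illustrative)."""

    def __init__(self):
        self.pmf = [1.0]       # pmf[k] = P(support == k); empty window has support 0
        self.window = deque()  # per-transaction probabilities, oldest first

    def add(self, p):
        """Add a transaction that contains the itemset with probability p."""
        old = self.pmf
        new = [0.0] * (len(old) + 1)
        for k, mass in enumerate(old):
            new[k] += mass * (1.0 - p)   # itemset absent from the new transaction
            new[k + 1] += mass * p       # itemset present in the new transaction
        self.pmf = new
        self.window.append(p)

    def remove_oldest(self):
        """Remove the oldest transaction by inverting the update done in add()."""
        p = self.window.popleft()
        new = self.pmf
        n = len(new) - 1                 # length of the pmf after removal
        old = [0.0] * n
        if p <= 0.5:
            # Forward recursion: divides by (1 - p), which is at least 0.5 here.
            prev = 0.0
            for k in range(n):
                old[k] = (new[k] - prev * p) / (1.0 - p)
                prev = old[k]
        else:
            # Backward recursion: divides by p, which is above 0.5 here.
            nxt = 0.0
            for k in range(n - 1, -1, -1):
                old[k] = (new[k + 1] - nxt * (1.0 - p)) / p
                nxt = old[k]
        self.pmf = old

    def frequentness(self, minsup):
        """Return P(support >= minsup) for the current window."""
        return sum(self.pmf[minsup:])


# Usage: three uncertain transactions, then the window slides by one.
s = SlidingSupport()
for prob in (0.5, 0.8, 0.4):
    s.add(prob)
before = s.frequentness(2)
s.remove_oldest()
after = s.frequentness(2)
```

The two recursions in remove_oldest are chosen so that the division is always by a value of at least 0.5, which limits numerical error when p is close to 0 or 1. Repeated removals can still accumulate floating-point drift, so a long-running implementation would likely need to rebuild the distribution from scratch periodically.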