Complex scene perception: contextual estimation of dynamic interacting variables, by Sileye Ba

Monday, January 13, 2014, 2:00 to 3:00 pm, room F107, INRIA Montbonnot

Seminar by Sileye Ba, RN3D Innovation Lab, Marseille


In recent years, researchers have devoted considerable attention to complex scene perception. Studying scene perception requires modelling multiple interacting variables. Computer vision and machine learning have led to significant advances, with models integrating multimodal observations (images, audio, depth, ...). An advantage of observations from imaging sensors is that they span scales: small-scale observations from recording devices such as webcams (e.g. meeting recordings, videoconference data) and large-scale observations acquired by satellite imaging sensors (e.g. geophysical variables such as ocean surface temperatures and currents). In this talk, we present investigations we have conducted into modelling dynamic interacting variables from webcam and satellite observations.

In the first part of the talk, we present a Bayesian model for multi-person visual focus of attention (VFOA) and conversational event estimation in meetings, using head pose observations and audio-visual contextual information. Instead of recognizing the VFOA of each meeting participant individually, we propose to jointly estimate the VFOA of all participants, which allows us to introduce interaction models related to group activities and the social dynamics of conversation. Meeting contextual information is introduced through conversational events, which identify the people involved in conversations, and through projection screen activity, which affects participants' temporal VFOA dynamics during presentations. As the interaction model, we introduce a dynamic Bayesian network with the VFOA and conversational events as hidden variables. As observations, we use people's head orientations, speaking statuses, and the projection screen activity.
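To make the structure of such a model concrete, here is a minimal toy sketch (an illustration, not the talk's actual model): a joint hidden state standing in for a (VFOA, conversational event) configuration is tracked with a forward filter that combines sticky transition dynamics with per-frame observation likelihoods (playing the role of head-pose and speaking-status cues). The state count, transition values, and likelihoods are all hypothetical.

```python
import numpy as np

# Hypothetical joint (VFOA, conversational event) configurations.
n_states = 4

# Sticky transition matrix: states tend to persist across frames,
# a common assumption for conversational dynamics.
T = np.full((n_states, n_states), 0.1)
np.fill_diagonal(T, 0.7)
T /= T.sum(axis=1, keepdims=True)

# Stand-in per-frame observation likelihoods (head pose, speaking status).
rng = np.random.default_rng(0)
obs_lik = rng.random((5, n_states))

# Forward filtering: predict with the dynamics, then reweight by the
# observation likelihood and renormalize.
belief = np.full(n_states, 1.0 / n_states)
for t in range(obs_lik.shape[0]):
    belief = T.T @ belief      # predict
    belief *= obs_lik[t]       # update
    belief /= belief.sum()     # normalize

print(int(np.argmax(belief)))  # MAP joint state at the last frame
```

In the actual model the hidden state factorizes over participants and events, but the predict/update recursion above is the core inferential step shared by such dynamic Bayesian networks.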

In the second part of the talk, we present a variational data assimilation model for estimating interacting dynamic geophysical variables. Satellite imaging with multimodal sensors (infrared, optical, radar) allows the acquisition of global-scale observations of ocean parameters (sea surface temperature, ocean altimetry, and ocean surface chlorophyll concentration). Depending on the acquisition sensor, observations may contain spatially or temporally missing data. We conducted research on the estimation of sea surface temperature, sea surface chlorophyll concentration, and ocean currents. Multi-scale spatio-temporal interactions between geophysical variables are explicitly introduced into the variational data assimilation model during the reconstruction process.
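The flavour of variational reconstruction from partial observations can be sketched on a toy 1D problem (an assumption for illustration, not the talk's model): a field x is recovered from masked noisy observations y by minimizing a cost J(x) = ||Hx - y||^2 + lam * ||Dx||^2, where H is the observation (masking) operator and the first-difference penalty D stands in for the dynamical prior used in data assimilation. All sizes and weights here are hypothetical.

```python
import numpy as np

n = 100
x_true = np.sin(np.linspace(0, 2 * np.pi, n))        # synthetic "geophysical" field

# Observe ~30% of the field, with additive noise (missing-data scenario).
mask = np.random.default_rng(1).random(n) < 0.3
noise = 0.05 * np.random.default_rng(2).standard_normal(int(mask.sum()))
y = x_true[mask] + noise

H = np.eye(n)[mask]                 # observation operator: selects observed rows
D = np.diff(np.eye(n), axis=0)      # first-difference regularizer (prior stand-in)
lam = 1.0                           # trade-off between data fit and smoothness

# Minimizer of J(x) via the normal equations: (H^T H + lam D^T D) x = H^T y
x_hat = np.linalg.solve(H.T @ H + lam * D.T @ D, H.T @ y)

print(float(np.mean((x_hat - x_true) ** 2)))          # reconstruction error
```

In real assimilation problems the quadratic smoothness term is replaced by a (possibly nonlinear) dynamical model of the geophysical variables, and the cost is minimized iteratively rather than in closed form, but the structure of the trade-off is the same.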

Short bio: In 2002, Sileye Ba obtained a Master's degree in mathematics, computer vision, and machine learning from Ecole Normale Superieure de Cachan in Paris. From September 2003 to April 2009, first as a PhD student at the IDIAP Research Institute (with an affiliation to Ecole Polytechnique Federale de Lausanne, Switzerland) and then as a postdoctoral researcher at the same institute, he worked on probabilistic methods for head pose tracking and human behaviour recognition from audio-video data. From May 2009 to February 2013, he was a postdoctoral researcher in the Signal and Communications Department of Telecom Bretagne in Brest (France), working on variational data assimilation methods for modelling dynamic ocean geophysical variables from multimodal satellite image sequences. Since March 2013, as a research and development engineer at RN3D Innovation Lab in Marseille (France), he has worked on computer vision and machine learning methods for near-infrared image sequence analysis.