Category: Video understanding

An Object Tracking in Particle Filtering and Data Association Framework, Using SIFT Features

An article published in ICDP 2011.

Authors: M. Souded, L. Giulieri and F. Bremond

The authors address the problem of multi-object tracking in a video surveillance context with single static cameras. They propose a novel approach for multi-object tracking in a particle filtering and data association framework, allowing real-time tracking and dealing with the most important challenges: (1) selecting and tracking the real objects of interest in noisy environments, and (2) managing occlusion situations. In this study, the tracker input comes from a motion detection approach (classically based on background subtraction and clustering). Particle filtering has proven very successful for non-linear and non-Gaussian estimation problems. The article presents the tracking of SIFT features within a particle filtering and data association framework. The performance of the proposed algorithm is evaluated on sequences from the ETISEO, CAVIAR, PETS2001 and VS-PETS2003 datasets in order to show the improvements upon state-of-the-art methods.

Diagram of the proposed object tracking framework
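To illustrate the particle-filtering backbone the article builds on, here is a minimal bootstrap particle filter for 2D position tracking. This is a generic sketch, not the authors' SIFT-based implementation: the random-walk motion model, the Gaussian observation likelihood and all noise parameters are illustrative assumptions.

```python
import math
import random

def particle_filter_step(particles, weights, observation,
                         motion_noise=1.0, obs_noise=2.0):
    """One predict/update/resample cycle of a bootstrap particle filter.

    particles: list of (x, y) position hypotheses,
    weights: their normalised weights,
    observation: the measured (x, y) position for the current frame.
    """
    # Predict: propagate each particle with a random-walk motion model.
    particles = [(x + random.gauss(0, motion_noise),
                  y + random.gauss(0, motion_noise))
                 for x, y in particles]
    # Update: weight each particle by the Gaussian likelihood of the observation.
    ox, oy = observation
    weights = [w * math.exp(-((x - ox) ** 2 + (y - oy) ** 2)
                            / (2 * obs_noise ** 2))
               for (x, y), w in zip(particles, weights)]
    total = sum(weights) or 1e-300
    weights = [w / total for w in weights]
    # Resample: draw a new particle set proportionally to the weights.
    particles = random.choices(particles, weights=weights, k=len(particles))
    weights = [1.0 / len(particles)] * len(particles)
    return particles, weights

random.seed(0)
n = 500
particles = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(n)]
weights = [1.0 / n] * n
# Feed a short sequence of noisy observations of a slowly moving target.
for obs in [(5.0, 5.0), (5.5, 5.2), (6.0, 5.5)]:
    particles, weights = particle_filter_step(particles, weights, obs)
# Point estimate: the mean of the resampled particle cloud.
estimate = (sum(x for x, _ in particles) / n, sum(y for _, y in particles) / n)
```

After a few cycles the particle cloud concentrates around the observed positions, which is the behaviour the data association stage then exploits.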



The full version of the article can be downloaded here.

Object tracking in SUP

Involved people: Duc Phu CHAU, Francois BREMOND and Monique THONNAT

SUP (Scene Understanding Platform, developed by the Stars team) provides an object appearance-based tracking algorithm. This tracker includes two main plugins: ParametrableF2Ftracking and LTT.

The objective of the ParametrableF2Ftracking plugin is to establish object links within a sliding time window. For each detected object pair in a given temporal window of size T1, we compute the link score (i.e. instantaneous similarity). A temporary link is established between two objects when their link score is greater than or equal to a predefined threshold. At the end of this stage, we obtain a weighted graph whose vertices are the detected objects in the considered temporal window and whose edges are the temporarily established links. Each edge is associated with a score representing the link score between its two vertices (see the figure below). For each detected object, we search for its matched objects within a predefined radius and a given temporal window to establish possible links, so that even when a mobile object cannot be detected in some frames, it can still find its successor objects.

The graph representing the established links of the detected objects in a temporal window of size T1 frames.
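A minimal sketch of how such temporary links might be established follows. The link-score function, the threshold and the radius are placeholders, not the plugin's actual similarity measure; the point is only the structure of the weighted graph over a temporal window.

```python
import itertools
import math

def link_score(obj_a, obj_b):
    """Placeholder instantaneous similarity: decays with 2D distance."""
    dx = obj_a["x"] - obj_b["x"]
    dy = obj_a["y"] - obj_b["y"]
    return math.exp(-math.hypot(dx, dy) / 50.0)

def build_link_graph(detections, window_size, threshold, radius):
    """detections: list of dicts with 'id', 'frame', 'x', 'y'.

    Returns weighted edges {(id_a, id_b): score} for object pairs detected
    in different frames of the same temporal window, within a search radius,
    whose link score reaches the threshold."""
    edges = {}
    for a, b in itertools.combinations(detections, 2):
        if a["frame"] == b["frame"]:
            continue  # links join objects from different frames
        if abs(a["frame"] - b["frame"]) >= window_size:
            continue  # outside the sliding temporal window T1
        if math.hypot(a["x"] - b["x"], a["y"] - b["y"]) > radius:
            continue  # outside the predefined search radius
        score = link_score(a, b)
        if score >= threshold:
            edges[(a["id"], b["id"])] = score
    return edges

detections = [
    {"id": 1, "frame": 0, "x": 10, "y": 10},
    {"id": 2, "frame": 2, "x": 15, "y": 12},    # close: should link to 1
    {"id": 3, "frame": 2, "x": 300, "y": 300},  # far away: no link
]
edges = build_link_graph(detections, window_size=10, threshold=0.5, radius=100)
```

The resulting dictionary plays the role of the weighted graph in the figure above: vertices are detections, edges carry link scores.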

The goal of the LTT plugin is to determine the trajectories of the mobile objects. For each object ot detected at instant t (called the son object), we consider all its matched objects (i.e. objects with temporarily established links) in previous frames (called father objects) that do not yet have official links (i.e. trajectories) to any object detected in the following frames. For each such object pair, we define a global score as a function of the link score (i.e. instantaneous score) and a long-term score.

The father object with the highest global score is considered as the temporary father of object ot. After considering all objects at instant t, if more than one object gets the same object as temporary father, the father-son pair with the highest global score value is kept, and the link between this pair becomes official (i.e. officially becomes a trajectory segment). A mobile object is no longer tracked if it cannot establish any official link for T1 consecutive frames.
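The father-son resolution described above can be sketched as follows. The global scores are taken as given inputs; computing them from the link score and the long-term score is left out, and all identifiers are illustrative.

```python
def resolve_official_links(candidates):
    """candidates: list of (son_id, father_id, global_score) triples.

    Step 1: each son keeps its best-scoring father as a temporary father.
    Step 2: if several sons share a temporary father, only the pair with
    the highest global score becomes an official trajectory link."""
    # Step 1: best father per son.
    best_father = {}
    for son, father, score in candidates:
        if son not in best_father or score > best_father[son][1]:
            best_father[son] = (father, score)
    # Step 2: resolve conflicts on fathers claimed by several sons.
    official = {}
    for son, (father, score) in best_father.items():
        if father not in official or score > official[father][1]:
            official[father] = (son, score)
    return {son: father for father, (son, _) in official.items()}

links = resolve_official_links([
    ("s1", "f1", 0.9),
    ("s1", "f2", 0.4),
    ("s2", "f1", 0.7),  # conflicts with s1 on f1, with a lower score
])
# s1 keeps f1 (0.9 beats s2's 0.7); s2 is left unlinked at this instant.
```

A son left unlinked may still be matched at a later instant, as long as temporary links survive within the T1 window.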

A reference of this work can be found here.

Object tracking description

Involved people: Duc Phu CHAU, Julien BADIE and Malik SOUDED

The aim of an object tracking algorithm is to generate the trajectories of objects over time by locating their positions in every frame of the video. An object tracker may also provide the complete region of the image that is occupied by the object at every time instant.

Duc Phu CHAU’s work focuses on control methods to adapt the tracking process to scene variations. Julien BADIE focuses on the global tracking task, whose objective is to correct the object trajectories. He is also studying approaches to make trackers smarter and more flexible. Malik SOUDED studies mobile object tracking for multi-camera systems.


ViSEvAl result comparison

The goal of this script is to compare different result files of the same sequence.

To run the script, use the following command:

python <resultFile1.txt> ... <resultFileN.txt>

The result files must be ViSEvAl output files generated by the ViSEvAlEvaluation binary.

The script displays useful information from the result files, and the best configuration for each metric is highlighted in green. The “Show all” switch toggles the display between global results and per-detected-object metrics.
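The comparison logic amounts to picking, for each metric, the result file with the best value. A simplified sketch, assuming every metric value has been parsed into a dictionary and that higher is always better (which the real script need not assume):

```python
def best_per_metric(results):
    """results: {filename: {metric: value}}.

    Returns {metric: filename_with_best_value}, assuming higher is better.
    Files missing a metric are treated as minus infinity for that metric."""
    metrics = set()
    for values in results.values():
        metrics.update(values)
    return {m: max(results, key=lambda f: results[f].get(m, float("-inf")))
            for m in metrics}

# Hypothetical parsed contents of two ViSEvAl result files.
results = {
    "run_a.txt": {"tracking": 0.71, "detection": 0.80},
    "run_b.txt": {"tracking": 0.78, "detection": 0.75},
}
best = best_per_metric(results)
```

The file names returned per metric are the ones the script would highlight in green.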

XML1 Viewer

XML1 Viewer is a Python script showing statistics and information about the XML1 output of SUP.

To run the script, use the following command:

python <XML1file>

With only an XML1 file as input, the script displays all the detected objects with the following statistics:

  • number of frames
  • total duration (last frame – first frame)
  • average 2D dimension (in pixels)
  • average 3D dimension (in meters)


If the directory containing the sequence images is added as input, the images of each detected object are also displayed.
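Assuming the XML1 file has been parsed into a flat list of per-frame detections, the statistics listed above could be computed as follows. The field names are illustrative, not the actual XML1 schema.

```python
def object_statistics(detections):
    """detections: list of dicts with 'object_id', 'frame',
    'w2d'/'h2d' (pixels) and 'w3d'/'h3d' (metres).

    Returns, per object: number of frames, total duration
    (last frame minus first frame), and average 2D/3D dimensions."""
    by_object = {}
    for d in detections:
        by_object.setdefault(d["object_id"], []).append(d)
    stats = {}
    for oid, ds in by_object.items():
        frames = [d["frame"] for d in ds]
        n = len(ds)
        stats[oid] = {
            "num_frames": n,
            "duration": max(frames) - min(frames),
            "avg_2d": (sum(d["w2d"] for d in ds) / n,
                       sum(d["h2d"] for d in ds) / n),
            "avg_3d": (sum(d["w3d"] for d in ds) / n,
                       sum(d["h3d"] for d in ds) / n),
        }
    return stats

detections = [
    {"object_id": 1, "frame": 10, "w2d": 40, "h2d": 80, "w3d": 0.5, "h3d": 1.7},
    {"object_id": 1, "frame": 14, "w2d": 44, "h2d": 84, "w3d": 0.5, "h3d": 1.7},
]
stats = object_statistics(detections)
```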

Multi-sensors fusion for daily living activities recognition in older people domain

I am currently researching the use of Information and Communication Technologies as tools for preventive care and diagnosis support in the elderly population. Our current approach uses accelerometers and video sensors for the recognition of instrumental daily living activities (IADL, e.g., preparing coffee, making a phone call). Clinical studies have pointed to the decline of elderly performance in IADL as a potential indicator of early symptoms of dementia (e.g., in Alzheimer’s patients). IADL are modeled and detected using a constraint-based generic ontology (called ScReK). This ontology allows us to describe events based on spatial, temporal, and sensor data (e.g., MotionPOD) values.

ScReK tool

SUP daily evaluation

The daily evaluation aims to guarantee the proper functioning of SUP by detecting crashes and bugs caused by recent commits, and by showing whether the results of the different algorithms are improving day after day.

Evaluation is performed as follows:

  • update of SUP core and SUP plugins to the latest version
  • full compilation of SUP core and SUP plugins
  • processing of the reference sequences
  • computation of the results using ViSEvAl
  • sending of an e-mail to the sup_dlevel team with the results

The results are computed by comparing today’s metrics with the reference metrics computed from a default XML1 result file.
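The comparison step can be sketched as a simple per-metric delta against the reference values. Metric names, values and the higher-is-better assumption are all illustrative.

```python
def compare_to_reference(today, reference, tolerance=1e-6):
    """today, reference: {metric: value}.

    Classifies each reference metric as 'improved', 'degraded' or
    'unchanged' relative to the reference run, assuming higher values
    are better; metrics absent from today's run are flagged 'missing'."""
    report = {}
    for metric, ref_value in reference.items():
        value = today.get(metric)
        if value is None:
            report[metric] = "missing"
        elif value > ref_value + tolerance:
            report[metric] = "improved"
        elif value < ref_value - tolerance:
            report[metric] = "degraded"
        else:
            report[metric] = "unchanged"
    return report

report = compare_to_reference(
    {"tracking": 0.80, "detection": 0.70},   # today's metrics
    {"tracking": 0.75, "detection": 0.72},   # reference metrics
)
```

A report of this shape is what the daily e-mail would summarise.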

At the moment, only two sequences are processed: one from ETISEO (ETI-BC-11-C1) to evaluate the SUP standard processing chain (MOG segmentation, physical object constructor, F2F tracking and LTT) and one from Nice CHU (2011-01-11a) to evaluate event detection. The next sequence to be added will test person detection.

If you are working on a new plugin or improving an old one, you may want to add a sequence to the daily evaluation. In that case, you should send me:

  • the SUP parameter file
  • the context file (camera calibration)
  • the ground-truth file
  • the ViSEvAl configuration file with the metrics to be tested (optional)

Daily evaluation is usually performed every day of the week at 0:55 AM. If you do not want to be spammed by the daily e-mails, I suggest you add a filter for them in Zimbra.

ViSEvAl software

ViSEvAl graphical user interface

ViSEvAl is under the GNU Affero General Public License (AGPL).

At INRIA, an evaluation framework has been developed to assess the performance of gerontechnology and video surveillance algorithms. This framework aims at better understanding the added value of new technologies for home-care monitoring and other services. The platform is available to the scientific community and contains a set of metrics to automatically evaluate the performance of software against some ground truth.


The ViSEvAl software (ViSualisation and EvAluation) provides a graphical user interface to visualise the results of video processing algorithms (such as detection of objects of interest, tracking or event recognition). Moreover, this software can compute metrics to evaluate specific tasks (such as detection, classification, tracking or event recognition). The software is composed of two binaries (ViSEvAlGUI and ViSEvAlEvaluation) and several plugins. Users can add their own plugins, for instance to define a new metric.

General schema of an evaluation platform


  • OS: Linux (tested on Fedora 12) with gcc 4.4.4
  • The following libraries are mandatory: QT4 (for GUI and plugin facilities) and libxerces-c (for the automatic XML parser)
  • xsdcxx must be installed on your computer (for the automatic XML parser)
  • FFmpeg is optional (only used by the plugin that loads .ASF videos)
  1. Go into the ViSEvAl directory (called SoftwareDirectory in the following)
  2. Launch the script ./. The script will create all the makefiles needed by the application and the plugins, and will compile all the code. If everything goes well, you will find the executables in the SoftwareDirectory/bin/appli directory
  3. Type the shell command (csh syntax):
    setenv LD_LIBRARY_PATH $SoftwareDirectory/lib:/usr/local/lib:$LD_LIBRARY_PATH (to tell the application where to find ViSEvAlLib and the optional FFmpeg libraries)
  4. Go into the directory bin/appli
  5. Run ViSEvAlGUI for the GUI tool, or run ViSEvAlEvaluation for the command-line tool
  • ViSEvAlGUI: in the menu, select File -> Open…, then choose the desired .conf file
  • ViSEvAlEvaluation file.conf result.res [0-1] [0-1]
  1. file.conf: the desired configuration file
  2. result.res: the file where the results will be written
  3. [0-1]: optional value; 0: the results are printed for each frame, 1: only the global results are printed
  4. [0-1]: optional value; the evaluation of the detection (XML1) and of the fusion (XML2) is done only on the common frames

More details

ViSEvAl overview

XSD files

The XSD files describe the XML format of the different input files for the ViSEvAl software.

  • Description of the data provided by video sensors (camera detection, fusion detection and event detection): data.xsd
  • Description of the data provided by non-video sensors (contact sensors, wearable sensors, …): sensor.xsd
  • Description of the camera parameters (calibration, position, …): camera.xsd


This platform is available on demand to the scientific community (contact Annie.Ressouche @

MOG segmentation

This is the plugin from a previous PhD student, Nghiem Anh Tuan.

Plugin Name: supProcessMoGSegmentation
Person Responsible: Vasanth BATHRINARAYANAN (vbathrin)
Extra Info: This plugin has good documentation: a parameter description file, the thesis, and a description of the structure of the program module are located in the doc folder of the trunk directory.

For any parameter tuning or motion segmentation problem, please feel free to ask me any time.