ICRA 2016 Tutorial on Vision for Robotics
Stockholm, Sweden
Friday, May 20, 2016, 8h - 12h30, Room A1

Organizers:
François Chaumette, Inria
Peter Corke, QUT
Jana Kosecka, GMU
Eric Marchand, Université de Rennes 1
As for humans and most animals, vision is a crucial sense for a robot
to interact with its environment. Vision for robotics has given rise
to a tremendous amount of research and successful applications since
the creation of the fields of robotics and computer vision several
decades ago.
The aim of this tutorial is to provide a comprehensive state of the
art of the basic concepts, methodologies, and applications. It will
cover the modeling of visual sensors and the underlying geometry,
object detection and recognition, visual tracking and 3D
localization, and visual servoing, closing the loop from perception to
action.
Note that visual SLAM, an important component of vision for robotics
(typically for exploration and navigation), will not be addressed in
this tutorial but in the afternoon tutorial devoted to SLAM.
Interested attendees are invited to follow both tutorials to get a
global overview of robot vision.
The tutorial consists of four lectures:
08:00 - 08:05 | Introduction
08:05 - 09:05 | Lecture 1: Visual sensors and geometry | Peter Corke, QUT
09:05 - 10:05 | Lecture 2: Object detection and recognition | Jana Kosecka, GMU
10:05 - 10:20 | Lecture 3: Visual tracking | Eric Marchand, Université de Rennes 1
10:20 - 10:40 | Break
10:40 - 11:25 | Lecture 3: Visual tracking (cont.) | Eric Marchand, Université de Rennes 1
11:25 - 12:25 | Lecture 4: Visual servoing | François Chaumette, Inria
12:25 - 12:30 | Conclusion
Lecture 1: Visual sensors and geometry (Peter Corke, QUT)
For any vision-based robotic system the first step is to acquire an
image, and this talk will cover some important aspects of this process.
We start by considering the nature of light, color, and intensity, how
these are transduced by a camera, and touch briefly on color spaces and
color-based segmentation.
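As a concrete illustration of color-based segmentation, here is a minimal sketch using OpenCV's HSV conversion and thresholding; the image path and the hue/saturation/value range are illustrative placeholders, not values from the lecture.

```python
import cv2
import numpy as np

# Load an image (path is a placeholder) and convert BGR -> HSV, where
# hue separates color from intensity more cleanly than RGB does.
img = cv2.imread("scene.png")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Keep pixels whose hue/saturation/value fall in a chosen band
# (here an arbitrary "red-ish" band; tune for the target object).
lower = np.array([0, 120, 70])
upper = np.array([10, 255, 255])
mask = cv2.inRange(hsv, lower, upper)

# The binary mask can then seed blob detection or region extraction.
segmented = cv2.bitwise_and(img, img, mask=mask)
```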
We then look more closely at cameras themselves, touching on exposure,
motion blur, saturation, and the constraints of rolling-shutter
cameras.
Finally, we look at the geometry of image formation with the pinhole
camera model, the need for lenses, and image distortion, and, if time
permits, touch on panoramic or very wide-angle cameras.
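For readers who want the pinhole model in concrete form, the sketch below projects a 3D point through an intrinsic matrix K; the focal length, principal point, and point coordinates are illustrative values, not parameters from the lecture.

```python
import numpy as np

# Illustrative intrinsics: focal length f (pixels), principal point (cx, cy).
f, cx, cy = 800.0, 320.0, 240.0
K = np.array([[f, 0, cx],
              [0, f, cy],
              [0, 0,  1]])

# A 3D point in the camera frame (Z is depth along the optical axis).
P = np.array([0.1, -0.05, 2.0])

# Pinhole projection: p ~ K P, then divide by depth to get pixel coordinates.
p_h = K @ P
u, v = p_h[0] / p_h[2], p_h[1] / p_h[2]
print(u, v)  # pixel coordinates of the projected point
```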
Lecture 2: Object detection and recognition (Jana Kosecka, GMU)
The capability to detect and recognize objects in cluttered, dynamically changing environments is a key component of robot perceptual systems. This capability supports high-level service robotics tasks (e.g., fetch and delivery of objects, object manipulation, and object search). We will overview the basic formulations, including (1) local-descriptor approaches capturing appearance or geometry statistics and shape; (2) sliding-window techniques, their descriptors, and associated efficient search strategies; and (3) object-proposal methods that start with bottom-up segmentation followed by evaluation of classifiers. We will discuss the design choices made in each of these formulations, with a focus on efficiency and the ability to handle clutter and occlusion.
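To make the sliding-window formulation concrete, here is a minimal sketch: a fixed-size window is scanned over the image at a regular stride and each crop is passed to a scoring function. The `score_window` function here is a placeholder standing in for any trained classifier (e.g., HOG + SVM); everything else is assumed for illustration.

```python
import numpy as np

def score_window(crop):
    # Placeholder for a trained classifier; returns a detection score.
    return float(crop.mean())  # dummy score for illustration

def sliding_window_detect(image, win=(64, 128), stride=16, thresh=0.5):
    """Scan a fixed-size window over the image and keep scoring hits."""
    h, w = image.shape[:2]
    detections = []
    for y in range(0, h - win[1] + 1, stride):
        for x in range(0, w - win[0] + 1, stride):
            crop = image[y:y + win[1], x:x + win[0]]
            s = score_window(crop)
            if s > thresh:
                detections.append((x, y, win[0], win[1], s))
    # In practice this loop is repeated over an image pyramid to handle
    # scale, followed by non-maximum suppression of overlapping boxes.
    return detections

hits = sliding_window_detect(np.random.rand(240, 320))
```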
Lecture 3: Visual tracking (Eric Marchand, Université de Rennes 1)
Visual tracking is a key issue in the development of vision-based
robotics tasks. Once detected and recognized, objects have to be tracked
and localized in the image stream.
Beginning with the tracking of elementary geometric features (points,
lines, ...), we will consider the case where the model of the tracked
object is fully known (model-based tracking), along with the case where
only image intensity and basic geometric constraints are available
(template tracking, or KLT-like methods).
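As a sketch of KLT-style tracking, OpenCV's pyramidal Lucas-Kanade tracker propagates feature points between two frames by minimizing intensity error over a small window; the frame file paths here are placeholders for two consecutive grayscale images.

```python
import cv2
import numpy as np

# Two consecutive grayscale frames (paths are placeholders).
prev_gray = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
curr_gray = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# Detect good corners to track in the first frame.
pts0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                               qualityLevel=0.01, minDistance=10)

# Pyramidal Lucas-Kanade: estimate each point's displacement
# between the two frames.
pts1, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                             pts0, None)

# Keep only the points that were tracked successfully.
tracked = pts1[status.ravel() == 1]
```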
Since tracking is a spatio-temporal process, prediction and filtering
(e.g., Kalman or particle filters) are useful for improving the results
and robustness of visual tracking.
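As an illustration of prediction and filtering, here is a minimal constant-velocity Kalman filter over a 2D image point, a common companion to a visual tracker; the frame rate, noise covariances, and measurement values are illustrative assumptions.

```python
import numpy as np

dt = 1.0 / 30.0  # assumed frame period

# State [x, y, vx, vy]: constant-velocity motion model.
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)  # we observe position only
Q = 1e-3 * np.eye(4)  # process noise (illustrative)
R = 1e-1 * np.eye(2)  # measurement noise (illustrative)

x = np.zeros(4)  # state estimate
P = np.eye(4)    # state covariance

def kalman_step(x, P, z):
    # Predict with the motion model, then correct with measurement z.
    x = F @ x
    P = F @ P @ F.T + Q
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P

x, P = kalman_step(x, P, np.array([120.0, 80.0]))
```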
The output of the tracking algorithms can then be used within a
visual servoing control scheme.
Lecture 4: Visual servoing (François Chaumette, Inria)
Visual servoing consists of controlling the motion of a dynamic system in closed loop with respect to visual data. The talk will describe the modeling steps needed to design a vision-based control scheme, together with a range of applications showing the large class of robotics tasks that can be accomplished with this approach.
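The classical image-based control law in this setting is v = -lambda * L^+ * (s - s*), with L the interaction matrix of the visual features (see the Chaumette & Hutchinson chapter below). Here is a minimal sketch for a single image point; the feature values, depth, and gain are illustrative assumptions.

```python
import numpy as np

def interaction_matrix_point(x, y, Z):
    """Interaction matrix of a normalized image point (x, y) at depth Z,
    relating its image velocity to the camera spatial velocity (v, omega)."""
    return np.array([
        [-1/Z,    0, x/Z,      x*y, -(1 + x**2),  y],
        [   0, -1/Z, y/Z, 1 + y**2,        -x*y, -x],
    ])

# Current feature, desired feature (normalized coordinates), depth, gain:
# all illustrative values.
s, s_star = np.array([0.1, 0.05]), np.array([0.0, 0.0])
Z, lam = 1.0, 0.5

L = interaction_matrix_point(s[0], s[1], Z)
e = s - s_star

# Camera velocity command via the Moore-Penrose pseudoinverse of L.
v = -lam * np.linalg.pinv(L) @ e
```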
References:
P. Corke. Robotics, Vision and Control: Fundamental Algorithms in MATLAB. Springer, 2011.
Y. Ma, S. Soatto, J. Kosecka, S. Sastry. An Invitation to 3-D Vision: From Images to Geometric Models. Springer, 2012.
R. Szeliski. Computer Vision: Algorithms and Applications. Springer, 2011.
J. Ponce, M. Hebert, C. Schmid, A. Zisserman (Eds.). Toward Category-Level Object Recognition. Springer, 2007.
E. Marchand, H. Uchiyama, F. Spindler. Pose estimation for augmented reality: a hands-on survey. IEEE Trans. on Visualization and Computer Graphics, 2016.
F. Chaumette, S. Hutchinson. Visual servoing and visual tracking. In Handbook of Robotics, B. Siciliano, O. Khatib (Eds.), Chap. 24, pp. 563-583, Springer, 2008.