Machine Learning and Robot Perception - Bruno Apolloni et al. (Eds.), Part 2

1 Learning Visual Landmarks for Mobile Robot Topological Navigation (M. Mata et al.)

Furthermore, certain objects are always seen with the same orientation: objects attached to walls or beams, lying on the floor or on a table, and so on. With these restrictions in mind, it is only necessary to consider five of the eight d.o.f. previously proposed: X, Y, ΔX, ΔY, SkY. This reduction of the deformable model parameter search space significantly reduces computation time. The simplification restricts the system to planar objects or faces of 3D objects, but this is not a loss of generality, only a time-saving measure: issues for implementing the full 3D system will be given throughout this text. Moreover, many interesting objects for various applications can be handled in spite of the simplification, especially all kinds of informative panels.

Fig. 1.10 Planar deformable model (parameters X, Y, ΔX, ΔY, SkY)

The 2D reduced deformable model is shown in Fig. 1.10. Its five parameters are binary coded into each GA individual’s genome: the individual’s Cartesian coordinates (X, Y) in the image, its horizontal and vertical size in pixels (ΔX, ΔY), and a measure of its vertical perspective distortion (SkY), as shown in equation (6) for the i-th individual, with G=5 d.o.f. and q=10 bits per variable (enough to cover 640 pixels). Varying these parameters makes the deformable model rove over the image, searching for the selected object.

\[
[C]_i = \underbrace{b^i_{11}, b^i_{12}, \ldots, b^i_{1q}}_{X_i} ;\;
\underbrace{b^i_{21}, b^i_{22}, \ldots, b^i_{2q}}_{Y_i} ;\; \ldots ;\;
\underbrace{b^i_{G1}, b^i_{G2}, \ldots, b^i_{Gq}}_{SkY_i} \qquad (6)
\]

For these d.o.f., a point (x0, y0) in the model reference frame (no skew, sized ΔX0, ΔY0) will have coordinates (x, y) in the image coordinate system for a deformed model:

\[
\begin{pmatrix} x \\ y \end{pmatrix} =
\begin{pmatrix}
\dfrac{\Delta X}{\Delta X_0} & 0 \\[1ex]
\dfrac{\Delta X \cdot SkY}{\Delta X_0^{2}} & \dfrac{\Delta Y}{\Delta Y_0}
\end{pmatrix}
\begin{pmatrix} x_0 \\ y_0 \end{pmatrix} +
\begin{pmatrix} X \\ Y \end{pmatrix} \qquad (7)
\]

A fitness function is needed that compares the object-specific detail over the deformed model with the image background. Again, nearly any method can be used for that.

Fig. 1.11 Selected object-specific detail set: (a) object to be learned, (b) possible locations for the pattern-windows (labeled a0, a1, a2, a3 and D), (c) memorized pattern-windows following model deformation

Some global detail sets were evaluated: grayscale and color distribution functions, and average textureness. They proved unable to produce a precise match and were excessively attracted by incorrect image zones. Some local detail sets were then evaluated: vertical line detection and corner detection. They showed the opposite effect: several very precise matches were found, but only after very slow convergence, because it was difficult to get the model exactly aligned over the object and fitness was low unless it was.

The detail set finally selected is composed of four small “pattern-windows” located at certain learned positions along the model diagonals, as shown in Fig. 1.11.b. These pattern-windows have a size between 10 and 20 pixels, and are memorized by the system during the learning of a new object, at learned distances a_i (i=0,…,3). The relative distances d_i from the corners of the model to the pattern-windows,

\[
d_i = \frac{a_i}{D} \qquad (8)
\]

are memorized together with their corresponding pattern-windows. These relative distances are kept constant during base model deformations in the search stage, so that the positions of the pattern-windows follow the deformations, as shown in Fig. 1.11.c and as equation (7) indicates. The pattern-windows will be learned by the system in positions with distinctive local information, such as internal or external borders of the object.

Normalized correlation over the L component (equation 9) is used for comparing the pattern-windows, M_k(x,y), with the image background, L(x,y), at the positions fixed by each individual’s parameters, to provide an evaluation of the fitness function:

\[
r_k(x,y) = \frac{\sum_i \sum_j \left( L(x+i, y+j) - \bar{L} \right) \left( M_k(i,j) - \bar{M}_k \right)}
{\sqrt{\sum_i \sum_j \left( L(x+i, y+j) - \bar{L} \right)^2 \; \sum_i \sum_j \left( M_k(i,j) - \bar{M}_k \right)^2}} ;
\qquad \rho_k(x,y) = \max\{ r_k(x,y),\, 0 \} \qquad (9)
\]

Normalized correlation makes fitness estimation robust to illumination changes, and provides a means of combining local and semi-global range for the pattern-windows. First, correlation is maximal exactly at the point where a pattern-window lies over the corresponding detail of the object in the image, as needed for achieving a precise alignment between model and object. Second, the correlation falls as the pattern-window moves away from the exact position, but it keeps a medium value in a small neighborhood of it; this gives a moderate fitness score to individuals located near an object but not exactly over it, making the GA converge faster.

Furthermore, a small bias is introduced during fitness evaluation that speeds up convergence. The normalized correlation for each window is evaluated not only at the pixel indicated by the individual’s parameters, but also in a small neighborhood (a few pixels wide) of this central pixel, at nearly the same time cost. The fitness score is then calculated and the individual’s parameters are slightly modified so that its pattern-windows approach the points of higher correlation in the evaluated neighborhood. This modification is limited to five pixels, so it has little effect on individuals far from interesting zones, but it allows very quick final convergence by promoting a good match to a perfect alignment, instead of waiting for a lucky crossover or mutation to do this.

The fitness function F([C]_i) used is then a function of the normalized correlation of each pattern-window, ρ_k([C]_i), (0< v
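
The genome coding of equation (6) and the correlation-based evaluation of equation (9), including the small-neighborhood search, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`decode_genome`, `ncc`, `best_in_neighborhood`), the `radius` parameter, the use of plain nested lists for images, the decoding of each variable as a raw unsigned integer (most significant bit first), and the convention that (x, y) is the top-left corner of the window are all assumptions made for illustration.

```python
import math


def decode_genome(bits, n_vars=5, q=10):
    """Split a flat bit list into n_vars unsigned integers of q bits each.

    Assumption: MSB-first coding; mapping to valid ranges is not handled here.
    The result corresponds to raw values for X, Y, dX, dY, SkY.
    """
    if len(bits) != n_vars * q:
        raise ValueError("genome length must be n_vars * q")
    values = []
    for g in range(n_vars):
        chunk = bits[g * q:(g + 1) * q]
        values.append(int("".join(str(b) for b in chunk), 2))
    return values


def ncc(image, window, x, y):
    """Zero-mean normalized correlation of `window` placed at (x, y).

    Assumption: (x, y) is the top-left corner of the window in the image.
    Returns 0.0 when either the patch or the window has zero variance.
    """
    h, w = len(window), len(window[0])
    patch = [image[y + j][x + i] for j in range(h) for i in range(w)]
    templ = [window[j][i] for j in range(h) for i in range(w)]
    mean_p = sum(patch) / len(patch)
    mean_t = sum(templ) / len(templ)
    num = sum((p - mean_p) * (t - mean_t) for p, t in zip(patch, templ))
    den = math.sqrt(sum((p - mean_p) ** 2 for p in patch)
                    * sum((t - mean_t) ** 2 for t in templ))
    return num / den if den > 0 else 0.0


def best_in_neighborhood(image, window, x, y, radius=2):
    """Search +/- radius pixels around (x, y) for the highest clipped score.

    Returns (score, bx, by) with score = max(r, 0). If no position scores
    above zero, the original location is returned with score 0.0.
    """
    h, w = len(window), len(window[0])
    best = (0.0, x, y)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            nx, ny = x + dx, y + dy
            # Skip placements where the window would leave the image.
            if 0 <= nx <= len(image[0]) - w and 0 <= ny <= len(image) - h:
                score = max(ncc(image, window, nx, ny), 0.0)
                if score > best[0]:
                    best = (score, nx, ny)
    return best
```

An individual whose window lands near, but not on, the learned detail would be nudged toward the location returned by `best_in_neighborhood`, which mirrors the biasing step described in the text. For real images, a vectorized routine such as OpenCV's `cv2.matchTemplate` with `cv2.TM_CCOEFF_NORMED` computes the same zero-mean normalized correlation far more efficiently.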
