Field and Service Robotics - Corke P. and Sukkarieh S. (Eds), Part 2

Visual Motion Estimation for an Autonomous Underwater Reef Monitoring Robot

Matthew Dunbabin, Kane Usher, and Peter Corke
CSIRO ICT Centre, PO Box 883, Kenmore QLD 4069, Australia

Summary. Performing reliable localisation and navigation within highly unstructured underwater coral reef environments is a difficult task at the best of times. Typical research and commercial underwater vehicles use expensive acoustic positioning and sonar systems which require significant external infrastructure to operate effectively. This paper is focused on the development of a robust vision-based motion estimation technique using low-cost sensors for performing real-time autonomous and untethered environmental monitoring tasks in the Great Barrier Reef without the use of acoustic positioning. The technique is experimentally shown to provide accurate odometry and terrain profile information suitable for input into the vehicle controller to perform a range of environmental monitoring tasks.

1 Introduction

In light of recent advances in computing and energy storage hardware, Autonomous Underwater Vehicles (AUVs) are emerging as the next viable alternative to human divers for remote monitoring and survey tasks. There are a number of remotely operated vehicles (ROVs) and AUVs performing various monitoring tasks around the world [17]. These vehicles are typically large and expensive, require considerable external infrastructure for accurate positioning, and need more than one person to operate a single vehicle. They also generally avoid highly unstructured reef environments such as Australia's Great Barrier Reef, with limited research performed on shallow-water applications and reef traversing. Where surveying at greater depths is required, ROVs have been used for video transects and biomass identification; however, these vehicles still require the human operator in the loop.

Knowing the position and distance an AUV has moved is critical to ensure that correct and repeatable measurements are being taken for reef surveying applications. It is important to have accurate odometry to ensure survey transect paths are correctly followed. A number of techniques are used to estimate vehicle motion. Acoustic sensors such as Doppler velocity logs are a common means of obtaining accurate motion information. The use of vision for motion estimation is becoming a popular technique for underwater use, allowing navigation, station keeping, and the provision of manipulator feedback information [16, 12, 15]. The accuracy of underwater vision is dependent on visibility and lighting, as well as optical distortion resulting from varying refractive indices, requiring either corrective lenses or careful calibration [4]. Visual information is often fused with various acoustic sensors to achieve increased sensor resolution and accuracy for underwater navigation [10]. Although this fusion can result in very accurate motion estimation compared to vision only, it is typically performed off-line and in deeper-water applications.

A number of authors have investigated different techniques for odometry estimation using vision as the primary sensor. Amidi [2] provides a detailed investigation into feature tracking for visual odometry for an autonomous helicopter. Another technique to determine camera motion is structure-from-motion (SFM), with a comparison of a number of SFM techniques in terms of accuracy and computational efficiency given by Adams [1].
Corke [7] presents experimental results for odometry estimation of a planetary rover using omnidirectional vision and compares robust optic flow and SFM methods with very encouraging results.

This research is focused on autonomously performing surveying tasks around the Great Barrier Reef using low-cost AUVs and vision as the primary sensor for motion estimation. The use of vision in this environment is considered a powerful technique due to the feature-rich terrain. At the same time, however, the environment can cause problems for traditional processing techniques, with highly unstructured terrain, soft swaying corals, moving biomass and lighting ripple due to surface waves. The focus of this paper is on the development of a robust real-time vision-based motion estimation technique for a field-deployed AUV which uses intelligently fused low-cost sensors and hardware, without the use of acoustic positioning or artificial lighting.

2 Vision System

2.1 Vehicle

The vehicle developed and used in this research was custom designed to autonomously perform the environmental monitoring tasks required by the reef monitoring organisations [14]. To achieve these tasks, the vehicle must navigate over highly unstructured surfaces at fixed altitudes (300-500 mm above the sea floor) and at depths in excess of 100 m, in cross currents of 2 knots, and know its position during linear transects to within 5% of total distance travelled. It was also considered essential that the vehicle be untethered, to reduce the risk of entanglement and the need for support vessels, and to reduce the drag imposed on the vehicle when operating in strong currents.

Fig. 1 shows the hybrid vehicle design named "Starbug" developed as part of this research. The vehicle can operate remotely or fully autonomously. Details of the vehicle performance and system integration are given in [9].

Fig. 1. The "Starbug" Autonomous Underwater Vehicle.

2.2 Sensors

The sensor platform developed for the Starbug AUV and used in this research is based on past experience with the CSIRO autonomous airborne system [6], enhanced to provide a low-cost navigation suite for the task of long-term autonomous reef monitoring [8]. The primary sensing component of the AUV is the stereo camera system. The AUV has two stereo heads, one looking downward to estimate altitude above the sea floor and odometry, and the other looking forward for obstacle avoidance (not used in this study). The cameras are colour CMOS sensors from Omnivision with 12 mm diameter screw-fit lenses which have a nominal focal length of 6 mm. Each stereo pair has its cameras set with a baseline of 70 mm, which allows an effective distance resolution in the range 0.2 to 1.7 m. The cameras look through 6 mm thick flat glass. The two cameras are tightly synchronized and line multiplexed into a PAL-format composite video signal. Fig. 2 shows the stereo camera head used in the AUV and a representative image of the typical terrain and visibility in which the system operates.

Fig. 2. Forward-looking stereo camera system and a representative reef environment: (a) stereo camera pair; (b) typical reef terrain.

In addition to the vision sensors, the vehicle has a magnetic compass, a custom-built IMU (see [8] for details), a pressure sensor (2.5 mm resolution), a PC/104 800 MHz Crusoe computer stack running the Linux OS, and a GPS which is used when surfaced.
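As a rough sanity check on the quoted working range, the sketch below relates baseline, focal length and pixel disparity for an ideal pinhole stereo pair. The 70 mm baseline and nominal 6 mm lenses are taken from the description above; the pixel pitch and the disparity search limits are illustrative assumptions, not Starbug values.

```python
# Sketch: how baseline, focal length and disparity bound the stereo working range.
# The 70 mm baseline and 6 mm lenses are from the text above; the pixel pitch and
# the disparity search limits are illustrative assumptions, not Starbug values.
B = 0.070                       # stereo baseline [m]
f_mm = 6.0                      # nominal lens focal length [mm]
pixel_pitch_mm = 0.009          # ASSUMED sensor pixel pitch [mm/pixel]
f_px = f_mm / pixel_pitch_mm    # focal length in pixels (~667 px here)

def depth_from_disparity(d_px):
    """Ideal pinhole stereo: Z = f * B / d."""
    return f_px * B / d_px

# With an assumed disparity search window of 25-250 pixels, this pair resolves
# from roughly 0.19 m (large disparity) out to about 1.9 m (small disparity),
# i.e. the same order as the 0.2-1.7 m working range quoted above.
for d in (250.0, 25.0):
    print(f"disparity {d:5.1f} px -> depth {depth_from_disparity(d):.2f} m")
```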
3 Optimised Vision-Based Motion Estimation

Due to the unique characteristics of the reef environment, such as highly unstructured and feature-rich terrain, relatively shallow waters and sufficient natural lighting, vision is considered a viable alternative to the typically expensive acoustic positioning and sonar sensors for navigation.

The system uses reasonable-quality CMOS cameras with low-quality miniature glass lenses. Therefore, it is important to have an accurate model of the cameras' intrinsic parameters as well as good knowledge of the camera pair's extrinsic parameters. Refraction due to the air-water-glass interface also requires consideration, as discussed in [8]. In this investigation the cameras are calibrated using standard automatic calibration techniques (see e.g. Bouguet [3]) to combine the effects of radial lens distortion and refraction.

In addition to assuming an appropriately calibrated stereo camera pair, it is also assumed that the AUV is initialised at a known start position and heading angle.

The complete procedure for this odometry technique is outlined in Algorithm 1. The key components of this technique are the image processing, which we have termed three-way feature matching (steps 1-7) and which utilises common, well-behaved procedures, and the motion estimation (steps 8-10), which is the primary contribution of this paper. These components are discussed in the following sections.

Algorithm 1. Visual motion estimation procedure.
1. Collect a stereo image.
2. Find all features in the entire image.
3. Take the 100 most dominant features as templates (typically this number is more like 10-50 features).
4. Match corners between stereo images by calculating the normalised cross-correlation (ZNCC).
5. Store stereo-matched features.
6. Using the stereo-matched features at the current time step, match these with the stereo-matched features from images taken at the previous time step using ZNCC.
7. Reconstruct those points which have been both spatially and temporally matched into 3D.
8. Using the dual-search optimisation technique outlined in Algorithm 2, determine the camera transformation that best describes the motion from the previous to the current image.
9. Using the measured world heading, roll and pitch angles, transform the differential camera motion to a differential world motion.
10. Integrate the differential world motion to determine a world camera displacement.
11. Go to step 1 and repeat.

3.1 Three-Way Feature Matching

Feature extraction

In this investigation, the Harris feature detector [5] has been implemented due to its speed and satisfactory results. Roberts [13] compared the temporal stability of feature detectors for outdoor applications and found the Harris operator to be superior to other feature extraction methods. Only features that are matched both in stereo (spatially) for height reconstruction, and temporally for motion reconstruction, are considered for odometry estimation. Typically, this means that between ten and fifty strong features are tracked at each sample time, and during ocean trials with poor water clarity this was observed to be less than ten.

We are currently working on improving the robustness of the feature extraction by combining this higher frame rate extraction method with a slower loop running a more computationally expensive KLT (or similar) type tracker to track features over a longer time period. This will help to alleviate long-term drift in integrating differential motion.
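As a concrete illustration of steps 2-3 of Algorithm 1, the following sketch extracts Harris corners and keeps only the most dominant ones. It uses OpenCV's Harris-based goodFeaturesToTrack as a stand-in for the paper's own detector implementation; the quality threshold and minimum spacing are assumed values, not those used on Starbug.

```python
import cv2
import numpy as np

def extract_strongest_corners(grey, max_corners=100):
    """Steps 2-3 of Algorithm 1: detect Harris corners and keep the most dominant ones."""
    corners = cv2.goodFeaturesToTrack(
        grey,
        maxCorners=max_corners,   # "take the 100 most dominant features"
        qualityLevel=0.01,        # assumed rejection threshold relative to the best corner
        minDistance=10,           # assumed minimum spacing between corners, in pixels
        useHarrisDetector=True,
        k=0.04)                   # standard Harris sensitivity constant
    if corners is None:
        return np.empty((0, 2), dtype=np.float32)
    return corners.reshape(-1, 2) # (N, 2) array of (u, v) image coordinates

# Usage on a greyscale frame from one camera of the downward-looking stereo head:
# left_pts = extract_strongest_corners(left_grey)
```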
Stereo matching

Stereo matching is used in this investigation to estimate vehicle altitude, provide scaling for temporal feature motion and generate coarse terrain profiles. For stereo matching, the correspondences between features in the left and right images are found. The similarity between the regions surrounding each corner is computed (left to right) using the normalised cross-correlation similarity measure (ZNCC). To reduce computation, epipolar constraints are used to prune the search space and only the strongest corners are evaluated.

Once a set of matches is found, the results are refined with sub-pixel interpolation. Additionally, rather than correcting the entire image for lens distortion and refraction effects, the correction is applied only to the coordinate values of the tracked features, hence saving considerable computation.

Optic flow (motion matching)

The tracking of features temporally between image frames is similar to the spatial stereo matching discussed above. Given the full set of corners extracted during stereo matching, similar techniques are used to find the corresponding corners from the previous image. The differential image motion (du, dv) is then calculated in both the u and v directions on a per-feature basis.

To maintain suitable processing speeds, motion matching is currently constrained by search-space pruning, whereby feature matching is performed within a disc of specified radius. The size of this search space could potentially be reduced further with a motion prediction model to estimate where the features lie in the search space.

In this motion estimation technique, temporal feature tracking currently has only a one-frame memory. This reduces problems due to significant appearance change over time. However, as stated earlier, longer-term tracking will improve integration drift problems.

3D feature reconstruction

Using the stereo-matched corners, standard stereo reconstruction methods are then used to estimate each feature's three-dimensional position. In our previous vision-based motion estimation involving aerial vehicles [6], the stereo data was processed to find a consistent plane; the underlying assumption for stereo and motion estimation was the existence of a flat ground plane. In the current application, it cannot be assumed that the ground is flat. Hence, vehicle height estimation must be performed on a per-feature basis. The primary purpose of 3D feature reconstruction in this investigation is to scale feature disparity to enable visual odometry.
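The sketch below illustrates the spatial matching and per-feature reconstruction described above: ZNCC patch similarity with a crude epipolar (row-tolerance) constraint, followed by ideal pinhole triangulation from horizontal disparity. The window size, row tolerance, score threshold and the assumption of a roughly rectified pair are illustrative only; the sub-pixel refinement and the per-feature distortion and refraction correction used on the vehicle are omitted.

```python
import numpy as np

def zncc(patch_a, patch_b):
    """Zero-mean normalised cross-correlation between two equal-sized patches."""
    a = patch_a - patch_a.mean()
    b = patch_b - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def match_stereo(left, right, left_pts, right_pts, half=7, v_tol=2, min_score=0.7):
    """Greedy left-to-right ZNCC matching with a crude epipolar (row) constraint.

    left_pts/right_pts are integer (u, v) corner coordinates; half sets the
    correlation window, v_tol the allowed row difference for a roughly
    rectified pair, min_score the acceptance threshold (all assumed values).
    """
    size = 2 * half + 1
    matches = []
    for (ul, vl) in left_pts:
        pa = left[vl - half:vl + half + 1, ul - half:ul + half + 1].astype(float)
        if pa.shape != (size, size):               # skip corners too close to the border
            continue
        best, best_pt = min_score, None
        for (ur, vr) in right_pts:
            if abs(vr - vl) > v_tol or ur >= ul:   # epipolar pruning; positive disparity only
                continue
            pb = right[vr - half:vr + half + 1, ur - half:ur + half + 1].astype(float)
            if pb.shape != (size, size):
                continue
            score = zncc(pa, pb)
            if score > best:
                best, best_pt = score, (ur, vr)
        if best_pt is not None:
            matches.append(((ul, vl), best_pt))
    return matches

def triangulate(ul, ur, v, f_px, baseline, u0, v0):
    """Per-feature 3D reconstruction from horizontal disparity (ideal pinhole model)."""
    d = float(ul - ur)                # disparity in pixels (> 0 by construction)
    z = f_px * baseline / d           # depth along the optical axis
    x = (ul - u0) * z / f_px
    y = (v - v0) * z / f_px
    return np.array([x, y, z])
```

The same greedy ZNCC matcher can be reused for the temporal (optic flow) matching by replacing the row tolerance with a disc-radius constraint around each previous-frame corner.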
3.2 Motion Estimation

The first step in the visual motion estimation process is to find the set of points (features) which give a three-way match, that is, those points which have both a stereo match in the current frame and a corresponding matching corner from the previous frame, as discussed in Section 3.1. Given this correspondence, the problem is formulated as one of optimisation: find, at time k, the vehicle rotation and translation vector (x_k) which best explains the observed visual motion and stereo reconstruction, as shown in Fig. 3. Fig. 3 shows the vehicle looking at a ground surface (not necessarily planar) at times k-1 and k, with the features as seen in the respective image planes shown for comparison. The basis behind this motion estimation is to optimise the differential rotation and translation pose vector (dx_est) such that, when used to transform the features from the current image plane to the previous image plane, it minimises the median squared error between the predicted image displacement (du', dv') (as shown in the "reconstructed image plane") and the actual image displacement (du, dv) provided by optic flow for each three-way matched feature.

Fig. 3. Motion transformation from previous to current image plane.

During the pose vector optimisation, the Nelder-Mead simplex method [11] is employed to update the pose vector estimate. This nonlinear optimisation routine was chosen in this analysis due to its solution performance and the fact that it does not require the derivatives of the minimised function to be predetermined. The lack of gradient information allows the technique to be 'model free'.

The pose vector optimisation consists of a two-stage process at each time step to best estimate the vehicle motion. Since the differential rotations (roll, pitch, yaw) are known from IMU measurements, the first optimisation routine is restricted to updating only the translation components of the differential pose vector, with the differential rotations held constant at their measured values. This is aimed at keeping the solution away from local minima. As there may be errors in the IMU measurements, a second search is conducted using the results from the first optimisation to seed the translation component of the pose estimate, with the entire pose vector now updated during the optimisation. This technique was found to provide more accurate results than a single search step as it helps to avoid spurious local minima.

Algorithm 2 describes the pose optimisation function used in this analysis for the first stage of the motion estimation. Note that in the second optimisation stage the procedure is identical to Algorithm 2; however, dθ, dα and dψ are also updated in Step 3 of the optimisation.

Algorithm 2. Pose optimisation function.
1. Seed the search using the previous time step's differential pose estimate such that dx = [dx dy dz dθ dα dψ], where dx, dy and dz are the differential pose translations between the two time frames with respect to the current camera frame, and dθ, dα and dψ are the differential roll, pitch and yaw angles respectively, obtained from the IMU.
2. Enter the optimisation loop.
3. Estimate the transformation from the previous to the current camera frame: T = R_x(dθ) R_y(dα) R_z(dψ) [dx dy dz]^T.
4. For i = 1 to the number of three-way matched features, repeat steps 5 to 9.
5. Displace the observed 3D reconstructed feature coordinates (x_i, y_i, z_i) from the current frame to estimate where the feature was in the previous frame: [x_i^e y_i^e z_i^e]^T = T [x_i y_i z_i]^T.
6. Project the current 3D feature points to the image plane to give (u_i^o, v_i^o).
7. Project the displaced feature (step 5) to the image plane to give (u_i^d, v_i^d).
8. Estimate the feature displacement on the image plane: [du_i' dv_i']^T = [u_i^o v_i^o]^T - [u_i^d v_i^d]^T.
9. Compute the squared error between the estimated and actual feature displacement (du, dv) observed from optic flow: e_i = (du_i - du_i')^2 + (dv_i - dv_i')^2.
10. Using the median squared error value (e_m) from all three-way matched features, update dx using the Nelder-Mead simplex method.
11. If e_m is less than a preset threshold, end; else go to step 3 and repeat using the updated dx.
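A minimal sketch of the two-stage search in Algorithm 2 is given below, using SciPy's Nelder-Mead implementation in place of the authors' own simplex code. The pinhole projection parameters (f_px, u0, v0), the sign convention of the observed flow, and the interpretation of the displacement transform are assumptions made for illustration; termination thresholds and seeding details also differ from the paper. The derivative-free simplex search copes with the non-smooth median cost, which is the property that motivated its use here.

```python
import numpy as np
from scipy.optimize import minimize

def rot_xyz(roll, pitch, yaw):
    """R_x(roll) R_y(pitch) R_z(yaw), matching the composition used in Algorithm 2."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    return Rx @ Ry @ Rz

def project(pts, f_px, u0, v0):
    """Pinhole projection of Nx3 camera-frame points to (u, v) image coordinates."""
    return np.column_stack((f_px * pts[:, 0] / pts[:, 2] + u0,
                            f_px * pts[:, 1] / pts[:, 2] + v0))

def median_flow_error(dx, pts_3d, flow_uv, cam):
    """Median squared error between predicted and observed image displacement (steps 3-10)."""
    t, angles = dx[:3], dx[3:]
    prev_pts = pts_3d @ rot_xyz(*angles).T + t      # displace current-frame features back in time
    duv_pred = project(pts_3d, *cam) - project(prev_pts, *cam)
    return float(np.median(np.sum((flow_uv - duv_pred) ** 2, axis=1)))

def estimate_motion(pts_3d, flow_uv, imu_rpy, dx_prev, cam):
    """Two-stage Nelder-Mead search over the differential pose vector.

    pts_3d: Nx3 reconstructed three-way matched features (current camera frame)
    flow_uv: Nx2 observed (du, dv) from optic flow, same sign convention as the prediction
    imu_rpy: measured differential roll, pitch and yaw; dx_prev: previous estimate (seed)
    cam: (f_px, u0, v0) assumed pinhole intrinsics
    """
    # Stage 1: translation only, IMU rotations held fixed (keeps the search away from local minima).
    cost_t = lambda t: median_flow_error(np.r_[t, imu_rpy], pts_3d, flow_uv, cam)
    t1 = minimize(cost_t, dx_prev[:3], method='Nelder-Mead').x
    # Stage 2: refine all six components, seeded with the stage-1 translation.
    cost_full = lambda dx: median_flow_error(dx, pts_3d, flow_uv, cam)
    return minimize(cost_full, np.r_[t1, imu_rpy], method='Nelder-Mead').x
```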
The resulting optimised differential pose estimate at time k (x_k), which is with respect to the camera coordinate system attached to the AUV, can then be transformed to a consistent coordinate system using the roll, pitch and yaw data from the IMU. In this investigation, a homogeneous transformation (T_H) of the camera motion is performed to determine the differential change in the world coordinate frame. The differential motion vectors are then integrated over time to obtain the overall vehicle position vector at time t_f such that

x_{t_f} = Σ_{k=0}^{t_f} T_{H_k} dx_k    (1)

It was observed during ocean trials that varying lighting and structure could degrade the motion estimation performance due to insufficient three-way matched features being extracted. Therefore, a simple constant-velocity vehicle model and motion limit filters (based on measured vehicle performance limitations) were added to improve the motion estimation and discard obviously erroneous differential optimisation solutions. A more detailed hydrodynamic model is currently being evaluated to further improve the predicted vehicle motion and to aid in pruning the search space and seeding the optimisation.

4 Experimental Results

The performance of the visual motion estimation technique described in Section 3 was evaluated in a test tank constructed at CSIRO's QCAT site and during ocean trials. The test tank has a working section of 7.90 x 5.10 m with a depth of 1.10 m. The floor is lined with sand-coloured matting with pebbles, rocks of varying sizes and large submerged 3D objects to provide texture and a terrain surface for the vision system. Fig. 4 shows the AUV in the test tank and at the ocean test site off Peel Island in Brisbane's Moreton Bay.

Fig. 4. AUV during visual motion estimation experiments: (a) CSIRO QCAT test tank; (b) ocean test site.

In the test tank the vehicle's vision-based odometry system was ground-truthed using two vertical rods attached to the AUV which protruded from the water's surface. A SICK laser range scanner (PLS) was then used to track these points with respect to a fixed coordinate frame. By tracking these two points, both position and vehicle heading angle can be resolved. Fig. 5 shows the vehicle's estimated position using only vision-based motion estimation fused with inertial information during a short survey transect in the test tank. The ground truth obtained by the laser tracking system is shown for comparison.

Fig. 5. Position estimation using only vision and inertial information in a short survey transect, with the ground truth obtained from the laser system (PLS) shown for comparison.

As seen in Fig. 5, the motion estimation compares very well with the ground truth, with a maximum error of approximately 2% at the end of the transect. Although this performance is encouraging, work is being conducted to improve the position estimation over greater transect distances. The ground truth system is not considered perfect (as seen by the noisy position trace in Fig. 5) due to the resolution of the laser scanner and the size of the rods attached to the vehicle causing slight geometric errors. However, the system provides a stable position estimate over time for evaluation purposes.

A preliminary evaluation of the system was conducted during ocean tests over a hard coral and rock reef in Moreton Bay. The vehicle was set off to perform an autonomous untethered transect using the proposed visual odometry technique. The vehicle was surfaced at the start and end of the transect to obtain a GPS fix and provide a ground truth for the vehicle. Fig. 6 shows the results of a 53 m transect as measured by the GPS.

In Fig. 6, the circles represent the GPS fix locations and the line shows the vehicle's estimated position during the transect. The results show that the vehicle's position was estimated to within 4 m of the GPS-measured end location, or to within 8% of the total distance travelled.
Given the poor water clarity and high wave action experienced during the experiment, these results are extremely encouraging.