ebook img

NASA Technical Reports Server (NTRS) 20020011670: The 3D Recognition, Generation, Fusion, Update and Refinement (RG4) Concept PDF

9 Pages·0.62 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview NASA Technical Reports Server (NTRS) 20020011670: The 3D Recognition, Generation, Fusion, Update and Refinement (RG4) Concept

i The 3D Recognition, Generation, Fusion, Update and Refinement (RG4) Concept. David A. Maluf, Peter Cheeseman, Vadim N. Smelyanskyi, Frank Kuehnel and Robin D. Morris. NASA Ames Research Center, Mail Stop 269, Moffet Field, CA 94035 [email protected] process, it is possible to have more than one foveation region. Abstract This research initiative is directed This paper describes an active (real towards descent imagery in connection time) recognition strategy whereby with NASA's Entry Descent Landing information is inferred iteratively across (EDL) applications. 3-D Model several viewpoints in descent imagery. Recognition, Generation, Fusion, We will show how we use inverse theory Update and Refinement (RGFUR or within the context of parametric model RG4) for height and the spectral reflection characteristics are in focus for generation, namely height and spectral reflection functions, to generate s model various reasons, one of which is the assertions. Using this strategy in an prospect that their interpretation will active context implies that, from every provide for real time active vision for viewpoint, the proposed system must automated EDL. refine its hypotheses taking into account the image and the effect of uncertainties as well. The proposed system employs 1 Introduction and Background probabilistic solutions to the problem of iteratively merging information (images) The period of the Entry, Descent and from several viewpoints. This involves Landing is the missions most critical feeding the posterior distribution from all period with the highest risk factor for a previous images as a prior for the next potential Loss of Vehicle (LOV). Since view. Novel approaches will be distant missions such as Mars are developed to accelerate the inversion constrained in payload and design, search using novel statistic NASA must employ technology to implementations and reducing the intelligently use all available resources, model complexity using foveated vision. optimally integrate sensor data and perform real-time decision and reason Foveated vision refers to imagery where for successful Entry, Descent and the resolution varies across the image. Landing. In this paper, we allow the model to be foveated where the highest resolution Understanding the importance of Entry region is called the foveation region. Descent and Landing is best illustrated Typically, the images will have dynamic by describing the critical phases of an control of the location of the foveation Entry, Descent and Landing process for region. For descent imagery in the a spacecraft. It is estimated that the Entry, Descent and Landing (EDL) spacecraft's descent from the time it hits w- the upper atmosphere until it lands About 70 to 100 seconds before landing takes no more than 4 minutes and a few a landing radar will be activated. To this seconds to accomplish the final landing end, we anticipate to having our as in the case of the Mars Polar Lander. proposed 3-D Model Recognition, Enabling technologies such as active Generation, Fusion, Update and vision can continually operate and Refinement (RGFUR or RG4) to include integrate the vision system to actively radar readings and other sensor interpret images for enhanced model modalities (gyros and inertia guidance). recognition which can play a crucial role The radar will be able to gauge the in mitigating major risk factors. spacecraft's altitude about 40 seconds after it is turned on, at an altitude of We estimate that the period where on- about 1.5 miles above the surface. With board intelligent systems can start a robust RG4 system, the spacecraft capturing the landing site's topographic can rely on the on-board camera for details starts about two minutes before final touch down. landing and the spacecraft is expected to be moving at about 1,000 miles per hour around 5 miles above the surface. Entry, descent and land;ng and is an advance over, the work in [10] in a number of ways. In this paper we argue for a unified model of the surface of interest, with all observations aimed 2 Similar Work and Comparison at building up knowledge of this model, in contrast to an approach that builds up a model piecewise and in a manner Johnson's work described in [10] dependent on the detection of features addresses the problem of autonomous in the images. We also propose doing operation close to a small body. The absolute location relative to the entire work described in our paper differs from, surface model, an approach that is much more robust and accurate than We are investigating a Bayesian model- location relative to a small number of based approach to integrating landmarks. It also does not rely on the information from multiple images of the presence of explicit landmarks on the same area into a unified model at a object, but instead uses the entire resolution higher than that of the surface essentially as one, extended contributing images (super-resolution). landmark. Finally, the approach we This model is a representation of the advocate gives explicit uncertainty physical parameters describing the estimates of the surface and position; surface. The physical parameters we the work in [10] provides uncertainty use are heights at each grid point and estimates by running Monte Carlo the surface reflectance properties at simulations. After all, a typical risk each grid point, such as albedo (for a associated with the landing process is to Lambertian reflectance model) or more be able to resolve the surface to the generally a parameterized bi-directional level of details and be capable of reflectance distribution function (BRDF). avoiding a boulder, a ditch or a crack Each image is an independent sample which could result in a Loss of Vehicle of the area of interest, and by combining (LOV). the information from these separate images, surface features smaller than the image pixel scale can be captured. 3 Research Objectives Because the model is constructed at finer resolution than any image, it is possible to use it to accurately project The ambition of this paper in active what that surface would look like from vision is to continually operate and any view point, under any lighting integrate a vision system that can conditions. This projection is computed actively interpret images for enhanced by summing the contribution from each model recognition. The proposed surface patch onto each synthesized approach exploits super-resolution image pixel, weighted by the camera techniques [3][4] and focus of attention point spread function (PSF). This (foveated vision) to enable better model projection process is called rendering in recognition in descent imagery. computer graphics, and the realism achieved by current computer graphics This research initiative is directed indicates the viability of accurate image towards descent imagery in connection projection from a surface model. with NASA's Entry Descent Landing (EDL) applications. 3-D Model The essence of super-resolution in RG4 Recognition, Generation, Fusion, is to use Bayesian inference to invert Update and Refinement (RGFUR or the image rendering process. That is, in RG4) for height and the spectral reflection characteristics are in focus for rendering, the surface and its reflectance properties are assumed various reasons, one of which is the known, as is the location and properties prospect that their interpretation will of the camera and the lighting source provide for real time active vision for automated EDL. (typically the sun), and this information is used to generate an image under those conditions. In the Bayesian model-based inference process, the 4 Model Recognition, Generation, rendering process is reversed. That is, Fusion, Update and Refinement given the images, we find the most likely (RG4) and Super-Resolution surface that would have generated them. bottom. With two images only (similar to The model would consist of a the one on the left), the right image discretized grid covering the area of contains more detailed features (The interest, where each grid point stores right image is inferred from 3D model). the geophysical parameters of the correspondinggroundlocation. These NASA has developed this process of parametersmainlyincludeelevationand model-based inversion over the last few reflectance spectral characteristics. years, starting from the simple 2-D This modelis chosenso thatwhatthe models, and working up to the full 3-D camera is expected to see can be surface reconstruction problem [3][4]. projectedfromthemodel. Modelupdate We are now able to super-resolve the consists of comparing the expected heights and albedos of the true surface pixel values with the observed, and from multiple images, where the images changing the model to better fit the data can be taken from any viewpoint and (including previous data). This update under any lighting conditions. On will be accomplished by computationally artificial images generated from the efficient Bayesian inference that inverts model, we are able to reconstruct the the image rendering process as used in surface to essentially the noise level of computer graphics. The search for the the data. most likely surface will be performed by a novel type of gradient descent, where 4.1 Research the gradient is computed analytically. Super-resolution is a very useful product for the Entry, Descent, and Landing process where the resolved model is beyond what can be extracted using the best available image. The main reason for developing the super-resolution capability is to allow the integration of information from different images without the problem of aliasing and mismatched pixel grids. Super- resolution solves this problem because any pixel maps onto many ground points, so that intensity of any pixel can be accurately computed by summing up the corresponding ground points. In fact, the surface model becomes the repository of the pixers information, so that a system does not need to have multiple images persistent in its memory, but rather a model. EDL processes and post processes will thus interact with the surface model, and can view it from any direction or under any lighting conditions, including viewpoints that were not originally available! Figure 1 Top image is one of the two In implementing this research, we images taken from Clementine imagery extend 3-D super-resolution algorithms to super-resolve the image on the to solve a number of technical problems that arise in this application. In right image contains crisp detailed particularwewill findworkablesolutions features. The bottom plot is the surface to the following problems using the inferred from the images (not shown is approachoutlinedbelow. the albedo field). Shape from Motion A main objective of RG4 is to achieve a surface inference in "real time". To that extend we obtain a fast shape from motion alogrithm which can feed itself as prior knowledge. Standard "shape from motion" algorithms [1] maintain the assumption of constant surface reflectance properties and are not extendable in nature to super-resolution. We plan to use our new shape-from- motion technique to "bootstrap" a super-resolution inference for natural surface formation where varying albedo properties and shadows are correctly accounted for. Mufti-Spectral Integration EDL on-board instruments have multiple spectral bands will have different coverages, i..e. different widths and ranges. Our approach to solving the problem posed by integrating this heterogeneous information is to consider the model's surface by a wavelength dependent reflectance. That is, instead of a single number to represent the (Lambertian) surface •. ....'.....,...-..: reflectance for a particular band, we will represent the reflectance as a "smooth" function of wavelength, where the function is represented by a small | number of coefficients that are { estimated from the data. This function can then be integrated with each band spectral response function (a property of the instrument) to get the expected reflectance for that band. 30 20 40 30 20_ 40 x_JOm_$ Inf_'r_l M_ _,h _2trnaqes Super-Resolution Figure 2: Top image is one of the twelve synthetic images of Silicon Valley area One of the major achievements in this used to super-resolve the second research is the method to achieve a image. With twelve images only, the recursive linear minimization as part of the desired inference for three- in the resolution of pixel/model dimensional surface reconstructionto relationships. the extentthatthe resolutionof inferred surfacemeshis higherthanthe spatial resolution of input images. This technique also allows images to be super-resolved in both two or three dimensions(accordingto the natureof thedata). Accelerated Search In statistical inference scheme, the solution for the gradient step in linear minimization for large sparse linear systems for which direct methods such Figure 3: Left image is our planned as Conjugate Gradient is expensive in 'spider-web' type mesh with a foveated terms of both time and storage cost. For center (not necessary centered in the the class of descent imagery problem of middle). Right image is a typical non- using Bayesian inference for 3-D model uniform grids. parameter estimation, we plan to use a novel iterative technique which solves We extend 3-D surface models to the problem of search minimization foveated models using traditional efficiently in terms of storage and triangulated surface, but distribution of memory cost. This novel technique the heights would no longer be tied to a takes root in a recent discovery for a uniform grid but to Foveated model model prior which reduces the (Figure 3.a). This extension is not covariance matrix complexity from a difficult in principle, but the changed quadratic to a linear representation. As representation affects triangle indexing, a result, the amount data will be and so affects efficiency. relatively linear to the size of the model which is essential especially in a scarce computing environment. 4.2 Active Recognition: Concepts and Technical Aspects Foveated Vision The key idea behind active recognition We also support a Foveated Vision in a sequential recognition strategy is capability with variable resolution--that that of improving interpretation by is, the surface triangles may be very accumulating evidence in real time. The small in some areas (super-resolved) important aspect in the Entry Descent and very coarse in other areas (under- and Landing recognition problem is to resolved). The primary value of foveated compute on-line a 3D model from vision is in the model reconstruction sensory data linked to the different where high resolution information is sensor hardware which support the transmitted in the regions of the image different phases in descent process that are selected as important. On the (e.g. different cameras, FOV, RADAR, other hand, low resolution information is LADAR, altimeters, gyros etc.). processed at a second stage under contraints (e.g. time and computing It is clearly understood that the image resources). Foveated vision is crutial in resolution in the early stage does not descent imagery and will enable control guarantee enough information either for quantitative or for qualitative model reconstruct the height field h(r) without recognition,Butacquiringuncertainties the knowledge of the boundary serves to condition prior expectations conditions, which are directly obtained about the model and establishes a by the other sensor modalities and; in quantitative representation. particular, the radar readings at a later stage. With single radar readings (initial Practically, a meaningful qualitative condition), the height field h(r) is readily recognition for a 3D-model reconstructed. reconstruction can be achieved after only a few sequences of images have 5 Recursive Super-resolution been collected. To achieve a quantitative recognition the 3D model Our current and existing super- recognition is optimally obtained by resolution system can address many computing the probability of the 3D problems: the images may be of model given the image sequences or differing resolutions (e.g. multiple concurent cameras); the surface albedo is not assumed constant; the density of the model is user and data driven. Where hi and p_ are the parameters of a height field and an albedo field. The model that we are trying to infer is defined to be the topology and At different stages along the descent reflectance properties of the surface process, image sequences with small being observed. For simplicity we frame-to-frame camera motion can be define the surface over a grid of points, treated actively to provide an early 3D and currently define a height value, hi model. This real time behavior and an albedo value, p_ at each grid leverages from small motions, which point. Bayes' theorem then states that minimizes the correspondence problem to infer values for the heights and between successive images and the albedos from the image data, we use knowledge of the camera trajectory. the expression However, this sacrifices depth resolution of the small baseline between p(h, p l[_...I.) = p(l_.../. Ih,p)p(h,p), consecutive image pairs [9]. Solution to this problem is trivially sought through a which states that the posterior probabilistic incremental integration (e.g. distribution of the heights and albedos is Kalman Filter). In this particular active proportional to the likelihood - the recognition, we will employ a matching probability of observing the image data, and extraction technique which takes l, given the current values of the advantage of the lateral motion of the heights and albedos - multiplied by the camera and transforms the search prior distribution over the model. problem to a one-dimensional search problem (search is limited to foveated To the extent of super-resolution, we region). make the assumption that the likelihood is due to zero mean Gaussian errors Shape from shading-derived techniques between the observed images, I, and provides gradient vector fields of the the images synthesized from the model, surface 7h(r) and can be readily obtained in "real-time" from a single image source under very simplifying p(l,...I, Ih,p)= II Exp[-½( !_-i(h'p)')2]: assumptions. Our approach is to ,_(h,p), resulting in the likelihood being For an Entry and Descent real-time process, strong prior about the surface model is highly desirable and therefore we plan to extend the super-resolution where the product is taken over all technique to include the shading pixels in all the images in the data set. information Vh,. Bayes' theorem then The prior used is based on penalizing states that to infer values for the heights the curvature of the surface. It is a and albedos from the image data as well penalty encouraging a continuity in the from the slopes, we use the expression inferred surface. p(h.p IIt...l. ) = p(l,.../, Ih,p)p(h,p) p(Vh, - Vh )p(h, - h) Because the likelihood is a function of the images synthesized from the model, Here, it shall be remarked that hs and it is clearly a non-linear function of the Vh, are independent prior information heights and albedos, and this makes obtained separately (i.e. shape from optimizing the posterior distribution motion and image to surface gradient difficult. However, we have found that mappings). Furthermore, hs will be an optimal solution to the nonlinear function can be obtained by a novel obtained directly from a fast shape from Conjugate Gradient (CG) search. motion method. Using the form of the prior in the previous equation makes it ^ feasible to account for uncertainties in We expand l(h,p) about the current the independent measurements of hs estimate, ho,P0, and replace it by and Vh,. In addition, we plan to use the prior hs to integrate the radar and other altimeter readings whenever they become available. We therefore "bootstrap'" the inference of the actual where D is the matrix of derivatives height field and albedos. Potentially, this evaluated at h0,P0 leaves us with the advantage of rewriting the Bayesian inference process on the deviation (fluctuation) onpixeli between the prior and height field rather D_.j = oqaeight(or albedo)j than the height field itself, thus the parameters will be The minimization of the log-posterior then becomes the minimization of a Sh = h- hs, quadratic form, and can be performed using the conjugate gradient method• and are believed to be small, such that a This minimization finds the minimum of fast convergence of the inference the local linear approximation. At the process can be guaranteed. minimum, we recompute ,/(h,p) and D and minimize the log-posterior iteratively. 6 Final Remarks An operational software system based 5.1 The RG4 system: embedding on this proposed demonstration system stronger prior would use images to update the surface model as soon as they are received. The Bayesian approach gives a solution to the problemof how muchthe prior principles and practice. Second modelshouldbebelievedwhenthenew Edition, Addison-Wesley, 1990 datadisagreeswiththe priormodel.Not 6. M. Armstrong, A. Zisserman and R. onlydoesthis allowmodelupdatewhen Hartley. "Self-Calibration from image thereis conflictinginformation,butitcan triplets". In Proceedings of the also serve as a change detection European Conference on Computer warning system. This is possible Vision, pp 3-16, 1996 because the model projects expected 7. Q. Zheng and R. Chellappa. values. If measurements are many Estimation of Illuminant Direction, standard deviations from expectations, Albedo and Shape from Shading. then it is a signal for likely change. IEEE Transactions on Pattern Analysis and Machine Intelligence, Another planned operation of the RG4 Vo113, No 7, July 1991. system is to use the constructed model 8. L. Matthies, T. Kanade, R. Szeliski. as a topographical map after the landing Kalman Filter-based Algorithms for phase. The super-resolved model can Estimating Depth from Image be employed to focus the desired Sequences, International Journal of exploration phase of the mission. Computer Vision, Vol. 3, pp. 209- Models constructed from altitudes will 236, 1986. provide a much wider scope in the 9. Y. Ohta, T. Kanade, Stereo by Intra- landing site topography. and Inter- Scanline search Using Dynamic Programming, IEEE Trans. PAMI, Vol. 7 pp. 139-154. 10. A.E. Johnson, Y. Cheng and L.H. References Matthies, Machine Vision for Autonomous Small Body Navigation, 1. B. Horn and M. Brooks. Shape from tn Proceedings of IEEE Aerospace Shading. MIT Press, 1989 2000 Conference. 2. Z. Zhang, R. Deriche, O. Faugeras and Q.T. Luong. A robust technique for matching two uncalibrated images through the recovery of the epipolar geometry. Technical report No 2273, INRIA, Sophia Antipolis, 1994 3. Morris, R. D., Cheeseman, P., Smelyanskiy, V. N., Maluf, D. A., A Bayesian Approach to High Resolution 3D Surface Reconstruction from Multiple Images,Proceedings of IEEE Signal Processing workshop on Higher- Order Statistics, June, 1999, pp. 140-143. 4. Smelyanskiy, V. N., Cheeseman, P., Maluf, D. A., Morris, R. D., Bayesian Super-Resolved Surface Reconstruction, Computer Vision and Pattern Recognition, 1999. 5. J.D. Foley, A van Dam, S.K. Feiner and J.F. Huges, Computer Graphics,

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.