METHOD FOR DETECTING POINTS OF INTEREST IN A DIGITAL IMAGE
Abstract:
A camera (10) produces a sequence of images (12) processed by an algorithm (14) for finding points of interest, which can be parameterized by a detection threshold (τ) such that the number (N) of points of interest detected in the image varies depending on the level of the threshold. The characteristic giving the number (N) of points of interest detected as a function of the threshold (τ) is modeled by a decreasing exponential function in square root, dynamically parameterizable by values related to the image to be analyzed. The method comprises steps of: a) determining (18) the parameter values of the decreasing exponential function for the current image; b) predicting (18), for this current image, an optimum value of the threshold by using the modeled characteristic, parameterized with the values determined in step a); and c) applying (14), for at least one subsequent image, the point of interest search algorithm with the optimal threshold value (τ) calculated in step b).
Publication number: FR3019924A1 Application number: FR1453161 Filing date: 2014-04-09 Publication date: 2015-10-16 Inventor: Gaspard Florantz Applicant: Parrot SA
Description:
[0001] The invention relates to the processing of digital images, and more precisely to the detection in such an image of "points of interest" or "corners", in particular for processing intended for robotics and computer vision systems. [0002] The "corners", "points of interest" or "landmarks" are small elements of an image that have invariance characteristics allowing them to be found from one image to the next. The analysis of the displacements of the points of interest in the image is used by various algorithms for shape recognition, analysis of camera displacement, reconstruction of a two-dimensional or three-dimensional space, etc. The basic idea is that a point of interest detected in an image often remains a point of interest in the following images, so that the analysis of the dynamic variations of the image is reduced to the analysis of the changes in position of the points of interest, that is, to the analysis of an evolving list of points of interest from which relevant descriptors are extracted. Satisfactory detection of the points of interest in the image is therefore an essential prerequisite for any processing of this type. The detection algorithm used must moreover be both robust and resource-efficient if it is to be implemented "on the fly" by an embedded system, with recognition of points of interest and extraction of descriptors in real time or near real time. Various algorithms for detecting points of interest have been proposed, among which may notably be mentioned FAST (Features from Accelerated Segment Test), SURF (Speeded-Up Robust Features) and SIFT (Scale-Invariant Feature Transform). The FAST algorithm is described in particular by: [1] E. Rosten, R. Porter, and T. Drummond, "Faster and Better: A Machine Learning Approach to Corner Detection", IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 32, pp. 
An improved version of such a FAST algorithm is described in WO 2013/055448 A1, which discloses a method for efficiently detecting and classifying points of interest in a digital image. [0003] For the SIFT algorithm, reference may be made for example to: [2] D. G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints", Int. J. Comput. Vision, Vol. 60, No. 2, pp. 91-110, Nov. 2004. And for the SURF algorithm, to: [3] H. Bay, A. Ess, T. Tuytelaars, and L. J. V. Gool, "Speeded-Up Robust Features (SURF)", Computer Vision and Image Understanding, Vol. 110, No. 3, pp. 346-359, 2008. These algorithms can all be parameterized by a value called the "detection threshold", which sets the sensitivity of the detection of points of interest. Essentially, the current image is scanned pixel by pixel and the algorithm determines, for each pixel analyzed, whether or not a score based on predefined criteria is greater than the detection threshold supplied as an input parameter. Depending on whether the score is above or below this threshold, a point of interest is or is not considered to be present at that pixel. Therefore, with a low detection threshold, the algorithm provides a high number of points of interest, but with the risk of delivering many unnecessary points of interest, which will not be found in the following images and will therefore be penalizing in terms of computing power used. Conversely, a high detection threshold eliminates many irrelevant points of interest that do not have the desired characteristic of invariance from one image to the next, but at the risk of not detecting certain useful points of interest, for example in areas of the image with insufficient contrast, too much texture, etc. An insufficient number of points of interest can also have a negative impact on the results of downstream processing. 
The object of the invention is to improve the performance of these point of interest detection algorithms by dynamically modifying their parameterization, but without changing their internal operation. The invention will thus be applicable to a very large number of point of interest detection algorithms, particularly to a FAST-type algorithm, for which it will provide particularly significant advantages. This application to a FAST algorithm is, however, in no way limiting, the invention being applicable to other types of algorithms as long as they can be parameterized with an adjustable detection threshold, as is the case in particular for the SURF and SIFT algorithms. The basic idea of the invention is to perform, prior to the detection of points of interest itself, a preliminary analysis of the current image and to determine dynamically, according to the result of this analysis, the value of the detection threshold that will be used for the detection of the points of interest in the next image. In other words, the detection threshold will no longer be a fixed parameter, set empirically or possibly modified by a feedback loop, but an input of the detection algorithm that is dynamically modified at each image, in order to better adapt the response of the detection algorithm to the rapid variations of the image of the scene captured by the camera. The technique of the invention aims in particular to maintain the number of points of interest detected by the algorithm at a substantially constant level, chosen a priori as optimal, even in the presence of a rapidly changing scene. A typical example of a changing scene is that captured by an on-board camera in a car, which delivers video sequences with very large and very rapid variations in the image content as well as in its brightness and contrast, depending on the presence or absence of roadside buildings, the appearance of these buildings, passage through tunnels or under bridges, glare from an oncoming vehicle, etc. 
In this respect, the invention seeks to solve two specific problems encountered with most point of interest detectors: - the first problem is that of temporal instability: with a heterogeneous video sequence, the algorithm detects a very variable number of points of interest from one image to another, depending on the contrast, texture, illumination, etc. of the images. Concretely, depending on the images of the sequence, one can end up with a high or even excessive number of points of interest, which is very penalizing in terms of the computing resources used for the analysis of this image, or conversely with too few points of interest, leading to degraded performance for the processing operations downstream of the point of interest detector; - the second problem is that of spatial instability: even if, overall, the number of points of interest detected in an image is sufficiently high and substantially constant, the detections can be distributed very unequally, with very few points of interest in some areas of the image and many in others, which will have the same negative impact as before on the performance of downstream processing. [0004] One solution is to use a fixed detection threshold, low enough to generate an acceptable number of points of interest in all circumstances, especially in low-contrast scenes. If the number of detections is excessive, the supernumerary points of interest are then filtered out according to their degree of relevance. [0005] This solution, however, involves a high cost in terms of computing resources, due to the increased processing time, the FAST algorithm being particularly affected by this phenomenon when the chosen threshold is too low. Moreover, although this solution allows the output data to be stabilized in terms of the number of points of interest, it is not, fundamentally, an adaptive method. 
Another way of proceeding is to vary the threshold by providing a simple proportional feedback loop between the current detections and the desired result, as described in particular by: [4] A. Huang, N. Roy, A. Bachrach, P. Henry, M. Krainin, D. Maturana, and D. Fox, "Visual Odometry and Mapping for Autonomous Flight Using an RGB-D Camera," Proc. of the International Symposium of Robotics Research (ISRR), 2011. Although this solution adapts to the content of the image, it has two serious limitations: - First, it cannot cope with sudden variations of the image, either in terms of the content of the scene or in terms of overall brightness (for example when entering or exiting a tunnel). In addition, automatic camera exposure correction systems tend to aggravate this problem, since these systems assume a linear sensor response and are also based on average image statistics, which do not reflect the reality of the variations of the scene; - Secondly, to operate the feedback, these systems presuppose a linear relationship between the detection threshold and the number of points of interest detected by the algorithm, which is a simplifying assumption that does not lead, in practice, to truly satisfactory results with heterogeneous image sequences. The invention proposes a new technique for detecting points of interest that overcomes the above disadvantages by regulating the detection threshold of the algorithm dynamically, accurately and efficiently. 
The invention in particular provides the following advantages:
- regulation of the calculation time of the detection, which remains substantially constant per image, even in the presence of very heterogeneous image sequences;
- reduction and stabilization of the average calculation time of the algorithm downstream of the point of interest detector (algorithm for analyzing camera displacement, mapping, etc.), insofar as this algorithm will process a substantially constant number of points of interest as input;
- automatic adaptation to the complexity of the scene, making it possible in particular to keep a sufficient number of detected points of interest when the scene is less textured (the typical example being a car passing through a tunnel);
- increase in the rate of relevant points of interest, that is to say points of interest that can be found from one image to the next: in fact, the points of interest detected by the algorithm according to the invention are, on average, significantly better from this point of view, so that subsequent calculation operations can be performed more efficiently and faster;
- excellent distribution of the points of interest over the different areas of the image, also leading to better efficiency of the algorithms downstream of the detection;
- simplicity of adjusting the desired optimal number of points of interest, allowing easy porting of the algorithm from one platform to another and adaptation to different versions (e.g. professional version and consumer version).
The starting point of the invention is the observation that, for any image and whatever the scale at which this image is considered (fine or coarse), the characteristic giving the number N of points of interest detected for a given detection threshold τ (reduced to a quantity homogeneous to the pixel) is well represented by a decreasing exponential function, notably a decreasing exponential function in square root of the type: $N(\tau) = C \exp\left(-\sqrt{\tau}/\sigma\right)$ where C and σ are parameters related to the current image. 
Since such a function requires only the determination of two parameters (C and σ), such a model will be very fast to compute, because two iterations of the detection algorithm, for two different values of the threshold τ, will be sufficient, with the advantage that the second iteration can be applied only to the points of interest selected by the first iteration, and therefore even more quickly. Once the parameters of the exponential model N(τ) have thus been determined, this model can be used on the following image to determine the threshold τ to be applied to the detection algorithm, as a function of a number N of points of interest fixed a priori and considered optimal. This approach, which consists in choosing a priori the number of points of interest that the algorithm must deliver and in determining, according to this number, the detection threshold to be applied, is the opposite of the conventional approach, which consists in setting the detection threshold a priori (possibly adjusted by a feedback loop) and collecting at the output a number N of points of interest, a number that can vary considerably if the sequence of images is particularly heterogeneous. [0006] The technique of the invention may be implemented in various ways, which will be described in more detail below, making it possible in particular to reduce very substantially the phenomena mentioned above of temporal instability and spatial instability of the detection of points of interest. It will also be seen that these particular implementations can be operated without substantially increasing the computing resources and the processing time, and are therefore almost "free" in terms of processing cost. [0007] More specifically, the invention proposes a method for detecting points of interest in a digital image of a sequence of images of a scene captured by a camera. 
This method implements, in a manner known per se, an algorithm for finding points of interest that can be parameterized by a detection threshold such that the number of points of interest detected in the image varies as a function of the level of the threshold. According to the invention, this method provides for modeling, by a decreasing exponential function, the characteristic giving the number of points of interest detected as a function of the threshold, this exponential function being dynamically parameterizable by values linked to an image to be analyzed. The method comprises steps of: a) determining, for a current image, parameter values of the decreasing exponential function; b) predicting, for said current image, an optimum value of the detection threshold by using the modeled characteristic, parameterized with the values determined in step a); and c) applying, for at least one image following said current image, the point of interest search algorithm with the optimal detection threshold value calculated in step b). The optimum detection threshold value predicted in step b) may in particular be a value corresponding to a given number of points of interest, as indicated by said characteristic giving the number of points of interest detected as a function of the threshold. Very advantageously, the decreasing exponential function is a decreasing exponential function in square root: $N(\tau) = C \exp\left(-\sqrt{\tau}/\sigma\right)$, τ being the value of the detection threshold, reduced to a quantity homogeneous to the pixel, N being the number of points of interest detected for a threshold τ, and C and σ being said parameter values related to the current image. 
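As an illustration of step b), the square-root exponential model and its inversion can be sketched in a few lines of Python. This is a minimal sketch, not the patented implementation: the function names `predicted_count` and `threshold_for_count` are hypothetical, and the parameters C and σ are assumed to have already been determined in step a).

```python
import math


def predicted_count(tau, c, sigma):
    """Model N(tau) = C * exp(-sqrt(tau) / sigma)."""
    return c * math.exp(-math.sqrt(tau) / sigma)


def threshold_for_count(n_target, c, sigma):
    """Invert the model: threshold at which it predicts n_target points.

    Solving n_target = C * exp(-sqrt(tau) / sigma) for tau gives
    tau = (sigma * ln(C / n_target)) ** 2.
    """
    if not 0 < n_target <= c:
        raise ValueError("n_target must lie in (0, C]")
    return (sigma * math.log(c / n_target)) ** 2
```

Feeding the threshold returned by `threshold_for_count` back into `predicted_count` returns the requested number of points, which is the property step c) relies on when the threshold is applied to the following image.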
In a particular embodiment of the invention, step a) of determining, for a current image, the parameter values of the exponential function comprises the following substeps: a1) a first search for points of interest in the current image by the search algorithm with a first predetermined value of the detection threshold, resulting in a first number of points of interest; a2) at least one second search for points of interest in the current image by the search algorithm with a second value of the detection threshold greater than the first predetermined value of the detection threshold, resulting in a second number of points of interest; and a3) determining the parameter values C and σ from the numbers of points of interest obtained in steps a1) and a2). The method may advantageously comprise, during the iterative execution of steps a) and b), the application of auto-exposure information from the camera as complementary input data for the prediction of the detection threshold. According to another particularly advantageous aspect of the invention, the method further comprises: the division of the current image into a plurality of sub-images of reduced size; the execution of steps a) and b) independently for the different sub-images of the current image, resulting in an optimum detection threshold value specific to each sub-image; and the execution of step c) with the application, for each sub-image of the following image, of the point of interest search algorithm with the respective optimum detection threshold value specific to this sub-image. The optimum value of the detection threshold may in particular be a value corresponding to the same predetermined number of points of interest for all the sub-images. 
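Substeps a1) and a2) can be illustrated with a toy stand-in for the detector. In the sketch below, `scores` is a hypothetical list of per-pixel detector responses and a point is counted when its score exceeds the threshold; a real FAST, SURF or SIFT detector would replace this comparison. The point of the sketch is only that the second pass, run at a higher threshold, needs to re-examine just the survivors of the first pass.

```python
def count_two_thresholds(scores, tau1, tau2):
    """Return (N1, N2) for two thresholds with tau1 < tau2.

    `scores` is a hypothetical stand-in for per-pixel detector responses.
    """
    if tau2 <= tau1:
        raise ValueError("tau2 must be greater than tau1")
    # a1) first search over every candidate, at the lower threshold
    first_pass = [s for s in scores if s > tau1]
    # a2) second search restricted to the points kept by a1)
    second_pass = [s for s in first_pass if s > tau2]
    return len(first_pass), len(second_pass)
```

Because any point that passes the higher threshold also passes the lower one, restricting the second search in this way does not change N2; it only reduces the work needed to obtain it.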
In a particular form of implementation of this second aspect of the invention, the method comprises: - prior to step a), the production of a pyramid-type multiresolution representation of images, modeling the current image of the scene captured at different, successively increasing resolutions; and - the iterative execution of steps a) and b) for each level of the multiresolution representation, starting with the level of lowest resolution, the detection threshold value determined for a given level being applied as input data for the prediction of the detection threshold at the higher resolution level. According to various other advantageous features of this second aspect: - the execution of step c) comprises the application, for each level of the multiresolution representation of the following image, of the point of interest search algorithm with the respective optimum detection threshold value specific to this level; - the iterative execution of steps a) and b) comprises, for each level of the multiresolution representation, the application, as complementary input data for the prediction of the detection threshold at the higher resolution level, of the optimal thresholds and applied thresholds corresponding to the lower resolution levels and/or of auto-exposure information from the camera; - the method further comprises: dividing the images at the different levels of the image pyramid into a plurality of sub-images of reduced size; performing steps a) and b) independently for the different sub-images, resulting in an optimal detection threshold value specific to each sub-image; and executing step c) by applying, for each sub-image of the next image, the point of interest search algorithm with the respective optimum detection threshold value specific to that sub-image. 
The optimum value of the detection threshold may in particular be a value corresponding to the same predetermined number of points of interest for all the sub-images. An embodiment of the invention will now be described with reference to the accompanying drawings, in which the same references denote identical or functionally similar elements from one figure to the next. Figure 1 schematically illustrates the context of the detection of points of interest and the sequence of the different processing operations. Figure 2 shows a pixel with its neighborhood, to illustrate a technique for detecting points of interest. Figure 3 is a plot giving the number of points of interest detected for an example of an image sequence, in the case of a conventional technique and with the technique according to the invention. Figure 4 illustrates the decreasing exponential model used according to the invention to relate the number of points of interest detected to the detection threshold of the algorithm. Figure 5 is an illustration of the pyramid-type multiresolution representation of images, for the same current image considered at different resolutions. Figure 6 illustrates, in block diagram form, the sequence of the different steps of the detection of the points of interest with a multiresolution approach. [0008] Figure 7 illustrates the characteristic giving, at different scales of resolution, the number of points of interest as a function of the detection threshold, for both the theoretical model and the real data. FIG. 8 illustrates the variations of the number of points of interest, for several resolution levels, respectively for a conventional technique, for a technique according to the invention with only temporal prediction, and for a technique according to the invention with both temporal and spatial prediction. FIG. 
9 illustrates, for different levels of resolution of the image, the variations in the number of points of interest detected respectively for a conventional technique with a constant threshold and for a technique according to the invention, as well as the corresponding observed variations in the detection threshold in the latter case. Figure 10 illustrates an improvement according to a bucketing technique, which divides the image into a plurality of zones to which the algorithm of the invention is applied independently for each zone. In FIG. 1, reference numeral 10 denotes a camera capturing a scene and delivering a sequence of digital images 12 at times t-1, t, t+1. Each of these images is processed by a point of interest detection algorithm 14, which receives the pixels of the current image as input and outputs identification data for the points of interest, these points of interest being N in number. The N points of interest delivered at the output of the detection algorithm 14 are input to an analysis algorithm 16, which studies the displacements of the points of interest (tracking) over the successive images so as to produce localization data (for the displacement of the camera) and to build a map of these displacements, stored in memory, for example to reconstruct a three-dimensional mesh of the space traversed, add augmented reality elements, etc. These algorithms 16, which are typically Simultaneous Localization and Mapping (SLAM) type algorithms, do not form part of the invention, which only concerns the prior detection of the points of interest (block 14). The algorithm 14 is parameterized by a detection threshold τ, which makes it possible to adjust the sensitivity of the algorithm. This parameter is present in a large number of point of interest detection algorithms. [0009] Taking the FAST algorithm as a (non-limiting) example, and with reference to FIG. 
2, this operates, for each pixel p of the image, an analysis of the pixels {x1, ..., x16} located approximately on a circle surrounding the pixel p considered. The following decision function is used, for each pixel position x analyzed: $S_{p \to x} = d$ if $I(x) \le I(p) - \tau$ (darker); $S_{p \to x} = s$ if $I(p) - \tau < I(x) < I(p) + \tau$ (similar); $S_{p \to x} = b$ if $I(p) + \tau \le I(x)$ (brighter). The detection threshold τ makes it possible to adjust the discrimination threshold used to determine whether the brightness I(xi) of the pixel xi of the neighborhood considered is significantly higher (b), lower (d) or similar (s) with respect to the brightness I(p) of the central pixel p considered. A point of interest is considered to be present if a contiguous number m ≥ M (for example M = 9) of pixels of the set {x1, ..., x16} are all of type d or all of type b. As will be easily understood, with such a FAST algorithm (and, likewise, with other detection algorithms such as SURF, SIFT, etc.), for a given threshold τ a greater number N of points of interest is obtained with a high-contrast image than with a homogeneous image, these variations typically being in a ratio of the order of one to ten. On a heterogeneous image sequence, the number N can vary considerably from one image to the next (temporal instability), as well as between different zones of the same current image (spatial instability). In FIG. 3, characteristic A illustrates an example of variation of the number N of points of interest detected by a conventional FAST algorithm, for a sequence of images captured by the front camera of a car traveling in an urban environment: in some cases, the number N of points of interest detected can reach or even exceed 7000 to 8000 points of interest, while in other cases (for example the region T corresponding to the crossing of a tunnel) this number can suddenly drop below 2000 points of interest, or even below 1000 points of interest, and then abruptly rise again at the exit of the tunnel (zone S) to values of 7000 to 8000 points of interest. 
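The segment test described above can be sketched as follows. This is a simplified illustration, not the optimized FAST implementation of reference [1]: the helper names `classify` and `is_corner` are hypothetical, `ring` is assumed to hold the 16 brightness values sampled in order around the circle, and the contiguity check wraps around the end of the ring.

```python
def classify(center, neighbor, tau):
    """Label a ring pixel as darker ('d'), similar ('s') or brighter ('b')."""
    if neighbor <= center - tau:
        return "d"
    if neighbor >= center + tau:
        return "b"
    return "s"


def is_corner(center, ring, tau, min_run=9):
    """True if at least min_run contiguous ring pixels are all 'd' or all 'b'."""
    labels = [classify(center, value, tau) for value in ring]
    for kind in ("d", "b"):
        run = best = 0
        # The ring is circular: scan it twice so runs crossing the end are seen.
        for label in labels + labels:
            run = run + 1 if label == kind else 0
            best = max(best, run)
        if min(best, len(labels)) >= min_run:
            return True
    return False
```

Raising `tau` turns more ring pixels into type s, so fewer pixels pass the test; this is the mechanism by which the number N of detections decreases as the threshold grows.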
As explained in the introduction, these extreme variations in the number of points of interest detected considerably degrade the overall performance of the algorithms operating downstream (block 16 in FIG. 1) and unnecessarily occupy significant computing resources. The technique of the invention consists in dynamically and adaptively modifying the detection threshold τ, by analyzing the current image (block 18) and applying the threshold τ thus determined to the detection of the points of interest in the next image (block 14). [0010] The algorithm of the invention is based on a modeling of the characteristic giving the number N of points of interest detected as a function of the threshold τ. Typically, this modeling is performed by a decreasing exponential function, in particular a decreasing exponential function in square root of the form: $N(\tau) = C \exp\left(-\sqrt{\tau}/\sigma\right)$, τ being the value of the detection threshold, reduced to a quantity homogeneous to the pixel: indeed, depending on the detector, the conventionally defined threshold can be a quantity homogeneous to the square of the pixel (for example for a SURF detector), or even to the fourth power of the pixel (Harris detector). In this case, it is first necessary to reduce the threshold to a quantity homogeneous to the pixel or, equivalently, to modify the above function, for example by replacing the "square root" function with the "fourth root" function for a SURF detector. It should be noted that this model only requires the determination of two parameters (C and σ) and that it is therefore possible to carry out a rapid and simple interpolation of the function from only two measurements, that is to say two iterations of the point of interest detection algorithm. [0011] The decreasing exponential function N(τ) above is represented graphically in FIG. 4. Advantageously, a first detection of the points of interest of the current image is performed for a threshold τ1, for example τ1 = 20, giving a number N1 of points of interest. 
The next iteration is performed with a threshold τ2, giving a second number N2 of points of interest. Very advantageously, the threshold τ2 is greater than the threshold τ1, which makes it possible to restrict the second search to only those points of interest (N1 in number) already found by the first iteration. Preferably, the second threshold τ2 is offset from the first threshold τ1 by a constant value, for example τ2 = τ1 + 10, chosen to guarantee a robust and fast interpolation. [0012] Alternatively, the increment between τ1 and τ2 can be changed dynamically according to the estimated curvature of the characteristic, which will provide a marginal gain. From the two values N1 and N2 obtained, the parameters σ and C of the characteristic are easily determined: $\sigma = \dfrac{\sqrt{\tau_1} - \sqrt{\tau_2}}{\ln(N_2 / N_1)}$ and $C = N_1 \exp\left(\sqrt{\tau_1}/\sigma\right)$. The process can optionally be improved by carrying out a third search (or more) so as to determine the parameters σ and C of the characteristic with better precision, from more than two measurement points. The function N(τ) is thus parameterized for the current image. As a function of the ideal number $\hat{N}$ of points of interest that one wishes to obtain during the analysis of the following image, the corresponding ideal threshold $\hat{\tau}$ is given by: $\hat{\tau} = \left(\sigma \ln(C / \hat{N})\right)^2$. The result obtained over the successive images of the sequence captured by the camera is illustrated at B in Figure 3: as can be seen, the actual number N of points of interest detected remains extremely close to the ideal number, fixed here at $\hat{N}$ = 2000 points of interest, even for an extremely heterogeneous image sequence. This result is to be compared with the number N obtained with conventional techniques (curve A) for the same sequence of captured images, which could vary considerably, typically from less than 1000 to more than 8000 points of interest depending on the image. 
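The two-point determination of the parameters can be sketched as below. The function name `fit_model` and the example counts N1 = 3000 and N2 = 1500 are hypothetical illustration values, not figures from the described embodiment; only τ1 = 20, τ2 = τ1 + 10 and the target of 2000 points follow the text. The model assumed is the square-root exponential N(τ) = C exp(−√τ/σ).

```python
import math


def fit_model(tau1, n1, tau2, n2):
    """Fit C and sigma of N(tau) = C * exp(-sqrt(tau) / sigma) to two points.

    Requires tau2 > tau1 and n1 > n2 > 0, i.e. fewer detections at the
    higher threshold.
    """
    if not (tau2 > tau1 and n1 > n2 > 0):
        raise ValueError("expected tau2 > tau1 and n1 > n2 > 0")
    sigma = (math.sqrt(tau1) - math.sqrt(tau2)) / math.log(n2 / n1)
    c = n1 * math.exp(math.sqrt(tau1) / sigma)
    return c, sigma


# Hypothetical measurements on the current image.
c, sigma = fit_model(20, 3000, 30, 1500)
# Threshold predicted for a target of 2000 points on the next image.
tau_next = (sigma * math.log(c / 2000)) ** 2
```

With these example values the target of 2000 lies between the two measured counts, so the predicted threshold falls between the two measured thresholds. With more than two measurements, a least-squares fit of ln N against √τ would be a natural extension, although the text does not specify how additional points are combined.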
An improvement of the invention consists in making a prediction of the optimal detection threshold not only from one image to the next, but also, within the same image, across different, successively increasing resolutions of that image. For this purpose, the algorithm uses, as shown in FIG. 5, a multiresolution representation of the current image of the "image pyramid" type. The general principle of the image pyramid is, for example, set forth by: [5] G. Klein and D. W. Murray, "Parallel Tracking and Mapping for Small AR Workspaces," ISMAR, 2007. In this case, for example, resolutions are used with successive scales in a linear ratio of 1:2 each time, with:
- an initial level (full image) VGA (640 x 480);
- a level 1 (reduced, but fine resolution) QVGA (320 x 240);
- a level 2 (average resolution) QQVGA (160 x 120); and
- a level 3 (coarsest resolution) QQQVGA (80 x 60).
This example is of course not limiting, and it would be possible to start for example with a 720p HD resolution. The multiresolution representation makes it possible to take into account the erratic impact of the automatic exposure adjustment systems of the cameras used. Indeed, a modification of the exposure setting can easily cause the number of points of interest detected to double even though, visually, the two successive images are quite similar and correspond approximately to the same captured scene. This phenomenon introduces a significant dynamic effect to which FAST-type algorithms are particularly sensitive, insofar as they do not perform any normalization according to contrast. The pyramidal multiresolution representation makes it possible to adapt the prediction of the threshold by taking into account the fact that effects such as automatic exposure settings produce identical consequences at each level of the pyramid. 
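A four-level pyramid with a linear ratio of 1:2 between levels can be sketched in pure Python as follows. The text does not state how each level is derived from the previous one; averaging each 2 x 2 block, as done by the hypothetical `halve` helper below, is an assumption made for illustration (practical implementations commonly apply a smoothing filter before subsampling).

```python
def halve(image):
    """Downsample by 2 in each direction by averaging 2 x 2 blocks.

    `image` is a list of rows of brightness values with even dimensions.
    """
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) / 4
            for x in range(0, len(image[0]), 2)
        ]
        for y in range(0, len(image), 2)
    ]


def build_pyramid(image, levels=4):
    """Return [full resolution, 1/2, 1/4, ...] with `levels` entries."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(halve(pyramid[-1]))
    return pyramid
```

Starting from a 640 x 480 image, the four levels returned have sizes 640 x 480, 320 x 240, 160 x 120 and 80 x 60, from the finest to the coarsest.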
More precisely, the algorithm uses the properties of the image at a coarse scale as a predictor of these same properties at finer scales (inter-scale link), and the properties of the current image at a given scale as a predictor of the same properties in the next image, at the same scale (inter-image link). Variations at the coarse scale can thus be incorporated very quickly into the prediction for the subsequent levels of the pyramid. [0013] This sequence of steps is detailed in Figure 6: after acquisition of the image (block 30) and computation of the pyramid of scales (block 32), the algorithm calculates (block 34) the threshold to be applied at the maximum (coarsest) scale from the optimal threshold value previously determined, for this same scale, on the previous image (block 36). The detection of the points of interest is then performed (block 38) with this detection threshold value, making it possible to calculate the parameters C and σ of the exponential model and thus to determine the optimal threshold τ for the current image (block 40). [0014] These operations are repeated at the next lower scale (blocks 34', 38', 40') from i) the threshold value determined for this same level on the previous image (block 36) and ii) the optimum threshold value just determined (block 40) for the current image at the next higher scale. [0015] These calculations are repeated for the successive levels of the pyramid, down to the lowest level (blocks 34'', 38'', 40''), thus giving an optimal, regulated number of detected points of interest for each scale (block 42). In general terms, for image t at scale k, the threshold to be applied is sought as a function of the optimal threshold at this same scale k for the previous image t-1 and of the pairs {optimal threshold, applied threshold} of the current image at the higher scales q > k: $S_t^k = f\left(\hat{s}_{t-1}^k, \{(\hat{s}_t^q, S_t^q)\}_{q>k}\right)$ where: - the index t denotes the number of the image; - the exponent k denotes the scale; - $S_t^k$ denotes the threshold applied for the image of index t at scale k; and - $\hat{s}_t^k$ denotes the optimal threshold for the image of index t at scale k. For example, the function f can be (with the coefficients $A_{kq}$ to be determined): $S_t^k = \hat{s}_{t-1}^k + \sum_{q>k} A_{kq}\,(\hat{s}_t^q - S_t^q)$ or else: $S_t^k = \hat{s}_{t-1}^k \, \hat{s}_t^{k+1} / S_t^{k+1}$. Alternatively, one could use the adjustment parameters of the camera 10 (auto-exposure, white balance, etc.) for image t, denoted $P_t$: $S_t^k = f\left(\hat{s}_{t-1}^k, P_t\right)$. Figure 7 gives an example of characteristics N(τ) for the successive scales VGA ... QQQVGA (characteristics (a) to (d), respectively), the points indicating the values computed by the model and the circles indicating the real data, which shows an excellent correspondence between the model and reality for the four scales. Figure 8 illustrates, for a sequence of images, the variations of the number N of points of interest detected at the different resolutions VGA ... QQQVGA, under the following conditions: - in (a), for a constant threshold set at τ = 20, corresponding to a conventional FAST algorithm; - in (b), with inter-image prediction of the detection threshold (temporal prediction), according to the invention; - in (c), according to the same technique as in (b), but with the further application of a multiresolution representation and an inter-scale prediction (spatial prediction) as explained above with reference to FIG. 6. FIG. 9 is a similar representation showing: in (a): the variations of the number N of points of interest detected at the different resolutions VGA ... 
QQQVGA by a conventional technique with a constant threshold r = 25; in (b): these same variations in the number of points of interest, detected by a technique according to the invention that applies as the detection threshold the optimal value calculated for the preceding image; and in (c): the instantaneous, image-by-image variations of the corresponding threshold r.
[0016] Finally, to compensate for the problem of spatial instability mentioned in the introduction, the technique of the invention lends itself very well to bucketing, which consists in dividing the current image into a plurality of small independent sub-images and applying the point-of-interest detection algorithm independently to each sub-image. The principles of bucketing are described, for example, in:
[6] R. Voigt, J. Nikolic, C. Hurzeler, S. Weiss, L. Kneip, and R. Siegwart, "Robust Embedded Egomotion Estimation", 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2011, pp. 2694-2699.
[7] B. Kitt, A. Geiger, and H. Lategahn, "Visual Odometry Based on Stereo Image Sequences with RANSAC-Based Outlier Rejection Scheme," 2010 IEEE Intelligent Vehicles Symposium (IV), June 2010, pp. 486-492.
Thus, as illustrated in FIG. 10, instead of searching for, say, 2000 points of interest in the complete image 50 at VGA resolution, this image is divided into 6 x 4 sub-images 52 and 2000 / (6 x 4) ≈ 83 points of interest are searched for in each sub-image.
[0017] Likewise, instead of searching for 1000 points of interest in an image 54 at QVGA resolution, this QVGA image is divided into 3 x 2 sub-images 56 and 167 points of interest are searched for in each sub-image. The thresholds predicted and applied will differ for each sub-image and for each resolution (by applying the pyramid-of-scales technique explained above).
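The per-sub-image quotas quoted above (83 and 167) follow from a simple division, sketched below. The function names are hypothetical. The text gives only the grid sizes, so the way the 640-pixel width is split into 6 columns (integer boundaries that tile the image exactly) is an assumption.

```python
def per_bucket_target(total_points, cols, rows):
    """Number of points of interest to look for in each sub-image."""
    return round(total_points / (cols * rows))


def bucket_bounds(width, height, cols, rows):
    """Yield (x0, y0, x1, y1) rectangles that tile the image exactly."""
    for row in range(rows):
        for col in range(cols):
            yield (col * width // cols, row * height // rows,
                   (col + 1) * width // cols, (row + 1) * height // rows)


vga_target = per_bucket_target(2000, cols=6, rows=4)   # 2000 / 24 = 83.33...
qvga_target = per_bucket_target(1000, cols=3, rows=2)  # 1000 / 6 = 166.66...
vga_buckets = list(bucket_bounds(640, 480, cols=6, rows=4))
print(vga_target, qvga_target, len(vga_buckets), vga_buckets[0])
# 83 167 24 (0, 0, 106, 120)
```

Each sub-image then runs its own threshold prediction against its own quota, which is what spreads the detected points over the whole frame.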
In the inter-resolution relation, the result obtained at scale L-1 is used for the corresponding sub-images: for example, the four sub-images 52 framed in dashed lines at VGA resolution use the sub-image 56 at the top left of the QVGA scale for the calculation of their threshold. Note that integrating bucketing into the detection of points of interest with the algorithm of the invention entails no significant additional cost in computing time or computing resources. An optimized distribution of the points of interest over the different regions of the image is thus obtained at no extra cost, which solves the problem of spatial instability, mentioned in the introduction, encountered with conventional point-of-interest detection algorithms.
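The link between a sub-image and the coarser sub-image that covers it can be sketched as follows. This is an illustration under stated assumptions, not the method of the invention: the helper names and dictionary layout are hypothetical, the factor-of-two index mapping assumes the 6 x 4 and 3 x 2 grids of FIG. 10, and the update rule is one possible choice of f (the previous optimal threshold at this scale, multiplied by the ratio of optimal to applied threshold found at the coarser scale). The numeric values are invented for the example.

```python
def parent_bucket(col, row):
    """Index of the sub-image, one level coarser, that covers sub-image (col, row)."""
    return col // 2, row // 2


def predict_threshold(prev_optimal, parent_optimal, parent_applied):
    """Previous optimum at this scale, rescaled by the coarser-level correction ratio."""
    return prev_optimal * parent_optimal / parent_applied


# QVGA level (3 x 2 grid), current image: threshold applied and optimum then found.
qvga_applied = {(0, 0): 20.0}
qvga_optimal = {(0, 0): 30.0}  # illustrative: the optimum turned out 1.5x higher

# VGA level (6 x 4 grid), previous image: optimum found for the four top-left cells.
vga_prev_optimal = {(0, 0): 24.0, (1, 0): 26.0, (0, 1): 22.0, (1, 1): 28.0}

vga_to_apply = {}
for cell, prev in vga_prev_optimal.items():
    parent = parent_bucket(*cell)
    vga_to_apply[cell] = predict_threshold(
        prev, qvga_optimal[parent], qvga_applied[parent]
    )
print(vga_to_apply)
# {(0, 0): 36.0, (1, 0): 39.0, (0, 1): 33.0, (1, 1): 42.0}
```

In this sketch the correction measured once at the coarse level is propagated to all four finer sub-images before any detection is run on them.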
Claims:
Claims (13)
[0001] 1. A method for detecting points of interest in a digital image of an image sequence (12) of a scene captured by a camera (10), this method implementing a point-of-interest search algorithm (14) parameterized by a detection threshold (r) such that the number (N) of points of interest detected in the image varies according to the level of the threshold, this method being characterized by a modeling, by a decreasing exponential function, of the characteristic giving the number (N) of points of interest detected as a function of the threshold (r), this exponential function being dynamically parameterizable by values (C, σ) linked to an image to be analyzed, and in that it comprises the following steps: a) determining (18), for a current image, parameterization values (C, σ) of the decreasing exponential function; b) predicting (18), for said current image, an optimal value of the detection threshold (r) by using the modeled characteristic, parameterized with the values determined in step a); and c) applying (14), for at least one image following said current image, the point-of-interest search algorithm with the optimal detection threshold value (r) calculated in step b).
[0002] 2. The method of claim 1, wherein said optimal detection threshold value (r) predicted in step b) is a value corresponding to a given number (N) of points of interest, as indicated by said characteristic giving the number of points of interest detected as a function of the threshold.
[0003] 3. The method of claim 1, wherein said decreasing exponential function is an exponential function decreasing in square root: [equation missing from the source text] r being the value of the detection threshold, reduced to a quantity homogeneous with a pixel, N being the number of points of interest detected for a threshold r, and C and σ being said parameter values linked to the current image.
[0004] 4. The method of claim 1, wherein step a) of determining, for a current image, the parameterization values of the exponential function comprises the following substeps: a1) a first search for points of interest in the current image by the search algorithm (14) with a first predetermined value (r1) of the detection threshold, resulting in a first number of points of interest (N1); a2) at least one second search for points of interest in the current image by the search algorithm (14) with a second value (r2) of the detection threshold greater than the first predetermined value (r1) of the detection threshold, resulting in a second number of points of interest (N2); and a3) determining the parameterization values C and σ from the numbers of points of interest (N1, N2) obtained in steps a1) and a2).
[0005] 5. The method of claim 1, further comprising, during the iterative execution of steps a) and b): applying auto-exposure information from the camera as complementary input data for the prediction of the detection threshold.
[0006] 6. The method of claim 1, further comprising: dividing the current image (50; 54) into a plurality of sub-images (52; 56) of reduced size; executing steps a) and b) independently for the different sub-images of the current image, resulting in an optimal detection threshold value specific to each sub-image; and executing step c) by applying, for each sub-image of the following image, the point-of-interest search algorithm with the respective optimal detection threshold value specific to this sub-image.
[0007] 7. The method of claim 6, wherein the optimal detection threshold value is a value corresponding to the same predetermined number of points of interest for all sub-images.
[0008] 8. The method of claim 1, further comprising: prior to step a), producing (30, 32) a pyramid-type multiresolution representation of images, modeling the current image of the captured scene at different successively increasing resolutions (QQQVGA, QQVGA, QVGA, VGA); and iteratively executing steps a) and b) for each level of the multiresolution representation (34, 38, 40), starting with the lowest resolution level (QQQVGA), the detection threshold value determined for a given level being applied as an input for the prediction of the detection threshold at the next higher resolution level.
[0009] 9. The method of claim 8, further comprising: executing step c) by applying, for each level of the multiresolution representation of the following image, the point-of-interest search algorithm with the respective optimal detection threshold value specific to this level.
[0010] 10. The method of claim 8, further comprising, during the iterative execution of steps a) and b) for each level of the multiresolution representation: applying, as complementary input data for the prediction of the detection threshold at the higher resolution level, the optimal thresholds and the applied thresholds corresponding to the lower resolution levels.
[0011] 11. The method of claim 8, further comprising, during the iterative execution of steps a) and b) for each level of the multiresolution representation: applying, as complementary input data for the prediction of the detection threshold at the higher resolution level, auto-exposure information from the camera.
[0012] 12. The method of claim 8, further comprising: dividing the images at the different levels of the image pyramid (50; 54) into a plurality of sub-images (52; 56) of reduced size; executing steps a) and b) independently for the different sub-images, resulting in an optimal detection threshold value specific to each sub-image; and executing step c) by applying, for each sub-image of the following image, the point-of-interest search algorithm with the respective optimal detection threshold value specific to this sub-image.
[0013] 13. The method of claim 12, wherein the optimal detection threshold value is a value corresponding to the same predetermined number of points of interest for all sub-images.
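Substeps a1) to a3) and step b) can be illustrated with a short sketch. The exact model formula is not reproduced in this text, so the form used below, N(r) = C · exp(−√r / σ), is only an assumed reading of "exponential function decreasing in square root" with parameters C and σ. The function names and the synthetic data are hypothetical.

```python
import math


def fit_model(r1, n1, r2, n2):
    """Solve the assumed model N = C * exp(-sqrt(r) / sigma) for (C, sigma)
    from two (threshold, count) measurements with r2 > r1 and n1 > n2 > 0."""
    if not (r2 > r1 and n1 > n2 > 0):
        raise ValueError("need r2 > r1 and n1 > n2 > 0")
    sigma = (math.sqrt(r2) - math.sqrt(r1)) / math.log(n1 / n2)
    c = n1 * math.exp(math.sqrt(r1) / sigma)
    return c, sigma


def threshold_for(target, c, sigma):
    """Invert the assumed model: threshold expected to yield `target` points."""
    if target >= c:
        return 0.0  # the model never predicts more than C points
    return (sigma * math.log(c / target)) ** 2


def synthetic_count(r, c=20000.0, sigma=1.5):
    """Stand-in for a detector run: counts generated from known parameters."""
    return c * math.exp(-math.sqrt(r) / sigma)


# a1), a2): two detections at thresholds r1 < r2; a3): recover C and sigma.
c, sigma = fit_model(16.0, synthetic_count(16.0), 36.0, synthetic_count(36.0))
# b): threshold predicted to give 2000 points of interest.
r_opt = threshold_for(2000.0, c, sigma)
```

With real data the two counts would come from two runs of the detector on the current image, and the predicted threshold would then be applied to the following image as in step c).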
Patent family:
Publication number | Publication date
JP2015201209A | 2015-11-12
US9251418B2 | 2016-02-02
US20150294152A1 | 2015-10-15
EP2930659A1 | 2015-10-14
FR3019924B1 | 2016-05-06
CN104978738A | 2015-10-14
EP2930659B1 | 2016-12-21
Cited documents:
Publication number | Filing date | Publication date | Applicant | Title
US20130279813A1 | 2012-04-24 | 2013-10-24 | Andrew Llc | Adaptive interest rate control for visual search
FR2951565B1 | 2009-10-20 | 2016-01-01 | Total Immersion | METHOD, COMPUTER PROGRAM AND REAL-TIME OBJECT REPRESENTATION HYBRID TRACKING DEVICE IN IMAGE SEQUENCE
US8873865B2 | 2011-10-10 | 2014-10-28 | Qualcomm Incorporated | Algorithm for FAST corner detection
JP5993233B2 | 2012-07-11 | 2016-09-14 | オリンパス株式会社 | Image processing apparatus and image processing method
CN103218825B | 2013-03-15 | 2015-07-08 | 华中科技大学 | Quick detection method of spatio-temporal interest points with invariable scale
US9524432B1 | 2014-06-24 | 2016-12-20 | A9.Com, Inc. | Fast interest point extraction for augmented reality
CN106815800B | 2015-12-02 | 2020-12-08 | 谷歌有限责任公司 | Method and apparatus for controlling spatial resolution in a computer system
IL243846A | 2016-01-28 | 2020-11-30 | Israel Aerospace Ind Ltd | Systems and methods for detecting imaged clouds
US9936187B2 | 2016-05-18 | 2018-04-03 | Siemens Healthcare Gmbh | Multi-resolution lightfield rendering using image pyramids
CN109598268B | 2018-11-23 | 2021-08-17 | 安徽大学 | RGB-D significant target detection method based on single-stream deep network
Legal status:
2015-04-24 | PLFP | Fee payment | Year of fee payment: 2
2016-04-22 | PLFP | Fee payment | Year of fee payment: 3
2016-11-11 | TP | Transmission of property | Owner name: PARROT DRONES, FR; Effective date: 20161010
2017-04-18 | PLFP | Fee payment | Year of fee payment: 4
Priority:
Application number | Publication number | Priority date | Filing date | Title
FR1453161A | FR3019924B1 | 2014-04-09 | 2014-04-09 | METHOD FOR DETECTING POINTS OF INTEREST IN A DIGITAL IMAGE
US14/661,794 | US9251418B2 | 2014-04-09 | 2015-03-18 | Method of detection of points of interest in a digital image
EP15162364.2A | EP2930659B1 | 2014-04-09 | 2015-04-02 | Method for detecting points of interest in a digital image
CN201510163682.2A | CN104978738A | 2014-04-09 | 2015-04-08 | Method of detection of points of interest in digital image
JP2015079671A | JP2015201209A | 2014-04-09 | 2015-04-09 | Method for detecting points of interest in digital image