Detection and measurement system for fish and mobile objects underwater
Patent abstract:
System for detecting and measuring fish and mobile objects under water, comprising a 3D scanner (1) for measuring the distance of the points of a scene (6) in an aquatic environment (7) by means of light sensors; a data processing unit (5) configured to generate information on the position and size of the mobile objects detected in the aquatic environment (7) from the signals received by the 3D scanner (1); and a user control unit (4) for the representation on a screen (8) of the position and size data of the detected objects. It is a system that is non-invasive for the species, in which the detected fish are correctly followed until they are no longer visible, separated when they overlap, and measured correctly.

Publication number: ES2649056A1
Application number: ES201600585
Filing date: 2016-07-07
Publication date: 2018-01-09
Inventors: Juan Ramón Rabuñal Dopico; Jerónimo PUERTAS AGUDO; Alvaro Rodriguez Tajes; Ángel Jose RICO DIAZ; André CONDE VÁZQUEZ; Adrián PALLAS FERNÁNDEZ
Applicant: Universidade da Coruna
Patent description:
SYSTEM FOR DETECTING AND MEASURING FISH AND MOBILE OBJECTS UNDER THE WATER

DESCRIPTION

Field of the invention

The present invention falls within the field of underwater object detection systems, especially for fish, individually and in real time.

Background of the invention

Traditionally, the detection of underwater objects, and specifically the detection and quantification of fish, is done by sonar or camera. Sonar methods are normally designed to be part of a boat or underwater vehicle and are focused on detecting schools of fish for fishing. In JP2006003159, a sonar method is also used to detect the passage of fish in a section of a river prepared for measurement. On the other hand, methods based on the use of photographic cameras are very invasive, since they force the fish to pass through different compartments or conduits. For example, in patent document WO2012038415 fish are extracted from a tank through a conduit that directs them to a canalization where the camera is located, along with other compartments, to finally direct the fish to a new tank.

However, current methods for detecting objects and fish circulating under water do not allow their measurement and monitoring in real time, nor the detection and monitoring of fish when they overlap each other. The present invention solves this problem.

Description of the invention

The invention relates to a system for detecting and measuring fish and objects underwater. The system includes:

- A 3D scanner for measuring, by means of light sensors, the distance of the points of a scene in an aquatic environment.
- A data processing unit configured to generate information on the position and size of mobile objects, preferably fish, detected in the aquatic environment from the signals received by the 3D scanner.
- A user control unit for the representation on a screen of the position and size data of the detected objects.

The 3D scanner can be a structured light scanner or a time-of-flight scanner.

The data processing unit may be configured to determine the presence or absence of fish in the objects detected in the scene, and the concentration and size of the fish.

In a preferred embodiment the aquatic environment captured in the 3D scanner scene is located inside a channel with transparent walls, the 3D scanner being positioned outside said channel at a certain distance. The data processing unit is configured to obtain the actual distance to the objects inside the channel by correcting the measurements made by the 3D scanner based on that distance.

For the measurement of the size and position of an object the data processing unit is preferably configured to:

- Extract the contour of the object using segmentation techniques from a depth image obtained from the 3D scanner.
- Obtain an ellipse that minimizes the distance to the contour points.
- Obtain the end points of the two axes of said ellipse.
- Determine the contour points closest to the end points of the ellipse axes, obtaining the end points of the object.

The data processing unit is configured to detect the size of an overlapping object by:

- Obtaining the continuous contours of objects in a depth image of the scanner.
- Extraction of individual silhouettes corresponding to said continuous contours.
- Combination of individual silhouettes to obtain a combined silhouette.
- Obtaining the contour of the combined silhouette by applying the convex hull.
- Obtaining the ellipse that minimizes the distance to the points of the convex hull.
- Determining the contour points of the combined silhouette closest to the end points of the axes of the ellipse, obtaining the end points of the object.

The data processing unit may be configured to classify the detected objects as fish by means of a support vector machine. The characteristics used to perform the classification include the area of the convex hull, the area of the silhouette, the length of the silhouette in real space and the width of the silhouette in real space.

The data processing unit may preferably be configured to track objects in the scene across consecutive images captured by the 3D scanner, taking into account the characteristics and position of the silhouettes of the objects in the different images.

The present invention can be used to monitor vertical slot fishways, one of the most used structures for overcoming obstacles such as dams, hydroelectric plants and others. Knowing the frequency with which fish cross these types of structures can help to assess their effectiveness, as well as to study the migratory characteristics of the species, determine whether the river course is healthy, or determine whether fishing is possible with guarantees of conservation and wildlife improvement.

Brief description of the drawings

A series of drawings that help to better understand the invention, and that expressly relate to an embodiment of said invention presented as a non-limiting example, is described briefly below.

Figure 1 represents an example of assembly of the system components.

Figures 2A, 2B and 2C show the steps of extracting the end points of the object in the measurement process.

Figures 3A and 3B represent, respectively, the measurement of the end points of the object and the measurement of the object.

Figures 4A-4D represent, for an overlap of two fish, the depth of the region after histogram reduction (Figure 4A), the edges detected by Canny (Figure 4B), the joined edges (Figure 4C), and the extracted silhouettes (Figure 4D).

Figures 5A and 5B show how the shape of the new combined silhouette is obtained when there is overlap: contour of the combination (Figure 5A) and points selected for measurement (Figure 5B).

Figure 6 illustrates a diagram of use cases.

Figure 7 shows a sequence diagram of the main loop.

Detailed description of the invention

Figure 1 shows an assembly of the real-time detection and measurement system for fish and objects underwater according to a possible embodiment of the present invention. The system comprises a 3D sensor or scanner 1, preferably laser, responsible for measuring the distance of the points of a scene 6 in an aquatic environment by means of light sensors. The 3D scanner 1 is positioned outside a channel 3 with transparent walls (e.g. glass), inside which the aquatic environment 7 is located. The 3D scanner 1 could also be placed inside the aquatic environment 7, inside a sealed box, at a minimum distance of 40 centimeters of free space from the place where the detection is to be carried out.

As shown in the example of Figure 1, the scene 6 captured by the 3D scanner 1 encompasses a section of the glass channel 3 of the aquatic environment 7 where the system performs the detections and measurements. As the aquatic environment 7 under normal conditions has low brightness, the 3D scanner preferably uses infrared light for the detection of objects, thus avoiding the use of additional light sources.
The system also comprises a data processing unit 5 responsible for analyzing the signals received by the 3D scanner 1 and obtaining, from said analysis, information on the position and size of the objects detected underwater. Finally, a user control unit 4 is responsible for displaying on a screen 8 the position and size data of the detected objects. The user control unit has an interface from which a user can configure the measurement system. The data processing unit 5 and the user control unit 4 together can be implemented, for example, in a computer.

The 3D scanner 1 is located at a certain distance D from the glass of channel 3. The distance D can be modified and configured through the interface of the user control unit 4, so that a correction of the measurements due to the effects of water refraction can be made. The section of channel 3 selected for the placement of the 3D scanner 1 corresponds to a place where the fish 2 pass close to the sensor or 3D scanner 1, at a maximum distance determined by the type of sensor selected.

The 3D scanner 1 used can be a structured light scanner or a time-of-flight scanner. In a possible embodiment the 3D scanner 1 is located approximately 65 cm from the walls of the channel 3. In that position, the structured light scanner can detect fish 2 and moving objects at a maximum distance of about 40 cm from the glass, while the time-of-flight scanner can detect objects at a greater distance, about 80 cm from the walls of channel 3. If these maximum distance restrictions are met, all fish can be detected and measured correctly by the 3D scanner 1, which transfers the data to the processing unit 5, in turn connected to the user control unit 4.

The detection and measurement system of the present invention works in real time, without delays, at a minimum of 10 images per second. The system is capable of performing the following functions:

- Detection of mobile objects within channel 3.
- Measurement of the dimensions of the detected objects.
- Detection or identification of fish 2 visible in scene 6 when they are not overlapping.
- Simple tracking of fish 2 (without overlaps).
- Detection or identification of fish 2 when they are overlapping (extraction of all overlapping fish).
- Full tracking of fish 2 (with overlaps).

The 3D scanner 1, whether structured light or time-of-flight, obtains a point cloud with the depth of the scene using light sensors. The 3D structured light scanner measures the distance of each of the points in scene 6 through the projection of an infrared light pattern that is captured by a camera, for which it uses an infrared emitter and an infrared camera. Additionally, it has an RGB camera that is used to fine-tune the detection when the light conditions allow it, as well as to show a clearer image of the detection to the end user.

The 3D time-of-flight scanner uses a different method to calculate the distances of each of the points in scene 6. In this case, the distance of the points in scene 6 to the sensor is determined by emitting light pulses and measuring their round-trip time: a laser diode emits a pulse of light and the time that passes until the reflected light is seen by a detector is timed. This type of 3D scanner has better resolution and quality of the captured images.

Next, the process of detecting and measuring fish and moving objects underwater with each of the two 3D scanners is described in detail.
The structured light scanner uses 3 types of frames or images, a color image, an infrared image and a depth image, at a frequency of 30 images per second. These three types of images cannot be obtained simultaneously; instead, there are two working modes: the night mode, in which the depth image and the infrared image are acquired, and the normal mode, in which the color image and the depth image are acquired.

The combination of the infrared image and the depth image is trivial, since the depth image is calculated directly from the infrared image and each pixel I(N, M) of the infrared image represents the same point as its homologous pixel P(N, M) in the depth image, so, by overlapping both images, operations can be performed with them or a new two-channel image can be obtained.

However, the combination of the color image and the depth image cannot be performed directly, since the infrared camera is displaced with respect to the color camera, so this combination has to be made with a function that associates the position of a pixel and its value in the depth (distance) image with its position in the color image. The 3D scanner provides a function that, from both images, generates an image where both are matched.

The 3D structured light scanner 1 is calibrated before starting the detection process. Taking into account that the sensor underestimates the real distance to the object when there is water in between, a function is considered for the calculation of the real distance to a point P1, knowing the distance to the examined section of channel 3 and the measurement made by the sensor. This function is useful for making corrections, even if it is actually an approximation:

Dist(Sensor, P1) = Dist(Sensor, Glass) + [Measurement(P1) - Dist(Sensor, Glass)] × 1.35

The constant obtained from the measurements for the correction of the distance traveled by the light inside the water is very similar to the refractive index of water, so it is understood that the reason the sensor underestimates the distance is the refraction of the water.
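By way of illustration, this correction can be written as a small helper function (a minimal sketch; the function name, argument names and millimeter units are illustrative assumptions, not part of the original design):

```python
def corrected_distance(measured_mm, glass_dist_mm, k=1.35):
    """Refraction correction for a point measured through the channel glass.

    measured_mm   -- raw distance reported by the sensor for the point
    glass_dist_mm -- known distance from the sensor to the channel glass
    k             -- empirical constant, close to the refractive index of water
    """
    return glass_dist_mm + (measured_mm - glass_dist_mm) * k

print(corrected_distance(900.0, 650.0))  # 650 + 250 * 1.35 = 987.5
```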
In the example considered in Figure 1, the solution was tuned by observing the degree of visibility of the fish in the images obtained, reaching the following conclusions:

- Optimal distance to the channel glass: 60 to 70 centimeters.
- Optimal angle (Y axis) of 0°. If it is necessary to turn the sensor slightly because of direct reflections, the rotation must be clockwise.
- It is important to ensure that there is no direct reflection of the infrared beam of light on the lens of the infrared camera. This can be checked by visualizing the infrared image, and is solved by turning the sensor slightly.
- If the reflection of the glass disturbs the depth image excessively, it is possible to place the sensor in a position slightly higher than the section of channel 3. This does not disturb the measurement of the fish and eliminates the aforementioned reflection, although it has the disadvantage that only the lower half of the image is used for detection.

To measure objects with the 3D structured light scanner it is not enough to obtain the distance to the object, since it is also necessary to know its position on the X and Y axes. The sensor provides a function that, from a point in the depth image, returns its position in the real world in Cartesian coordinates with respect to the sensor.

To carry out the automatic measurement of an object, its end points must be obtained automatically. Taking into account that the object has not yet been detected (later, the same procedure is applied to detected objects), the fish mask (an image where the pixels corresponding to the fish are lit and the rest are not) is extracted manually from the depth image, and from it the automatic calculation of the end points is performed.

The method consists of extracting the contour 9 of the object (Figure 2A) from the mask (for example, using the "findContours" function of the OpenCV artificial vision library) and looking for the ellipse 10 (Figure 2B) that minimizes the distance to the points of the obtained contour 9 (for example, by means of the "fitEllipse" function of OpenCV), after which the four end points 11 of the two axes of the ellipse 10 are obtained, as shown in Figure 2C.

Subsequently, the end points 13 of the object (Figure 3A) and their position in real space are obtained. For this, the contour points 9 closest to the end points 11 of the axes of the ellipse are searched for and assigned as the end points 13 of the object. The "findContours" function of OpenCV returns contours belonging to the object, so, by obtaining the value of the end points 13 of the object in the depth image, the distances of these four points 13 to the plane of the sensor are obtained. With the distance of the end points 13 to the sensor and their position in the depth image, the coordinates of the points with respect to the sensor in real space are obtained, using for example the "NuiImageTransform" function, and with them the measurement of the object (Figure 3B).

The correction reduces the error in the measurements, so it is advisable to use it. For this reason, the user is given the possibility of entering into the system, through the interface of the user control unit 4, the distance at which channel 3 is located with respect to the 3D scanner 1, so that the correction can be made (if this information is not available, a standard measurement is made, without corrections).
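A minimal sketch of these measurement steps, assuming the OpenCV 4 Python bindings (the function `measure_object` and its binary-mask input are illustrative, and the axis-direction computation follows OpenCV's ellipse angle convention only approximately):

```python
import cv2
import numpy as np

def measure_object(mask):
    """Return the four end points of the object contained in a binary mask."""
    # Contour of the object (Figure 2A).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    contour = max(contours, key=cv2.contourArea)  # needs at least 5 points

    # Ellipse that minimizes the distance to the contour points (Figure 2B).
    (cx, cy), (w, h), angle = cv2.fitEllipse(contour)
    t = np.deg2rad(angle)

    # Four end points of the two ellipse axes (Figure 2C).
    axis_ends = []
    for half, a in ((w / 2.0, t), (h / 2.0, t + np.pi / 2.0)):
        dx, dy = np.cos(a), np.sin(a)
        axis_ends += [(cx + half * dx, cy + half * dy),
                      (cx - half * dx, cy - half * dy)]

    # Contour points nearest to the axis ends: the end points of the object.
    pts = contour.reshape(-1, 2).astype(float)
    return [tuple(pts[np.argmin(np.hypot(pts[:, 0] - ex, pts[:, 1] - ey))])
            for ex, ey in axis_ends]
```

The real-space coordinates of these four points are then obtained from the depth image and the sensor's coordinate transform, as described above.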
Once the study of the data provided by the 3D structured light scanner 1 and its calibration have been performed, a background subtraction must be carried out in order to concentrate resources on the regions where there is movement, which are those that do not belong to the background and where the objects of the scene are located. Background subtraction is a discipline widely known in the field of artificial vision and is used as preprocessing in numerous object detection problems. This discipline is based on filtering out those areas of the scene that belong to the background. This method makes it possible to discard much of the information, obtain the shapes of the objects in the scene and focus the detection resources on the areas where they are located. The main problem is obtaining a mask in which only the pixels belonging to the objects of the scene are lit, using the background and an image of the scene as input. In certain situations, in which the background cannot be passed as input to the algorithm, there is the problem of background learning, common in approaches where there is little control over the scene, the scene is dynamic, or the sensor changes its position. Once the background is learned, the next step is to generate a mask that represents the movement.

The procedure used is well known, and consists of subtracting the background from the image and keeping those pixels that exceed a certain threshold (in one embodiment it has been set to 4). Due to the characteristics of the depth image, some restrictions have been added to the algorithm:

- If the background is 0 and the image is different from 0, there is movement.
- If the image is 0, there is never movement.
- There is movement if the image is closer than the background, and not vice versa.

The application of high thresholds (greater than 8) considerably reduces the noise in the mask, but produces errors within the figure of the fish. On the contrary, low thresholds produce a large amount of noise in the environment and do not guarantee total success with the fish. The approach that works best is the introduction of the infrared image into the MOG2 algorithm with a threshold of 10, which eliminates the noise but leaves the silhouettes of the fish damaged, followed by the application of a morphological closing operator with a 10x10 elliptic kernel for the recomposition of the fish (sketched below).

Background subtraction is performed on both images; a combination of them is not carried out, at least in this phase. The depth image mask is extremely reliable when the sensor manages to see the fish. On the other hand, the subtraction in the infrared image is less precise, but whenever the fish is in the scene it is able to see it. Therefore, the depth mask is used in the subsequent detection phases, although the mask obtained by subtraction in the infrared image can be very useful in tracking (in order to maintain the identity of a fish if it is invisible to the depth image for several consecutive images).
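A minimal sketch of this infrared background subtraction and silhouette recomposition, assuming the OpenCV Python bindings (variable and function names are illustrative):

```python
import cv2

# MOG2 background subtractor with the threshold used in the text.
subtractor = cv2.createBackgroundSubtractorMOG2(varThreshold=10)

# 10x10 elliptic kernel for the morphological closing that recomposes the fish.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (10, 10))

def motion_mask(infrared_image):
    """Movement mask of one infrared frame, cleaned by morphological closing."""
    mask = subtractor.apply(infrared_image)
    return cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
```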
Then the segmentation process is carried out. For this, it is necessary to separate each of the movement regions into smaller components, called silhouettes, which can be whole fish or parts of fish occluded by objects or by other fish. For this task, attention is focused on each of the movement regions.

The first step of the segmentation process is to obtain all the regions of motion and filter out those that are too small. To do this, the "findContours" function of OpenCV (which obtains the contours of a binary image) can be applied to the input mask, filtering out contours that do not exceed a minimum area of 100 pixels. Subsequently, for each of the movement regions, the minimum frame containing each of the contours is obtained. From this frame, a depth image of the region, cut from the original depth image, is obtained, and a blob of the region is generated, which is a mask of the size of the frame in which the area inside the contour is marked (without remnants of other regions that may be included in the frame). For each of the movement regions, its blob and its depth image are kept, which is sufficient if only one object is found in the region.

If there is more than one object in the movement region, the objects inside it must be separated. For this task the Canny edge detector is used on the depth image, taking into account that the margin of error of the depth image for values other than 0 is quite low.

A problem encountered when using Canny on the depth image of the region is that in OpenCV, Canny is only implemented for 8-bit images, so it is necessary to reduce the histogram to go from 16 bits to 8 bits (Figure 4A). The loss of information is evident, so this reduction is applied only to perform the edge detection, after which work continues on the 16-bit image. However, to perform a correct edge detection, the histogram reduction is carried out so that the minimum of information is lost: the values of the depth image of the region that do not belong to the blob are set to 0, and the maximum and the minimum (other than 0) of the image are obtained. If the difference between the maximum and the minimum is less than 255, the maximum is set to minimum + 255, and then the multiplier used for the reduction is calculated: mult = (max - min) / 255. If this process is not performed, the value of the resulting pixels in the 8-bit image depends on the heterogeneity of the 16-bit image, the sense of the data is not maintained, and no threshold can be set for Canny.

After performing the histogram reduction, Canny is applied (Figure 4B) with a 3x3 Sobel and with thresholds (15 × multiplier) and (30 × multiplier). The result of the process is a mask with the edges of the 8-bit image. It is important to note that the histogram reduction sets the minimum values of the image to 0, so if these are located on the edge of the region, Canny cannot distinguish them from those that do not belong to it; for this reason the edge of the blob of the region is added to the edge mask. If this is not done, in certain corners the values of the internal edges cannot reach the outer edge, and the silhouette is not cut.

To avoid possible loose edges in the corners, which is quite common in this type of problem, a union of loose edges is performed (Figure 4C) to obtain continuous contours (15a, 15b, 15c) of the objects. For this, a 3x3 window is used on the edge mask, looking for those points at which the edge point is connected to the edges of the window by a single path, or by two paths that are adjacent to each other. This is intended to find those edges that have not been joined to others, possibly because they were left loose in a corner when Canny decided to join the other two edges whose direction was the same.

The next step consists in extracting the interior contours of the closed edge image and filtering out those that do not exceed an area of 100 pixels. Each of the contours obtained represents an individual silhouette (16a, 16b, 16c), as shown in Figure 4D. Each of the individual silhouettes (16a, 16b, 16c) is stored using a process similar to that performed when filtering regions: the minimum frame containing the silhouette is obtained to make the cut, from which the blob of the silhouette, filtered with the initial mask, and a 16-bit depth image, filtered with the blob, are generated. In addition, the contours of the silhouette and its frame with respect to the entire image are also stored.
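The histogram reduction and edge detection just described might be sketched as follows (a minimal sketch assuming the OpenCV Python bindings; the function and argument names are illustrative, the threshold scaling follows the values given above, and the addition of the blob edge is omitted for brevity):

```python
import cv2
import numpy as np

def region_edges(depth16, blob):
    """16-bit to 8-bit histogram reduction followed by Canny edge detection."""
    depth = depth16.copy()
    depth[blob == 0] = 0                       # keep only pixels of the blob
    nonzero = depth[depth > 0]
    if nonzero.size == 0:
        return np.zeros(depth.shape, np.uint8)

    lo, hi = int(nonzero.min()), int(nonzero.max())
    if hi - lo < 255:                          # guarantee a usable range
        hi = lo + 255
    mult = (hi - lo) / 255.0                   # reduction multiplier

    reduced = np.zeros(depth.shape, np.uint8)
    valid = depth > 0
    reduced[valid] = ((depth[valid] - lo) / mult).clip(0, 255).astype(np.uint8)

    # Canny with a 3x3 Sobel aperture and thresholds scaled by the multiplier.
    return cv2.Canny(reduced, 15 * mult, 30 * mult, apertureSize=3)
```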
Once the segmentation is finished, the object detection process begins. If no additional filter is applied, there is a possibility that some foreign object inside the channel, or a moving part of the background, generates a false positive. To prevent this from happening, a detection process is carried out based on obtaining key characteristics of the objects, their classification, and the subsequent reconstruction of those that are occluded.

A first filter uses a classifier that discerns whether or not an object is a fish. The classifier used is a support vector machine (SVM), a supervised machine learning algorithm that, from a series of characteristics obtained from positive and negative examples, learns a mathematical function capable of discerning whether a new object belongs to the positive or negative group. When making the classifications, false positives must be completely eliminated, so the classifier has to be demanding when determining whether an object is a fish; since the sensor sometimes renders parts of the fish invisible, it is preferable for the classifier not to classify these as fish rather than to produce false positives. The main reason for this approach is that the detection phase has been linked to the tracking phase, so once a fish is detected it is possible to follow its silhouette even if it appears cut and is no longer classified as a fish.

The characteristics used to perform the classification are the area of the convex hull, the area of the silhouette, the length of the silhouette in real space and the width of the silhouette in real space.
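A minimal sketch of this classification step, assuming the OpenCV `cv2.ml` Python bindings (the kernel choice, the training rows and the helper names are illustrative assumptions; the patent does not specify them):

```python
import cv2
import numpy as np

def fish_features(silhouette_contour, real_length, real_width):
    """The four classification features described above."""
    hull = cv2.convexHull(silhouette_contour)
    return np.array([[cv2.contourArea(hull),               # area of the convex hull
                      cv2.contourArea(silhouette_contour), # area of the silhouette
                      real_length,                         # length in real space
                      real_width]], dtype=np.float32)      # width in real space

# Hypothetical training rows: [hull area, silhouette area, length, width],
# labelled +1 for fish examples and -1 for other objects.
train_samples = np.array([[5200, 4800, 210, 60],
                          [5900, 5300, 230, 65],
                          [400, 350, 40, 15],
                          [900, 820, 60, 22]], dtype=np.float32)
train_labels = np.array([1, 1, -1, -1], dtype=np.int32)

svm = cv2.ml.SVM_create()
svm.setType(cv2.ml.SVM_C_SVC)
svm.setKernel(cv2.ml.SVM_LINEAR)               # kernel choice is an assumption
svm.train(train_samples, cv2.ml.ROW_SAMPLE, train_labels)

# Classify a hypothetical silhouette (a plain rectangle here).
contour = np.array([[[0, 0]], [[100, 0]], [[100, 40]], [[0, 40]]], dtype=np.int32)
_, result = svm.predict(fish_features(contour, 210.0, 60.0))
is_fish = result[0, 0] > 0
```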
The classification process obtains good results in cases where the fish are visible and there are no fish crossings. If there is an overlap between fish, the fish in the part closest to the glass of channel 3 is obtained cleanly, and the pieces of the silhouettes of the occluded fish are reconstructed. The reconstruction process is carried out at the movement-region level, so reconstruction is only performed within each of the movement regions (fish that have occluded each other); any occlusion produced by an object belonging to the background is not solved in this process.

The first step of the combination is the establishment of a new frame for the combined silhouette, a frame that encompasses the frames of each of the individual silhouettes. After this, the contour 17 of the combined silhouette is obtained (Figure 5A) from the combination of the contours of the individual silhouettes stored in the new silhouette. Subsequently, the convex hull of the combination, which is the convex hull of both contours, is obtained. Next, measurements of the combined silhouette are taken: the ellipse 18 that minimizes the error with respect to both contours 17 is obtained, together with its end points (Figure 5B). If one simply looked for the point of both contours closest to the ends of ellipse 18, mistakes could be made, so an approximation is made by obtaining only the depth of the point closest to each of the ends, using the coordinates of the end to calculate the position in space (as if an object with that depth were found at that point).

Finally, a method of tracking the detected objects is implemented, based on pairing the objects detected in image N with those closest to them in the following image (N + 1). The advantage of knowing the position in the real world is exploited, giving priority to the objects closest to the camera when pairing. For each detection, a fish signature with the position, height, width and four end points 13 of the silhouette is saved, and a random color and a counter initialized to a certain value, for example 5, are assigned. With the arrival of the detections of the second image (N + 1), the system orders the list of stored fish signatures, putting at the beginning those that are in a position closest to the camera (smallest z value).

After ordering, all the detections of the second image are iterated over for each signature, looking for the one that minimizes the sum of the distances of its end points 13. It is then verified that the selected silhouette satisfies the restriction that the sum of the distances from its end points to the end points of the signature is less than the sum of the length and width of the fish. If this restriction is met, the new detection is assigned to the signature and the signature is updated with the new position, size and end point values. After assigning the signature, or if the restrictions are not met, the process moves on to the next signature and is repeated without using the detections already assigned. Each time a signature is updated its counter is reset (e.g. to 5), and for every image that passes without a signature being referenced its counter decreases by 1. If it reaches 0, the signature is removed from the system. Additionally, a counter is kept of the number of times the signature has been updated, and the signature is not shown on the screen unless this counter exceeds a certain number, for example 3. This serves to avoid false positives in isolated images with a bad measurement, or when the background subtraction generates artifacts that may look like fish and the detector is not able to filter them out. The chances of this happening are extremely low, but with this method the probability of error decreases dramatically.

This approach, together with the very demanding classifier, means that tracking is only possible while the object is detected. Given the sensor's vision problems in the water, it is very possible that in some images the fish is not correctly visible. To solve this issue, an approach was implemented that allows the classification process to be skipped for certain silhouettes that have characteristics similar to silhouettes detected in previous images. For this, when detecting in image N + 1, the silhouettes detected in image N are introduced, and if there is a silhouette inside the detector that, although not detected, satisfies the restriction that the sum of the distances from its end points to those of some previous silhouette is less than the sum of the length and width of the previous silhouette, it leaves the detection process as if it were an object classified as a fish, but with a mark indicating that it has not been detected. If it is not matched to any existing signature at the end of the tracking process (because, for example, a closer silhouette has claimed it), the silhouette is discarded instead of creating a new signature, as would be done with a detection.

Tracking also provides the advantage of being able to take several measurements of the fish, which is very useful since the fish cannot be measured correctly in all the images: it is not always completely visible, or it is at an angle at which the camera cannot see the full extent of its body. For this reason, a simple estimate of the measurements is made, which consists of keeping an ordered vector with the last measurements of each detected fish (for example, the last 10 or 20 measurements). From these measurements, the length and width are approximated by obtaining the third quartile of their vectors (according to the observations, there are usually 50% correct measurements and 10% that overestimate). Once the vector reaches 20 measurements, it is pruned by eliminating the odd-positioned values to return to 10 measurements. This reduces the variation of the measurements and, when the silhouette appears cut, the measurement from the complete tracking is displayed, which gives the user more reliable information.
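A minimal sketch of this estimation scheme (the function names and the sample history are illustrative):

```python
import numpy as np

def estimate_size(history):
    """Third-quartile estimate over the stored measurements of one fish."""
    return float(np.percentile(history, 75))

def prune(history):
    """Keep every other measurement once the vector reaches 20 entries."""
    return history[::2] if len(history) >= 20 else history

# Hypothetical length measurements (mm) for one tracked fish: mostly correct
# values, one underestimate from a cut silhouette and one overestimate.
history = [198, 202, 205, 140, 210, 201, 260, 199, 204, 203]
print(estimate_size(history))   # a robust length estimate despite the outliers
```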
Next, the process of detecting and measuring fish and moving objects underwater with the 3D time-of-flight scanner is described.

Regarding the measurement of objects with the 3D time-of-flight scanner, the color image consists, as in the structured light scanner, of a matrix of BGRA pixels obtained at a frequency of 30 images per second, with the difference that the resolution of this image is 1920x1080. The infrared image of this sensor is an array of pixels representing the reflection of infrared light in the scene, served at a frequency of 30 images per second, but this image has completely different characteristics from that of the structured light scanner. Although the resolution (matrix size) is lower (512x424), the color depth is 16 bits, with which 256 times more intensities can be encoded than in the previous one. In this case the values occupy a low range, so the image from the sensor is very dark, making it difficult to visualize with the human eye. At the time of detection this has no influence, but if the image is to be shown to the user it must be processed in some way to facilitate its visualization. The method used for visualization is gamma correction, which maps the values of the image to other values that stretch a part of the histogram, resulting in certain cases in a much more visible image. The correction used is as follows:

Pout = A × Pin^γ

A parameter A of 255 is used to match the 8-bit coding (0-255), and a γ of 0.3, which stretches the dark region of the histogram. Taking into account that this operation has to be performed for each of the pixels 30 times per second, a previously generated table can be used for each of the 256 × 256 = 65536 values that the image can take, so that this process is performed much more efficiently.
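This gamma correction lends itself to a precomputed lookup table (a minimal sketch; normalizing the 16-bit input to [0, 1] before applying the power is an assumption):

```python
import numpy as np

A, gamma = 255.0, 0.3

# One table entry per possible 16-bit intensity (256 x 256 = 65536 values).
values = np.arange(65536, dtype=np.float32) / 65535.0   # assumed normalization
lut = (A * values ** gamma).astype(np.uint8)

def visualize(infrared16):
    """Map a 16-bit infrared image to a displayable 8-bit image."""
    return lut[infrared16]   # one table lookup per pixel, 30 times per second
```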
The depth image of the 3D time-of-flight scanner presents the depth information as a two-dimensional matrix that represents the distance of each point in the scene and is served 30 times per second. The resolution of the depth image of the time-of-flight scanner is lower than that of the depth image of the structured light scanner, but the effective resolution is higher, because the structured light scanner calculates the distance using groups of pixels, while the time-of-flight scanner does so for each pixel individually. The regions into which the depth image of the structured light scanner is classified are maintained in the depth image of the time-of-flight scanner, but their distribution, frequency and size change completely. It should be noted that the visibility in the image increases, so that virtually all pixels are classified and given a value (few non-stable pixels), but the accuracy of this value is clearly reduced.

The time-of-flight scanner is calibrated prior to the start of the detection process. The minimum range of the time-of-flight scanner is 50 cm and the maximum range is approximately 3 m. The visibility of the objects in the scene, unlike in the previous approach, is practically optimal as long as the sensor is placed in a range between 55 and 80 cm from channel 3. It is advisable to look for a position that avoids reflections on the glass, on the ground or on the surface of the water.

The beginning of the measurement process is identical to the approach described above for the structured light scanner, but once the contour points 13 close to the ends 11 of the axes of the ellipse 10 have been obtained, these points may be influenced by the noise produced at the edges. To solve this problem, a 5x5 window is used on each of the points to be measured, and the median value of the pixels of the window is calculated, filtering out those that do not belong to the mask of the object (external points); a sketch of this filtering is given below. Subsequently, to obtain the real position of each of the points, a function is used, entering the x and y coordinates of each point used to measure, but with the value of z replaced by the median of the region close to the point, obtained with the window. After this, the measurement is performed as in the first approach.
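The 5x5 median filtering mentioned above might be sketched as follows (a minimal sketch; the function and argument names are illustrative):

```python
import numpy as np

def robust_depth(depth16, mask, x, y, size=5):
    """Median depth of the object pixels in a size x size window around (x, y)."""
    h = size // 2
    window = depth16[max(0, y - h):y + h + 1, max(0, x - h):x + h + 1]
    inside = mask[max(0, y - h):y + h + 1, max(0, x - h):x + h + 1] > 0
    values = window[inside]                    # discard points outside the object
    return float(np.median(values)) if values.size else float(depth16[y, x])
```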
For the background subtraction in the depth image, the voting algorithm used in the previous approach is used with a higher threshold (e.g. with a value of 15). The threshold is chosen carefully, taking into account that in this depth image the sudden movement of the fish in the water can produce errors in the estimation of the background, generating a halo in the mask around the fish. The uncertainty in the depth image makes the infrared information reflected in the scene very valuable when it comes to eliminating the halos that can be produced in the depth mask; for this, the background subtraction is performed using the MOG2 algorithm with a threshold of 500. To the mask obtained, a morphological closing operation with a 5x5 elliptical kernel is applied so that the silhouettes of the fish recover their shape.

The combination of images made in this approach is used to clean the existing halos in the depth mask. For this task an AND operation is performed between the depth mask and the infrared mask, resulting in a much cleaner mask where the silhouettes of the fish are more easily distinguishable.

The segmentation procedure used is very similar to the segmentation performed in the first approach. The Canny edge detector is used with a 5x5 Sobel mask and thresholds of 900-1200, and then the edges are joined in the same way as in the first approach. The use of a high Canny threshold serves to separate the fish from possible halos that may have appeared in the background subtraction, and from other fish whose edges are clearly delimited.

When making the detection, the same method is used as in the first approach; the only difference is that the support vector machine used is obtained with the time-of-flight scanner.

The design of a high-level application is detailed below, following the scheme in Figure 6:

- Start system 30: User 20 selects the sensor with which to perform the detection. The system initializes and configures its components to work with the selected sensor, and is subsequently launched.
- Obtain image 31: The system accesses the sensor and obtains the images of the current moment.
- Show image 32: User 20 selects the type of image to see, and the system repeatedly obtains this type of image and displays it on screen 8.
- Subtract background 33: The system obtains the current images and generates a motion mask (an image where the pixels representing objects in the scene are white and the background is black).
- Change background subtraction threshold 34: User 20 enters the new threshold that will be used to perform the background subtraction.
- Show movement 35: User 20 indicates the wish to visualize the background subtraction mask and selects the type of mask; the system continuously performs the background subtraction and displays the resulting mask on the screen.
- Segment regions of movement 36: The system obtains the images and from them the movement mask. Subsequently, the segmentation of the different objects found in each region is performed.
- Configure segmentation 37: User 20 enters the desired configuration for the segmentation (high and low Canny thresholds and minimum silhouette area).
- Measure objects 38: The system obtains the objects of the scene and obtains their measurements in real space: length, width, position and end points.
- Enter distance to the glass 39: User 20 enters the distance to the glass so that the system can apply the correction to its measurements.
- Detect fish 40: The system obtains the measured objects of the scene and classifies them as fish or not fish.
- Configure detection 41: User 20 chooses whether or not to use the provided SVM. Additionally, an SVM trained by the user can be entered.
- Follow fish 42: The system obtains the fish present in the scene and tracks them over time, taking into account their characteristics and their position.
- Show image with detections and measurements 43: User 20 selects the 'detect' option while viewing an image, and from that moment the system shows the fish detections on the image being displayed, together with the measurements of each of them.

With respect to the design of the model, a pipeline architecture with several modules performing specific tasks is used, where the outputs of some modules serve as the inputs of other modules. A modification has been made to the architecture by adding a facade class that manages the communication of the different modules and services the requests of the user interface. The whole design is constructed around the facade class, which is the class 'FishDetector', as shown in Figure 7 (a skeleton of this facade is sketched after the module list). The purpose of each of the modules and the main classes within each of them is explained below:

- Fish detector module 50: This is the module that encompasses all modules and model classes. It is in charge of managing the communication between the different modules and serves as the facade for the requests of the user interface.
- Sensor module 51: This module contains the classes that act as interfaces to the sensors used by the application.
- Motion detector module 52: This module contains the classes related to the background subtraction process.
- Segmenter module 53: It is responsible for the segmentation of the obtained masks into regions of movement, subsequently generating the silhouettes. The method used to perform this process is the "Make_segmentation" function.
- Detector module 54: It contains the main class of the detection module and is responsible for classifying the objects obtained from the segmentation process into fish or other objects. The method that performs this process is the "Make_detection" function.
- Follower module 55: It is the module that contains the classes dedicated to maintaining the persistence of detections across successive images and tracking the silhouettes of the fish. The method that performs this process is the "Make_Follow" function.
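A skeleton of this facade and pipeline might look as follows (a sketch: apart from "Make_segmentation", "Make_detection" and "Make_Follow", which are named in the text, the classes, constructor and method names are illustrative assumptions):

```python
class FishDetector:
    """Facade that wires the pipeline modules together."""

    def __init__(self, sensor, motion_detector, segmenter, detector, follower):
        self.sensor = sensor              # sensor module 51
        self.motion = motion_detector     # motion detector module 52
        self.segmenter = segmenter        # segmenter module 53
        self.detector = detector          # detector module 54
        self.follower = follower          # follower module 55

    def process_frame(self):
        # Outputs of each module feed the next one, as in Figure 7.
        frames = self.sensor.get_images()                # hypothetical accessor
        mask = self.motion.subtract_background(frames)   # hypothetical accessor
        silhouettes = self.segmenter.Make_segmentation(mask)
        fish = self.detector.Make_detection(silhouettes)
        return self.follower.Make_Follow(fish)
```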
The user interface is designed as a single window with a central container responsible for displaying the images obtained by the sensor, and next to it a list of the detected fish with their characteristics. Through a series of menus it is possible to change the type of image displayed on the interface, control parameters of the detection process, enter data such as the distance to the glass to allow the correction of the measurement, or select whether detection is to be performed using the SVM or just by filtering by area.

System parameters such as the background subtraction threshold, the Canny thresholds or the contour filtering threshold, even if values have been set for the artificial environment, must be adjusted to the environment in which the system is deployed. For fish of much larger size (more than double), the following parameters are modified: the area limit below which a contour is discarded is increased (for fish twice as large it is doubled, to 200); Canny's thresholds should be increased to almost double (20 and 40), taking into account that when two fish are stuck together, as they are twice as wide, it is not necessary to make such a demanding cut. If the glass of channel 3 is detected through the reflection of the pattern when there is no fish behind it, it is necessary to modify the background subtraction so that the mask is activated not only when the object is closer than the background, but also when it is further away.

Regarding the operation of the 3D scanners 1, it should be noted that the 3D structured light scanner is limited in terms of vision and range within the water, but its measurements are very precise; on the other hand, the 3D time-of-flight scanner is able to easily visualize the fish in the scene, but the margin of error of the displayed objects is much greater. For this reason, in relation to measurement, the structured light scanner works much better; in contrast, the time-of-flight scanner is able to see more fish and at greater distances, although the measurements it makes are much less precise and it has difficulty separating fish that are stuck together, because the method used returns a point cloud with a smoothing that blurs edges and generates intermediate points between two overlapping objects. This makes the structured light scanner the more suitable of the two studied for solving the problem, being the one that has provided the best results.
Claims:
[1] 1. System for detecting and measuring fish and mobile objects underwater, characterized in that it comprises:

- a 3D scanner (1) for measuring by means of light sensors the distance of the points of a scene (6) in an aquatic environment (7);
- a data processing unit (5) configured to generate information on the position and size of the mobile objects detected in the aquatic environment (7) from the signals received by the 3D scanner (1);
- a user control unit (4) for the representation on a screen (8) of the position and size data of the detected objects.

[2] 2. System according to claim 1, characterized in that the 3D scanner (1) is a structured light scanner.

[3] 3. System according to claim 1, characterized in that the 3D scanner (1) is a time-of-flight scanner.

[4] 4. System according to any one of the preceding claims, characterized in that the objects to be detected by the data processing unit (5) are fish (2).

[5] 5. System according to claim 4, characterized in that the data processing unit (5) is configured to determine the presence or absence of fish (2) in the objects detected in the scene (6), and the concentration and size of the fish (2).

[6] 6. System according to any of the preceding claims, characterized in that the aquatic environment (7) captured in the scene (6) of the 3D scanner (1) is located inside a channel (3) with transparent walls, the 3D scanner (1) being positioned outside said channel (3) at a certain distance (D), and where the data processing unit (5) is configured to obtain the actual distance to the objects inside the channel (3) by correcting the measurements made by the 3D scanner (1) according to said distance (D).

[7] 7. System according to any of the preceding claims, characterized in that, for the measurement of the size and position of an object, the data processing unit (5) is configured to:

- extract the contour (9) of the object by segmentation techniques from a depth image obtained from the 3D scanner (1);
- obtain an ellipse (10) that minimizes the distance to the contour points (9);
- obtain the end points (11) of the two axes of said ellipse (10);
- determine the contour points (9) closest to the end points (11) of the axes of the ellipse (10), obtaining the end points (13) of the object.

[8] 8. System according to any of the preceding claims, characterized in that the data processing unit (5) is configured to detect the size of an overlapping object by:

- obtaining continuous contours (15a, 15b, 15c) of objects in a depth image of the 3D scanner (1);
- extraction of individual silhouettes (16a, 16b, 16c) corresponding to said continuous contours (15a, 15b, 15c);
- combination of individual silhouettes (16a, 16b, 16c) to obtain a combined silhouette;
- obtaining the contour (17) of the combined silhouette by applying the convex hull;
- obtaining the ellipse (18) that minimizes the distance to the points of the convex hull (19);
- determining the contour points (17) of the combined silhouette closest to the end points of the axes of the ellipse (18), obtaining the end points (19) of the object.
[9] 9. System according to any of the preceding claims, characterized in that the data processing unit (5) is configured to classify the objects detected as fish by means of a support vector machine, where the characteristics used to perform the classification comprise the area of the convex hull, the area of the silhouette, the length of the silhouette in real space and the width of the silhouette in real space.

[10] 10. System according to any of the preceding claims, characterized in that the data processing unit (5) is configured to track the objects in the scene (6) in consecutive images captured by the 3D scanner (1), taking into account the characteristics and position of the silhouettes of the objects in the different images.
Patent family:
ES2649056B1 | published 2018-09-12
Legal status:
2018-09-12 | FG2A | Definitive protection | Ref document: ES2649056 B1 | Effective date: 2018-09-12