Object Detection Explained From the Perspective of Face Detection

Object detection can be divided into face detection and general object detection. Face-related tasks (face detection, face recognition, recognition of other facial attributes, and so on) often have dedicated algorithms that differ from those used for general object detection and recognition. This is mainly due to the particular nature of faces: the target is sometimes small, the features that distinguish one face from another are subtle, occlusion is common, and so on. This article explains object detection mainly from the perspective of face detection.

What are the main categories of current face detection methods?

Current face detection methods fall into two groups: traditional face detection algorithms and face detection algorithms based on deep learning. Traditional face detection algorithms can be divided into four categories:

Knowledge-based face detection methods;

Model-based face detection methods;

Feature-based face detection methods;

Appearance-based face detection methods.

In 2006, Hinton introduced the concept of deep learning, in which low-level features are combined to form higher-level abstract features. Researchers subsequently applied deep learning to face detection, focusing mainly on methods based on convolutional neural networks (CNNs), such as the cascaded convolutional neural network (Cascade CNN), the multi-task convolutional neural network (MTCNN), and FaceBoxes, which greatly improved the robustness of face detection.

General object detection algorithms such as Fast R-CNN, YOLO and SSD are also used for face detection and can achieve relatively good results, but they still differ from dedicated face detection algorithms.

How to detect faces of different sizes in a picture?

In traditional face detection algorithms, there are two main strategies for faces of different sizes:

Scale the image (an image pyramid, as shown in Figure 1);

Figure 1 image pyramid

Scale the sliding window (as shown in Figure 2).

Figure 2 scaled sliding window

Face detection algorithms based on deep learning also have two main strategies for faces of different sizes, but they differ slightly from the traditional ones:

Scale the image. Scaling the sliding window is also possible, but a deep-learning detector that slides a window over the image is very slow because the same regions are convolved many times. A fully convolutional network (FCN) is therefore used instead, and with an FCN the window-scaling approach no longer applies.

Use anchor boxes, as shown in Figure 3 (not to be confused with Figure 2). Here, each location on the feature map predicts anchor box regions in the original image. This is described in more detail in the FaceBoxes section.

Figure 3 anchor box

How is the minimum detectable face size set?

It mainly depends on the smallest sliding window or the smallest anchor box.

Sliding window method

Assume a 12 × 12 sliding window. If the original image is not scaled, the smallest face that can be detected is 12 × 12 pixels in the original image.

In practice, however, a minimum face size a is usually specified, such as a = 40 or a = 80. Training a CNN face detector with such a large input is unrealistic and would be very slow, and the network would have to be retrained the next time a different minimum (say a = 30) is required. The network input is therefore usually fixed at 12 × 12, and to satisfy the minimum face size a it is enough to scale the original image at detection time: w = W × 12 / a, where W is the original width and w is the scaled width.
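The scaling rule w = W × 12 / a can be sketched in a few lines of Python. This is only an illustration: the function name scaled_size and its parameters are hypothetical, and the detector is assumed to have a fixed 12 × 12 input.

```python
def scaled_size(width, height, min_face, net_size=12):
    """Return the image size to use at detection time so that a face of
    min_face pixels in the original image becomes net_size pixels."""
    if min_face <= 0:
        raise ValueError("min_face must be positive")
    scale = net_size / min_face          # the factor 12 / a from the text
    return round(width * scale), round(height * scale), scale


# Example: a 640 x 480 image with a required minimum face of 40 pixels.
w, h, s = scaled_size(640, 480, min_face=40)
print(w, h)   # 192 144
```

Changing the required minimum face size only changes the scale factor, so the 12 × 12 network itself does not need to be retrained.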

Anchor box method

The principle is similar. Here the smallest anchor box is what matters, and the minimum face size can again be set by scaling the input image.

How is the face located?

Sliding window approach:

The final face position is given by the position of the window that the classifier recognizes as a face.

Figure 4 sliding window

FCN approach:

The position recognized as a face is determined by mapping the feature map back to the original image. The mapping depends on how much the feature map has been downscaled relative to the original image, which is determined mainly by the convolution strides and the pooling layers.

For example, take the point (2, 3) on the feature map and assume a total downscaling factor of about 8. The corresponding point in the original image is (16, 24). If the FCN was trained on 12 × 12 inputs, the box in the original image is (16, 24, 12, 12).

This is only an estimate of the location. For a precise result, a bounding-box regression output should be added when building the network; it mainly predicts a translation and a scaling relative to the estimated box.
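The mapping in the example above can be written out as a short sketch. The function names are hypothetical, and the way the regression output is applied in refine (offsets as fractions of the box size plus a single scale factor) is an assumption chosen for illustration, not a parameterization taken from a specific network.

```python
def feature_to_box(fx, fy, stride=8, window=12):
    """Map a feature-map location to an (x, y, w, h) box in the input image."""
    return (fx * stride, fy * stride, window, window)


def refine(box, dx, dy, ds):
    """Apply a predicted translation (dx, dy, as fractions of the box size)
    and a scale factor ds. This parameterization is an assumption."""
    x, y, w, h = box
    return (x + dx * w, y + dy * h, w * ds, h * ds)


box = feature_to_box(2, 3)
print(box)                           # (16, 24, 12, 12)
print(refine(box, 0.5, -0.25, 1.5))  # (22.0, 21.0, 18.0, 18.0)
```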

Anchor box approach:

Each feature-map location is mapped back to a window in the original image, and several boxes are associated with that location. The position finally recognized as a face is determined from these boxes.

How is the final face box determined when several boxes cover the same face?

Figure 5 final face position obtained by NMS

Non-maximum suppression (NMS) is used for this, and many improved versions of it exist. The original NMS measures the overlap of two boxes (typically as intersection over union, IoU). If the overlap is greater than a set threshold, one of the two boxes is deleted.

Which of the two boxes should be deleted? Because the model outputs a probability for each box, the box with the lower probability is generally the one deleted.
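The procedure can be sketched as a greedy loop in plain Python. This is a minimal version of the original NMS idea, assuming boxes are given as (x1, y1, x2, y2) corners and overlap is measured by IoU; the function names and the 0.5 threshold are illustrative choices.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, threshold=0.5):
    """Return indices of the boxes kept by greedy non-maximum suppression."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not overlap too much with a kept box,
        # so in every overlapping pair the lower-scoring box is dropped.
        if all(iou(boxes[i], boxes[k]) <= threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))   # [0, 2]
```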

Face detection based on cascaded convolutional neural network (Cascade CNN)

What is the framework of Cascade CNN?

The cascade contains 6 CNNs: 3 are used for face/non-face classification, and the other 3 are used to calibrate the borders of face regions.

Given an image, 12-net densely scans the whole image and rejects more than 90% of the windows. The remaining windows are fed into 12-calibration-net, which adjusts their size and position to better approximate the real face. NMS is then applied to eliminate highly overlapping windows. The subsequent stages (24-net and 48-net, each with its own calibration net) work in a similar way.

What is the principle of the Cascade CNN face calibration module?

The calibration network is used for window correction and uses three offset variables:

xn: horizontal translation, yn: vertical translation, sn: scale change of the window.

For a candidate box (x, y, w, h), (x, y) is the coordinate of the upper-left corner and (w, h) is the width and height.

The window is adjusted to:

(x - xn × w / sn, y - yn × h / sn, w / sn, h / sn)

In the Cascade CNN work there are N = 5 × 3 × 3 = 45 calibration patterns. The three parameters of the offset vector take the following values:

sn ∈ {0.83, 0.91, 1.0, 1.10, 1.21}, xn ∈ {-0.17, 0, 0.17}, yn ∈ {-0.17, 0, 0.17}

All three parameters of the offset vector are corrected at the same time.
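A small sketch may make the calibration step more concrete. It assumes the adjustment used in the Cascade CNN paper, where a window (x, y, w, h) becomes (x - xn·w/sn, y - yn·h/sn, w/sn, h/sn), and it assumes the 45 patterns formed from the value sets listed in the code. Treat the exact numbers as an assumption to be checked against the paper; the function name calibrate is hypothetical.

```python
from itertools import product

# Assumed value sets for the three offset parameters (5 * 3 * 3 = 45 patterns).
SCALES = (0.83, 0.91, 1.0, 1.10, 1.21)
X_OFFSETS = (-0.17, 0.0, 0.17)
Y_OFFSETS = (-0.17, 0.0, 0.17)
PATTERNS = list(product(SCALES, X_OFFSETS, Y_OFFSETS))   # (sn, xn, yn) triples


def calibrate(box, pattern):
    """Adjust an (x, y, w, h) window with one (sn, xn, yn) pattern."""
    x, y, w, h = box
    sn, xn, yn = pattern
    return (x - xn * w / sn, y - yn * h / sn, w / sn, h / sn)


print(len(PATTERNS))                                    # 45
print(calibrate((100, 100, 40, 40), (1.0, 0.0, 0.0)))   # (100.0, 100.0, 40.0, 40.0)
```

In this sketch the calibration network would act as a classifier over the 45 patterns, and the chosen pattern (or an average of the confident ones) would be passed to calibrate.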

How should training samples be prepared?

Face samples;

Non-face samples.

Benefits of cascading

In the early stages, the network can be relatively simple and the decision threshold can be set loosely, so that a large number of non-face windows are eliminated while a high recall rate is maintained;

In the last stage, the network is generally designed to be more complex in order to guarantee sufficient accuracy, but it remains efficient because it only has to process the windows that survived the earlier stages;

The cascade idea makes it possible to combine classifiers that are individually weak while still guaranteeing a certain level of efficiency.
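The efficiency argument can be illustrated with a toy cascade. The sketch below is not the Cascade CNN implementation; the stage functions, thresholds and window records are made up to show that the expensive stage only runs on windows that pass the cheap one.

```python
def run_cascade(windows, stages):
    """Filter candidate windows through (score_fn, threshold) stages in order."""
    survivors = list(windows)
    for score_fn, threshold in stages:
        survivors = [w for w in survivors if score_fn(w) >= threshold]
        if not survivors:
            break
    return survivors


calls = {"cheap": 0, "expensive": 0}


def cheap(window):
    # Stand-in for a small, fast network with a loose threshold.
    calls["cheap"] += 1
    return window["coarse"]


def expensive(window):
    # Stand-in for a larger, slower network with a strict threshold.
    calls["expensive"] += 1
    return window["fine"]


windows = [
    {"id": 0, "coarse": 0.05, "fine": 0.01},
    {"id": 1, "coarse": 0.40, "fine": 0.30},
    {"id": 2, "coarse": 0.60, "fine": 0.95},
    {"id": 3, "coarse": 0.10, "fine": 0.02},
]
faces = run_cascade(windows, [(cheap, 0.3), (expensive, 0.9)])
print([w["id"] for w in faces], calls)   # [2] {'cheap': 4, 'expensive': 2}
```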

Face detection based on multi-task convolutional neural network (MTCNN)

The MTCNN model has three sub-networks: P-Net, R-Net and O-Net.

To detect faces of different sizes, an image pyramid is built first. P-Net is applied to each level and outputs a face/non-face classification and a bounding box. The bounding-box prediction translates and scales the box mapped from the feature map to the original image, which gives a more accurate box. Each box recognized as a face is mapped back to the original image to obtain a patch. Each patch is then resized and fed to R-Net, which again classifies it and predicts a more accurate face box. Finally, each patch accepted by R-Net is resized and fed to O-Net, which works like R-Net and also predicts facial key points. The key point task is included to make the model more robust when the training set is limited.

Note also that the scale of each pyramid level must be saved so that bounding boxes can be mapped back to the original image.
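Building the pyramid scales and mapping a box back to the original image could look like the sketch below. The function names are hypothetical, and the defaults (a 12 × 12 network input, a minimum face of 20 pixels and a scale factor of 0.709 between levels) are assumptions borrowed from common MTCNN implementations, not values stated in this article.

```python
def pyramid_scales(width, height, min_face=20, net_size=12, factor=0.709):
    """Return the list of scales at which the image should be resized."""
    scales = []
    scale = net_size / min_face
    min_side = min(width, height) * scale
    # Stop once the shorter side no longer fits one network input.
    while min_side >= net_size:
        scales.append(scale)
        scale *= factor
        min_side *= factor
    return scales


def to_original(box, scale):
    """Map an (x, y, w, h) box found on a scaled image back to the original."""
    return tuple(v / scale for v in box)


scales = pyramid_scales(100, 100)
print(len(scales))                          # 5
print(to_original((6, 12, 12, 12), 0.5))    # (12.0, 24.0, 24.0, 24.0)
```

Keeping the scale next to each detection is what allows to_original to undo the resizing later.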

FaceBoxes

(1) Rapidly Digested Convolutional Layers (RDCL)

In the early part of the network, RDCL is used to shrink the feature map quickly. The main design principles are as follows:

The strides of conv1, pool1, conv2 and pool2 are 4, 2, 2 and 2 respectively. The total stride of RDCL is therefore 32, which shrinks the feature map quickly.

A convolution (or pooling) kernel that is too large is slow, and one that is too small does not cover enough information. As a compromise, the kernel sizes of conv1, pool1, conv2 and pool2 are set to 7×7, 3×3, 5×5 and 3×3 respectively.

C.ReLU is used to reduce the number of convolution kernels while keeping the output dimension unchanged.
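The stride arithmetic and the idea behind C.ReLU can be checked with a short sketch. It assumes C.ReLU concatenates ReLU(x) and ReLU(-x) along the channel axis, which doubles the number of output channels and is why fewer kernels are needed. The 1024-pixel input size and the function names are illustrative assumptions.

```python
from math import prod

# Strides of the four RDCL layers as listed in the text.
STRIDES = {"conv1": 4, "pool1": 2, "conv2": 2, "pool2": 2}


def total_stride(strides):
    """Overall downscaling factor of a stack of strided layers."""
    return prod(strides.values())


def crelu(channels):
    """Concatenate ReLU(x) and ReLU(-x) along the channel axis.

    channels is a list of channels, each a flat list of floats."""
    pos = [[max(v, 0.0) for v in ch] for ch in channels]
    neg = [[max(-v, 0.0) for v in ch] for ch in channels]
    return pos + neg


print(total_stride(STRIDES))           # 32
print(1024 // total_stride(STRIDES))   # 32
print(crelu([[1.0, -2.0]]))            # [[1.0, 0.0], [0.0, 2.0]]
```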

(2) Multiple Scale Convolutional Layers (MSCL)

In the later part of the network, MSCL is used to better detect faces at different scales. The main design principles are:

Similar to SSD, detection is performed at several layers of the network;

The Inception module is adopted. Because Inception contains several different convolution branches, it further diversifies the receptive field.

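(3) Anchor densification strategy

To balance the anchor density across scales, anchors whose density is too low are densified: shifted copies of the anchor are tiled around the original center, multiplying the number of anchors at that location, as illustrated in the accompanying figure.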
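One possible reading of this densification step is sketched below. It assumes that anchor density is the anchor size divided by the tiling interval (the stride of the detection layer), and that densifying n times means tiling n × n evenly shifted copies around the original center. Both the formula and the function names are assumptions for illustration.

```python
def density(size, interval):
    """Anchor density, assumed here to be anchor size / tiling interval."""
    return size / interval


def densify(cx, cy, size, interval, n):
    """Return n * n anchors (cx, cy, size) tiled evenly around one center."""
    step = interval / n
    offsets = [(i - (n - 1) / 2) * step for i in range(n)]
    return [(cx + dx, cy + dy, size) for dy in offsets for dx in offsets]


print(density(32, 32), density(128, 32))   # 1.0 4.0
anchors = densify(16, 16, size=32, interval=32, n=4)
print(len(anchors), anchors[0])            # 16 (4.0, 4.0, 32)
```

In this example a 32-pixel anchor tiled every 32 pixels has a quarter of the density of a 128-pixel anchor on the same layer, so densifying it 4 times in each direction brings the two into balance.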
