Sunday, May 6, 2012

Non-Maximal Suppression


One problem when locating characters in an image is extraneous bounding boxes. Once the segmenter has detected characters and placed bounding boxes on the image, the result includes some false positives as well as overlapping bounding boxes.

One way to deal with this problem is non-maximal suppression (NMS). The idea behind NMS is to group the bounding boxes into clusters, and then apply a heuristic to pick the best box in each cluster. I used k-means to cluster the bounding boxes and, for each cluster, kept the bounding box closest to that cluster's center.
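To make the clustering step concrete, here is a minimal Python sketch of the idea, not the code I actually ran. It assumes boxes are stored as (x, y, width, height) tuples, that clustering is done on box centers, and that the number of clusters k is chosen by hand; the function names box_center, kmeans, and suppress are made up for this example.

```python
import math

def box_center(box):
    """Center (cx, cy) of a box given as (x, y, width, height)."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))

def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means on 2-D points; returns (centers, labels)."""
    # Assumption: deterministic init that spreads the starting centers
    # over the points sorted by x (a real run might use random restarts).
    ordered = sorted(points)
    centers = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iterations):
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append((sum(p[0] for p in members) / len(members),
                                    sum(p[1] for p in members) / len(members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    labels = [nearest(p, centers) for p in points]
    return centers, labels

def suppress(boxes, k):
    """Keep, for each cluster, the box whose center is closest to the cluster center."""
    points = [box_center(b) for b in boxes]
    centers, labels = kmeans(points, k)
    kept = []
    for j, center in enumerate(centers):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if members:
            best = min(members, key=lambda i: math.dist(points[i], center))
            kept.append(boxes[best])
    return kept

# Two groups of overlapping boxes collapse to one box each.
boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (2, 2, 10, 10),
         (50, 0, 10, 10), (52, 0, 10, 10), (51, 0, 10, 10)]
print(suppress(boxes, 2))
```

Note that the result depends heavily on k: with this heuristic, the number of boxes left after suppression can never exceed the number of clusters requested.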

Before NMS, 21 bounding boxes were found on an input containing 7 characters. After NMS, only 13 bounding boxes were left, much closer to the actual number of characters in the input. Here is a plot of the bounding boxes after NMS has been applied. Although it is hard to see a visual difference between the plots before and after NMS, you can see that the remaining boxes still contain characters.

Image with NMS applied



Goals:
1)   Become familiar with VLFeat
2)   Run cross validation, and test some more data to see how NMS performs with different expressions
3)   Try SIFT vs HOG and compare accuracy results.
4)   Fix the method for extracting data from the bounding boxes. Currently, the centroid of the character in the bounding box is found, and the character is then extracted by carving out the pixels around that centroid in the original image. A more useful approach would be to pad the contents of the bounding box with whitespace, which should help when extracting characters that sit close to each other, such as exponents or the limits of an integral.
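
The padding idea in goal 4 could look roughly like the following Python sketch. It is only an illustration under assumptions: the image is a grayscale list of rows with 255 as white, boxes are (x, y, width, height) tuples, and crop_with_padding is a hypothetical helper name, not something from VLFeat.

```python
def crop_with_padding(image, box, pad, background=255):
    """Copy only the pixels inside box and surround them with a white border.

    image: grayscale image as a list of rows (assumed layout).
    box:   (x, y, width, height) in pixel coordinates.
    pad:   border width in pixels added on every side.
    """
    x, y, w, h = box
    out_width = w + 2 * pad
    out = [[background] * out_width for _ in range(h + 2 * pad)]
    for r in range(h):
        for c in range(w):
            out[r + pad][c + pad] = image[y + r][x + c]
    return out

# The 9-valued pixels stand in for a neighboring character; they are
# outside the box, so they do not leak into the extracted patch.
image = [[9, 9, 9, 9],
         [9, 0, 10, 9],
         [9, 9, 9, 9]]
patch = crop_with_padding(image, (1, 1, 2, 1), pad=1)
print(patch)
```

Because only pixels inside the box are copied, ink from a nearby character never ends up in the patch, which is the failure mode of carving out a fixed window around the centroid.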
