A short yet smart introduction to modern image processing algorithms. Most of the algorithms listed here, except the patented SIFT, are already implemented in SOD, the open source, embedded computer vision library.
Thinning is the operation that reduces a connected region of pixels sharing a given property to a small, skeletal size. Other terms commonly used are "Shrinking", "Skeletonization" or "Medial Axis Transformation".
Skeletonization provides a simple yet effective technique: it discards as many pixels of the target pattern as possible while capturing its essential geometric features, without affecting the general shape. That is, after application of the thinning process, the general shape of the object or pattern should still be recognizable. Skeletonization is useful when we are interested not in the size of the pattern but rather in the relative position of the strokes in the pattern. The aim of skeletonization is to extract a region-based shape feature representing the general form of an object.
The skeleton of a binary object as a shape descriptor has proven effective in many application fields. Popular applications of skeletonization include Optical Character Recognition, Medical Imagery, Pattern & Gesture Recognition, and so forth. The thinning process is often one of the first passes in modern Computer Vision applications and is heavily used here at PixLab for our Passports & ID Cards Scanning API Endpoints (blog post announcement discussed here).
One of the most widely used algorithms for skeletonization is Hilditch's algorithm, which, given an input binary image of some pattern, should produce the following output:
Input Binary Image
Hilditch's Algorithm Output
Naccache and Shinghal described Hilditch's algorithm for thinning edges. To summarize, the algorithm moves a 3 by 3 window over the image and applies a set of rules to the content of the window. A pixel is marked for deletion if it satisfies all the rules applied to it. This basic procedure is applied iteratively until no further points are deleted. The algorithm requires a binary image as its input; otherwise, the result is undefined.
Hilditch's algorithm has been successfully implemented in SOD via the exported function sod_hilditch_thin_image(). The gist below highlights a typical usage of Hilditch's algorithm to produce the image output displayed in the section above.
Image segmentation is an umbrella term for a set of image processing techniques based on applying a dividing strategy (i.e. a single image is divided into multiple parts). After the dividing process, each of the resulting components is used for a specific purpose, for example, to identify objects or other relevant information. Several methods are used for image segmentation: Thresholding, Color-based, Texture Filters, Clustering, among others. An effective approach to performing image segmentation combines existing algorithms and tools with specific components for data analysis, image visualization, and the development and implementation of purpose-built algorithms.
One of the most popular approaches to image segmentation is thresholding. Thresholding takes a grayscale image and replaces each pixel with a black one if its intensity is less than some fixed constant, or with a white one otherwise. The binary image produced separates dark from bright regions. Mainly because finding pixels that share an intensity range in a region is not computationally expensive, thresholding is a simple and efficient method for image segmentation.
Several approaches have been proposed to define thresholding methods. According to the categorization defined by Sezgin and Sankur, six strategies were identified:
Histogram Shape Methods, which use information from the shape of the image histogram.
Clustering Methods, which group pixels into classes, e.g. background or foreground.
Entropy Methods, which make use of the entropy of the foreground and background regions, or the cross-entropy between the original and the binarized image.
Object Attribute Methods, which evaluate and use similarities between the original and the binarized image.
Spatial Methods, which apply higher-order probability distributions and/or correlations between pixels.
Local Methods, which adapt the threshold value to locally defined characteristics.
Fixed Thresholding Output
The gist below showcases how to generate a binary image via fixed thresholding to produce the image output displayed above.
Put simply, Otsu's method (named after Nobuyuki Otsu) is a popular thresholding algorithm for image segmentation. It belongs to the clustering category and is usually used for thresholding and binarization. Thresholding extracts an object from its background by assigning an intensity value T (the threshold) such that each pixel is classified as either a foreground or a background point.
Otsu's method works mainly with the image histogram, looking at pixel values and the regions the user wants to segment out, rather than at the edges of the image. It tries to segment the image so that the variance within each class is minimal. The algorithm works well for images that contain two classes of pixels following a bi-modal histogram distribution: it divides the histogram into two classes using a threshold chosen so that the intra-class variance is as small as possible, making each class as compact as possible. The spatial relationship between pixels is not taken into account, so regions that have similar pixel values but lie in completely different locations in the image are merged when computing the histogram; Otsu's algorithm treats them as the same.
Otsu's Algorithm Output
Otsu's method has been successfully implemented in SOD via the exported function sod_otsu_binarize_image(). The gist below highlights a typical usage of Otsu's algorithm to produce the image output displayed above.
Fingerprints are the oldest and most widely used form of biometric identification: they are known to be essentially unique to each individual and immutable over time. As most Automatic Fingerprint Recognition Systems are based on local ridge features known as minutiae, marking minutiae accurately and rejecting false ones is very important. However, fingerprint images get degraded and corrupted due to variations in skin and impression conditions. Thus, image enhancement techniques such as Hilditch Thinning, Thresholding, etc. are employed prior to minutiae extraction. A critical step in automatic fingerprint matching is to reliably extract minutiae from the input fingerprint image.
Fingerprints are the most widely used parameter for personal identification amongst all biometrics, and fingerprint identification is commonly employed in forensic science to aid criminal investigations. A fingerprint is a unique pattern of ridges and valleys on the surface of an individual's finger. A ridge is defined as a single curved segment, and a valley is the region between two adjacent ridges. Minutiae points (fig. 1) are the local ridge discontinuities, which are of two types: ridge endings and bifurcations. A good quality image has around 40 to 100 minutiae. It is these minutiae points that are used to determine the uniqueness of a fingerprint.
Input Grayscale Fingerprint
Minutiae extraction has been implemented in SOD via the exported function sod_minutiae(). The gist below extracts ridge endings and bifurcations from a fingerprint image using sod_minutiae().
S.U.R.F, or **Speeded Up Robust Features**, is a **patented** algorithm used mostly in computer vision tasks and tied to object detection purposes. SURF falls in the category of **feature descriptors**: it extracts **keypoints** from different regions of a given image and is thus very useful for finding similarity between images. The algorithm works as follows:

1. Find features/keypoints that are likely to be found in different images of the same object. Those features should be scale and rotation invariant if possible. Corners, blobs, etc. are good candidates and are most often searched for at multiple scales.
2. Find the right "orientation" of each such point so that, if the image is rotated according to that orientation, both images are aligned with regard to that single keypoint.
3. Compute a *descriptor* that captures how the neighborhood of the keypoint looks (after orientation) at the right scale.

*Source*: https://stackoverflow.com/questions/29417888/a-summary-of-how-surf-works

SURF is advertised to perform faster than previously proposed schemes like SIFT (https://en.wikipedia.org/wiki/Scale-invariant_feature_transform). This is achieved (as stated by its designers) by:

* Relying on integral images for image convolutions.
* Building on the strengths of the leading existing detectors and descriptors (using a Hessian matrix-based measure for the detector, and a distribution-based descriptor).
* Simplifying these methods to the essential.

This leads to a combination of novel detection, description, and matching steps. At PixLab (https://pixlab.io), where I currently work, we used SURF in a previous version of the tagimg API endpoint (https://pixlab.io/cmd?id=tagimg), but it turned out to perform poorly and was thus replaced by a Machine Learning technique for our image tagging (https://pixlab.io/cmd?id=tagimg) and NSFW content detection (https://pixlab.io/cmd?id=nsfw) endpoints.