Generally, an image contains more information than required for any particular task. For this reason, we need to preprocess the images so that they contain only as much information as required for the application, thereby reducing the computing time needed.
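In this chapter, we will learn about the following preprocessing operations: blurring (mean, Gaussian, and median), sharpening with custom kernels, morphological operations (dilation and erosion), and thresholding, including adaptive thresholding.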
At the end of this chapter, we will see how you can integrate OpenCV into your existing Android applications.
Before we take a look at the various feature detection algorithms and their implementations, let's first build a basic Android application to which we will keep adding feature detection algorithms, as we go through this chapter.
When we see an image, we perceive it as colors and objects. However, a computer vision system sees it as a matrix of numbers (see the following image). These numbers are interpreted differently, depending on the color model used. The computer cannot directly detect patterns or objects in the image. The aim of computer vision systems is to interpret this matrix of numbers as an object of a particular type.
Representation of a binary image
OpenCV is short for Open Source Computer Vision library. It is the most widely used computer vision library. It is a collection of commonly used functions that perform operations related to computer vision. OpenCV is written natively in C/C++, but has wrappers for Python and Java, so it can also be used from any JVM language that compiles to Java bytecode, such as Scala and Clojure. Since most Android app development is done in Java or C++, OpenCV has also been ported as an SDK that developers can use in their apps to make them vision enabled.
We will now take a look at how to get started with setting up OpenCV for the Android platform, and start our journey. We will use Android Studio as our IDE of choice, but any other IDE should work just as well with slight modifications. Follow these steps in order to get started:
OpenCV stores images as a custom object called Mat. This object stores information such as the number of rows and columns and the pixel data, which together can be used to uniquely identify and recreate the image when required. Different images contain different amounts of data. For example, a colored image contains more data than a grayscale version of the same image. This is because a colored image is a 3-channel image when using the RGB model, and a grayscale image is a 1-channel image. The following figures show how 1-channel and multichannel (here, RGB) images are stored (these images are taken from docs.opencv.org).
A 1-channel representation of an image is shown as follows:
A grayscale (1-channel) image representation:
A more elaborate form of an image is the RGB representation, which is shown as follows:
An RGB (3-channel) image representation
In the grayscale image, the numbers represent the intensity of each pixel. They are represented on a scale of 0-255 when using integer representations, with 0 being pure black and 255 being pure white. If we use a floating point representation, the pixels are represented on a scale of 0-1, with 0 being pure black and 1 being pure white. In an RGB image in OpenCV, the first channel corresponds to blue, the second channel corresponds to green, and the third channel corresponds to red. Thus, each channel represents the intensity of one particular color. Because red, green, and blue are primary colors, they can be combined in different proportions to generate a wide range of colors. The following figure shows the different colors and their respective RGB equivalents in an integer format:
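The channel layout described above can be sketched in plain Java without OpenCV. This is a minimal illustration, assuming 8-bit pixels stored in a flat, interleaved byte array in blue, green, red order; the class `PixelLayoutSketch` and its methods are hypothetical names made up for this example and are not part of the OpenCV API.

```java
// Hypothetical helper, not part of OpenCV: models interleaved B, G, R storage.
class PixelLayoutSketch {
    // Index of channel c (0 = blue, 1 = green, 2 = red) of the pixel at (row, col).
    static int index(int row, int col, int c, int cols, int channels) {
        return (row * cols + col) * channels + c;
    }

    // Java bytes are signed, so mask to recover the 0-255 intensity.
    static int intensity(byte[] data, int idx) {
        return data[idx] & 0xFF;
    }

    // Map an 8-bit intensity to the 0-1 floating point scale.
    static double toUnitScale(int value) {
        return value / 255.0;
    }

    public static void main(String[] args) {
        int cols = 2, channels = 3;
        // Two pixels in one row: pure red (B=0, G=0, R=255), then pure white.
        byte[] data = {0, 0, (byte) 255, (byte) 255, (byte) 255, (byte) 255};
        int red = intensity(data, index(0, 0, 2, cols, channels));
        System.out.println("Red channel of pixel (0,0): " + red);
        System.out.println("Same value on the 0-1 scale: " + toUnitScale(red));
    }
}
```

A grayscale image is the same idea with `channels` set to 1, which is why it carries a third of the data of the 3-channel version.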
Now that we have seen how an image is represented in computing terms, we will see how we can modify the pixel values so that they need less computation time when using them for the actual task at hand.
We all like sharp images. Who doesn't, right? However, there is a trade-off that needs to be made. More information means that the image will require more computation time to complete the same task as compared to an image which has less information. So, to solve this problem, we apply blurring operations.
Many of the linear filtering algorithms make use of an array of numbers called a kernel. A kernel can be thought of as a sliding window that passes over each pixel and calculates the output value for that pixel. This can be understood more clearly by taking a look at the following figure (this image of linear filtering/convolution is taken from http://test.virtual-labs.ac.in/labs/cse19/neigh/convolution.jpg):
In the preceding figure, a 3 x 3 kernel is used on a 10 x 10 image.
One of the most general operations used for linear filtering is convolution. The values in a kernel are coefficients by which the corresponding pixels are multiplied. The products are summed, and the final result is stored at the anchor point, which is generally the center of the kernel.
Linear filtering operations are generally not in-place operations, as for each pixel we use the values present in the original image, and not the modified values.
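The sliding-window idea, and the reason the operation is not done in place, can be sketched in plain Java. This is a minimal illustration under stated assumptions: the class `ConvolutionSketch` is a hypothetical name, the image is a 2D array of doubles, the kernel is square with the anchor at its center, and pixels beyond the border are taken from the nearest edge pixel. The kernel is applied without flipping, which gives the same result as a flipped kernel whenever the kernel is symmetric.

```java
// Hypothetical sketch of linear filtering with a square kernel.
class ConvolutionSketch {
    static double[][] filter(double[][] src, double[][] kernel) {
        int rows = src.length, cols = src[0].length;
        int r = kernel.length / 2; // distance from the anchor to the kernel edge
        // Results go to a separate array so every output pixel is computed
        // from the original values, never from already modified ones.
        double[][] dst = new double[rows][cols];
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                double sum = 0;
                for (int ky = -r; ky <= r; ky++) {
                    for (int kx = -r; kx <= r; kx++) {
                        // Clamp coordinates so border pixels reuse the nearest edge value.
                        int yy = Math.min(rows - 1, Math.max(0, y + ky));
                        int xx = Math.min(cols - 1, Math.max(0, x + kx));
                        sum += kernel[ky + r][kx + r] * src[yy][xx];
                    }
                }
                dst[y][x] = sum; // stored at the anchor position
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        double[][] img = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        double[][] identity = {{0, 0, 0}, {0, 1, 0}, {0, 0, 0}};
        System.out.println(java.util.Arrays.deepToString(filter(img, identity)));
    }
}
```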
One of the most common uses of linear filtering is to remove the noise. Noise is the random variation in brightness or color information in images. We use blurring operations to reduce the noise in images.
A mean filter is the simplest form of blurring. It calculates the mean of all the pixels that the given kernel superimposes. The kernel that is used for this kind of operation is a simple Mat in which all the values are equal, that is, each neighboring pixel is given the same weightage. For a 3 x 3 kernel, each value is effectively 1/9, so the weights sum to 1 and the overall brightness of the image is preserved.
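As a rough illustration of what a mean filter computes, here is a plain Java sketch that does not use OpenCV. The class `MeanBlurSketch` is a hypothetical name, the image is assumed to be a single-channel array of 0-255 integers, and pixels outside the border are assumed to repeat the nearest edge pixel.

```java
// Hypothetical sketch of a 3 x 3 mean (box) blur on a 1-channel image.
class MeanBlurSketch {
    static int[][] blur3x3(int[][] src) {
        int rows = src.length, cols = src[0].length;
        int[][] dst = new int[rows][cols];
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                int sum = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        // Clamp so that border pixels reuse the nearest edge value.
                        int yy = Math.min(rows - 1, Math.max(0, y + dy));
                        int xx = Math.min(cols - 1, Math.max(0, x + dx));
                        sum += src[yy][xx];
                    }
                }
                // Every neighbor has the same weight of 1/9.
                dst[y][x] = Math.round(sum / 9.0f);
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        int[][] spike = {{0, 0, 0}, {0, 90, 0}, {0, 0, 0}};
        System.out.println(java.util.Arrays.deepToString(blur3x3(spike)));
    }
}
```

A single bright pixel is spread evenly over its neighborhood, which is how the filter suppresses isolated variations at the cost of sharpness.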
For this chapter, we will pick an image from the gallery and apply the respective image transformations. For this, we will add basic code. We are assuming that OpenCV4Android SDK has been set up and is running.
We can use the first OpenCV app that we created at the start of the chapter for the purpose of this chapter. At the time of creating the project, the default names will be as shown in the following screenshot:
Add a new activity by right-clicking on the Java folder and navigate to New | Activity. Then, select Blank Activity. Name the activity MainActivity.java and the XML file activity_main.xml. Go to res/menu/menu_main.xml. Add an item as follows:
Since MainActivity is the activity that we will be using to perform our OpenCV specific tasks, we need to instantiate OpenCV. Add this as a global member of MainActivity.java:
private BaseLoaderCallback mOpenCVCallBack = new BaseLoaderCallback(this) {
    @Override
    public void onManagerConnected(int status) {
        switch (status) {
            case LoaderCallbackInterface.SUCCESS:
                //DO YOUR WORK/STUFF HERE
                break;
            default:
                super.onManagerConnected(status);
                break;
        }
    }
};
@Override
protected void onResume()
This is a callback, which checks whether the OpenCV manager is installed. We need the OpenCV manager app to be installed on the device because it has all of the OpenCV functions defined. If we do not wish to use the OpenCV manager, we can have the functions present natively, but the APK size then increases significantly. If the OpenCV manager is not present, the app redirects the user to the Play Store to download it. The function call in onResume loads OpenCV for use.
Next we will add a button to activity_home.xml:
Then, in HomeActivity.java, we will instantiate this button, and set an onClickListener to this button:
Button bMean = (Button)findViewById(R.id.bMean);
bMean.setOnClickListener(new View.OnClickListener() {
    @Override
    public void onClick(View v) {
        Intent i = new Intent(getApplicationContext(), MainActivity.class);
        i.putExtra("ACTION_MODE", MEAN_BLUR);
        startActivity(i);
    }
});
In the preceding code, MEAN_BLUR is a constant with value 1 that specifies the type of operation that we want to perform.
Here we have added an extra to the intent that starts the activity. This lets MainActivity tell which operation it should perform.
Open activity_main.xml. Replace everything with this code snippet. This snippet adds two ImageView items: one for the original image and one for the processed image:
We need to programmatically link these ImageView items to the ImageView items in Java in our MainActivity.java :
private final int SELECT_PHOTO = 1;
private ImageView ivImage, ivImageProcessed;
Mat src;
static int ACTION_MODE = 0;

@Override
protected void onCreate(Bundle savedInstanceState) {
    // Android specific code
    ivImage = (ImageView)findViewById(R.id.ivImage);
    ivImageProcessed = (ImageView)findViewById(R.id.ivImageProcessed);
    Intent intent = getIntent();
    if(intent.hasExtra("ACTION_MODE")) {
        // Read the operation requested by HomeActivity
        ACTION_MODE = intent.getIntExtra("ACTION_MODE", 0);
    }
}
Here, the Mat and ImageViews have been made members of the class so that we can use them in other functions, without passing them as parameters. We will use the ACTION_MODE variable to identify the required operation to be performed.
Now we will add the code to load an image from the gallery. For this, we will use the menu button we created earlier. We will load the menu_main.xml file, when you click on the menu button:
@Override
public boolean onCreateOptionsMenu(Menu menu) {
    // Inflate menu_main.xml so that its items appear in the action bar
    getMenuInflater().inflate(R.menu.menu_main, menu);
    return true;
}
Then we will add the listener that will perform the desired action when an action item is selected. We will use Intent.ACTION_PICK to get an image from the gallery:
@Override
public boolean onOptionsItemSelected(MenuItem item) {
    int id = item.getItemId();
    if (id == R.id.action_load_image) {
        Intent photoPickerIntent = new Intent(Intent.ACTION_PICK);
        photoPickerIntent.setType("image/*");
        startActivityForResult(photoPickerIntent, SELECT_PHOTO);
        return true;
    }
    return super.onOptionsItemSelected(item);
}
As you can see, we have used startActivityForResult(). When the user has picked an image, the gallery activity finishes and control returns to our activity, where onActivityResult() is called with the selected image. We will use this callback to get the Bitmap and convert it to an OpenCV Mat. Add the following code to onActivityResult():
switch(requestCode) {
    case SELECT_PHOTO:
        if(resultCode == RESULT_OK) {
            try {
                //Code to load image into a Bitmap and convert it to a Mat for processing.
                final Uri imageUri = imageReturnedIntent.getData();
                final InputStream imageStream = getContentResolver().openInputStream(imageUri);
                final Bitmap selectedImage = BitmapFactory.decodeStream(imageStream);
                src = new Mat(selectedImage.getHeight(), selectedImage.getWidth(), CvType.CV_8UC4);
                Utils.bitmapToMat(selectedImage, src);
                switch (ACTION_MODE) {
                    //Add different cases here depending on the required operation
                }
                //Code to convert Mat to Bitmap to load in an ImageView. Also load original image in imageView
            } catch (FileNotFoundException e) {
                e.printStackTrace();
            }
        }
        break;
}
To apply mean blur to an image, we use the OpenCV provided function blur(). We have used a 3 x 3 kernel for this purpose:
case HomeActivity.MEAN_BLUR: Imgproc.blur(src, src, new Size(3,3)); break;
Now we will set this image in an ImageView to see the results of the operation:
Bitmap processedImage = Bitmap.createBitmap(src.cols(), src.rows(), Bitmap.Config.ARGB_8888);
Utils.matToBitmap(src, processedImage);
ivImage.setImageBitmap(selectedImage);
ivImageProcessed.setImageBitmap(processedImage);
Original Image (Left) and Image after applying Mean Blur (Right)
The Gaussian blur is the most commonly used method of blurring. The Gaussian kernel is obtained using the Gaussian function given as follows:
The Gaussian Function in one and two dimensions
The anchor pixel is considered to be at (0, 0). As we can see, the pixels closer to the anchor pixel are given a higher weightage than those further away from it. This is generally the ideal scenario, as the nearby pixels should influence the result of a particular pixel more than those further away. The Gaussian kernels of size 3, 5, and 7 are shown in the following figure (image of 'Gaussian kernels' taken from http://www1.adept.com/main/KE/DATA/ACE/AdeptSight_User/ImageProcessing_Operations.html):
These are the Gaussian kernels of size 3 x 3, 5 x 5 and 7 x 7.
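To make the weighting concrete, the following plain Java sketch builds a Gaussian kernel by sampling the two-dimensional Gaussian function, exp(-(x^2 + y^2) / (2 * sigma^2)), on a grid centered on the anchor. The class `GaussianKernelSketch` and the explicit `sigma` parameter are assumptions made for this example. The constant factor in front of the exponential is omitted because the weights are normalized to sum to 1 anyway, and the sketch does not try to reproduce how OpenCV picks a sigma when 0 is passed.

```java
// Hypothetical sketch: builds a normalized size x size Gaussian kernel.
class GaussianKernelSketch {
    static double[][] kernel(int size, double sigma) {
        int r = size / 2; // the anchor sits at offset (0, 0), the grid center
        double[][] k = new double[size][size];
        double sum = 0;
        for (int y = -r; y <= r; y++) {
            for (int x = -r; x <= r; x++) {
                // Weight falls off with squared distance from the anchor.
                double v = Math.exp(-(x * x + y * y) / (2 * sigma * sigma));
                k[y + r][x + r] = v;
                sum += v;
            }
        }
        // Normalize so that blurring does not change overall brightness.
        for (double[] row : k) {
            for (int i = 0; i < size; i++) {
                row[i] /= sum;
            }
        }
        return k;
    }

    public static void main(String[] args) {
        for (double[] row : kernel(3, 1.0)) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

The center weight is the largest, the four edge neighbors are smaller, and the corners are smallest, which matches the shape of the kernels in the figure.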
To use the Gaussian blur in your application, OpenCV provides a built-in function called GaussianBlur. We will use this and get the following resulting image. We will add a new case to the same switch block we used earlier. For this code, declare a constant GAUSSIAN_BLUR with value 2:
case HomeActivity.GAUSSIAN_BLUR: Imgproc.GaussianBlur(src, src, new Size(3,3), 0); break;
Image after applying Gaussian blur on the original image
One of the common types of noise present in images is called salt-and-pepper noise. In this kind of noise, sparsely occurring black and white pixels are distributed over the image. To remove this type of noise, we use median blur. In this kind of blur, we arrange the pixels covered by our kernel in ascending/descending order, and set the value of the middle element as the final value of the anchor pixel. The advantage of using this type of filtering is that salt-and-pepper noise is sparsely occurring, so within the area covered by the kernel, the number of noise pixels is smaller than the number of useful pixels. Once the values are sorted, the extreme black and white values end up at the ends of the list, and the middle element comes from the useful pixels. Salt-and-pepper noise looks as shown in the following image:
Example of salt-and-pepper noise
To apply median blur in OpenCV, we use the built-in function medianBlur. As in the previous cases, we have to add a button and add the OnClickListener functions. We will add another case condition for this operation:
case HomeActivity.MEDIAN_BLUR: Imgproc.medianBlur(src, src, 3); break;
Resulting image after applying median blur
Median blur does not use convolution.
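The sort-and-pick-the-middle step can be sketched in plain Java as follows. This is an illustration only: `MedianBlurSketch` is a hypothetical class, the image is assumed to be a single-channel array of 0-255 integers, and pixels beyond the border are assumed to repeat the nearest edge pixel.

```java
// Hypothetical sketch of a 3 x 3 median blur on a 1-channel image.
class MedianBlurSketch {
    static int[][] median3x3(int[][] src) {
        int rows = src.length, cols = src[0].length;
        int[][] dst = new int[rows][cols];
        int[] window = new int[9];
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                int n = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        int yy = Math.min(rows - 1, Math.max(0, y + dy));
                        int xx = Math.min(cols - 1, Math.max(0, x + dx));
                        window[n++] = src[yy][xx];
                    }
                }
                // Sorting pushes salt (255) and pepper (0) values to the ends.
                java.util.Arrays.sort(window);
                dst[y][x] = window[4]; // middle of the 9 sorted values
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        int[][] noisy = {{0, 100, 100}, {100, 255, 100}, {100, 100, 100}};
        System.out.println(java.util.Arrays.deepToString(median3x3(noisy)));
    }
}
```

Because the output is selected from the sorted neighborhood rather than computed as a weighted sum, no kernel of coefficients is involved.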
We have seen how different types of kernels affect the image. What if we want to create our own kernels for different applications that aren't natively offered by OpenCV? In this section, we will see how we can achieve just that. We will try to form a sharper image from a given input.
Sharpening can be thought of as a linear filtering operation where the anchor pixel has a high weightage and the surrounding pixels have a low weightage. A kernel satisfying this constraint is shown in the following table:
 0  -1   0
-1   5  -1
 0  -1   0
We will use this kernel to perform the convolution on our image:
case HomeActivity.SHARPEN:
    Mat kernel = new Mat(3,3,CvType.CV_16SC1);
    kernel.put(0, 0, 0, -1, 0, -1, 5, -1, 0, -1, 0);
Here we have given the type of the kernel as 16SC1. This means that each element of our kernel contains a 16-bit signed integer (16S) and the kernel has 1 channel (C1). A signed type is needed because the kernel contains negative values.
Now we will use the filter2D() function, which performs the actual convolution when given the input image and a kernel. We will show the image in an ImageView. We will add another case to the switch block created earlier:
    Imgproc.filter2D(src, src, src.depth(), kernel);
    break;
Original image (left) and sharpened image (right)
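The effect of this particular kernel can be checked by hand with a plain Java sketch. `SharpenSketch` is a hypothetical class written for this example. It assumes a single-channel image of 0-255 integers, repeats edge pixels beyond the border, and clamps results to the 0-255 range in the way an 8-bit output would saturate.

```java
// Hypothetical sketch: applies the 3 x 3 sharpening kernel from the text.
class SharpenSketch {
    static final int[][] KERNEL = {{0, -1, 0}, {-1, 5, -1}, {0, -1, 0}};

    static int[][] sharpen(int[][] src) {
        int rows = src.length, cols = src[0].length;
        int[][] dst = new int[rows][cols];
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                int sum = 0;
                for (int ky = -1; ky <= 1; ky++) {
                    for (int kx = -1; kx <= 1; kx++) {
                        int yy = Math.min(rows - 1, Math.max(0, y + ky));
                        int xx = Math.min(cols - 1, Math.max(0, x + kx));
                        sum += KERNEL[ky + 1][kx + 1] * src[yy][xx];
                    }
                }
                // Keep the result inside the valid 8-bit range.
                dst[y][x] = Math.min(255, Math.max(0, sum));
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        int[][] edge = {{50, 50, 150, 150}, {50, 50, 150, 150}, {50, 50, 150, 150}};
        System.out.println(java.util.Arrays.deepToString(sharpen(edge)));
    }
}
```

Since the kernel values sum to 1, flat regions are left unchanged, while the pixels on either side of an edge are pushed further apart, which is what makes the edge look sharper.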
Morphological operations are a set of operations that process an image based on the features of the image and a structuring element. These generally work on binary or grayscale images. We will take a look at some basic morphological operations before moving on to more advanced ones.
Dilation is a method by which the bright regions of an image are expanded. To achieve this, we take a kernel of the desired size and replace the anchor pixel with the maximum value overlapped by the kernel. Dilation can be used to merge objects that might have been broken off.
A binary image (left) and the result after applying dilation (right)
To apply this operation, we use the dilate() function. We need to use a kernel to perform dilation. We use the getStructuringElement() OpenCV function to get the required kernel.
OpenCV provides MORPH_RECT, MORPH_CROSS, and MORPH_ELLIPSE as options to create our required kernels:
case HomeActivity.DILATE:
    Mat kernelDilate = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(3, 3));
    Imgproc.dilate(src, src, kernelDilate);
    break;
Original image (left) and dilated image (right)
If we use a rectangular structuring element, the image grows in the shape of a rectangle. Similarly, if we use an elliptical structuring element, the image grows in the shape of an ellipse.
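The maximum-under-the-kernel rule can be sketched in plain Java. `DilateSketch` is a hypothetical class for illustration. It assumes a single-channel image, a square (rectangular) structuring element with the anchor at its center, and simply skips positions that fall outside the image.

```java
// Hypothetical sketch of dilation with a size x size rectangular element.
class DilateSketch {
    static int[][] dilate(int[][] src, int size) {
        int rows = src.length, cols = src[0].length, r = size / 2;
        int[][] dst = new int[rows][cols];
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                int max = src[y][x];
                for (int dy = -r; dy <= r; dy++) {
                    for (int dx = -r; dx <= r; dx++) {
                        int yy = y + dy, xx = x + dx;
                        if (yy < 0 || yy >= rows || xx < 0 || xx >= cols) {
                            continue; // outside the image: ignore
                        }
                        max = Math.max(max, src[yy][xx]);
                    }
                }
                dst[y][x] = max; // brightest value overlapped by the element
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        int[][] img = new int[5][5];
        img[2][2] = 255; // a single bright pixel
        for (int[] row : dilate(img, 3)) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

With a 3 x 3 rectangular element, one bright pixel grows into a 3 x 3 bright square, which is the rectangular growth described above.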
Similarly, erosion is a method by which the dark regions of an image are expanded. To achieve this, we take a kernel of the desired size and replace the anchor pixel by the minimum value overlapped by the kernel. Erosion can be used to remove the noise from images.
A binary image (left) and the result after applying erosion (right)
To apply this operation, we use the erode() function:
case HomeActivity.ERODE:
    Mat kernelErode = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(5, 5));
    Imgproc.erode(src, src, kernelErode);
    break;
Original image (left) and eroded image (right)
Erosion and dilation are not inverse operations.
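A small plain Java sketch shows both the minimum rule and why the two operations do not undo each other. `ErodeSketch` is a hypothetical class for this example. It assumes a single-channel image and a square structuring element, and skips positions outside the image.

```java
// Hypothetical sketch: erosion and dilation share one neighborhood scan.
class ErodeSketch {
    static int[][] morph(int[][] src, int size, boolean takeMax) {
        int rows = src.length, cols = src[0].length, r = size / 2;
        int[][] dst = new int[rows][cols];
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                int best = src[y][x];
                for (int dy = -r; dy <= r; dy++) {
                    for (int dx = -r; dx <= r; dx++) {
                        int yy = y + dy, xx = x + dx;
                        if (yy < 0 || yy >= rows || xx < 0 || xx >= cols) {
                            continue; // outside the image: ignore
                        }
                        best = takeMax ? Math.max(best, src[yy][xx])
                                       : Math.min(best, src[yy][xx]);
                    }
                }
                dst[y][x] = best;
            }
        }
        return dst;
    }

    // Erosion keeps the minimum, so dark regions grow.
    static int[][] erode(int[][] src, int size) {
        return morph(src, size, false);
    }

    // Dilation keeps the maximum, so bright regions grow.
    static int[][] dilate(int[][] src, int size) {
        return morph(src, size, true);
    }

    public static void main(String[] args) {
        int[][] img = new int[7][7];
        for (int y = 2; y <= 4; y++) {
            for (int x = 2; x <= 4; x++) {
                img[y][x] = 255; // a 3 x 3 bright object
            }
        }
        img[0][6] = 255; // an isolated noise pixel
        for (int[] row : dilate(erode(img, 3), 3)) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

Eroding removes the isolated pixel and shrinks the object to a single pixel. Dilating afterwards grows the object back, but the noise pixel is gone for good, so the original image is not recovered.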
Thresholding is the method of segmenting out sections of an image that we would like to analyze. The value of each pixel is compared to a predefined threshold value and based on this result, we modify the value of the pixel. OpenCV provides five types of thresholding operations.
To perform thresholding, we will use the following code as a template and change the parameters as per the kind of thresholding required. We need to replace THRESH_CONSTANT with the constant for the required method of thresholding:
case HomeActivity.THRESHOLD: Imgproc.threshold(src, src, 100, 255, Imgproc.THRESH_CONSTANT); break;
Here, 100 is the threshold value and 255 is the maximum value (the value of pure white).
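The constants are listed in the following table:
Thresholding Method Name | Constant
Binary threshold | Imgproc.THRESH_BINARY
Threshold to zero | Imgproc.THRESH_TOZERO
Truncate | Imgproc.THRESH_TRUNC
Binary threshold, inverted | Imgproc.THRESH_BINARY_INV
Threshold to zero, inverted | Imgproc.THRESH_TOZERO_INV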
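The per-pixel rule behind each of the five methods can be sketched in plain Java. `ThresholdSketch` and its `Type` enum are hypothetical names for this example and are not the OpenCV constants. The sketch assumes that a pixel counts as above the threshold only when it is strictly greater than it.

```java
// Hypothetical sketch of the five thresholding rules applied to one pixel.
class ThresholdSketch {
    enum Type { BINARY, BINARY_INV, TRUNC, TOZERO, TOZERO_INV }

    static int apply(int value, int thresh, int maxVal, Type type) {
        boolean above = value > thresh;
        switch (type) {
            case BINARY:     return above ? maxVal : 0;     // white above, black below
            case BINARY_INV: return above ? 0 : maxVal;     // the opposite
            case TRUNC:      return above ? thresh : value; // cap at the threshold
            case TOZERO:     return above ? value : 0;      // keep only bright pixels
            case TOZERO_INV: return above ? 0 : value;      // keep only dark pixels
            default: throw new IllegalArgumentException("Unknown type: " + type);
        }
    }

    public static void main(String[] args) {
        for (Type t : Type.values()) {
            System.out.println(t + ": 150 -> " + apply(150, 100, 255, t)
                    + ", 50 -> " + apply(50, 100, 255, t));
        }
    }
}
```

Running the main method with a threshold of 100 and a maximum of 255, as in the template above, shows how a bright pixel (150) and a dark pixel (50) are treated by each method.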
Setting a global threshold value may not be the best option when performing segmentation. Lighting conditions affect the intensity of pixels. So, to overcome this limitation, we will try to calculate the threshold value for any pixel based on its neighboring pixels.
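We will use three parameters to calculate the adaptive threshold of an image: the adaptive method (here, ADAPTIVE_THRESH_GAUSSIAN_C), the block size of the neighborhood used for each pixel (here, 3), and a constant that is subtracted from the weighted mean of that neighborhood (here, 0):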
case HomeActivity.ADAPTIVE_THRESHOLD:
    Imgproc.cvtColor(src, src, Imgproc.COLOR_BGR2GRAY);
    Imgproc.adaptiveThreshold(src, src, 255, Imgproc.ADAPTIVE_THRESH_GAUSSIAN_C, Imgproc.THRESH_BINARY, 3, 0);
    break;
Original image (left) and image after applying Adaptive thresholding (right)
Here, the resulting image has a lot of noise present. This can be avoided by applying a blurring operation before applying adaptive thresholding, so as to smooth the image.
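The idea of a per-pixel threshold can be sketched in plain Java. Note that this sketch uses the simpler mean variant, where the threshold is the plain average of the neighborhood minus a constant, rather than the Gaussian-weighted variant used in the snippet above. `AdaptiveThresholdSketch` is a hypothetical class, the image is assumed to be single-channel, and pixels beyond the border are assumed to repeat the nearest edge pixel.

```java
// Hypothetical sketch of mean-based adaptive thresholding (binary output).
class AdaptiveThresholdSketch {
    static int[][] apply(int[][] src, int maxVal, int blockSize, double c) {
        int rows = src.length, cols = src[0].length, r = blockSize / 2;
        int[][] dst = new int[rows][cols];
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                double sum = 0;
                for (int dy = -r; dy <= r; dy++) {
                    for (int dx = -r; dx <= r; dx++) {
                        int yy = Math.min(rows - 1, Math.max(0, y + dy));
                        int xx = Math.min(cols - 1, Math.max(0, x + dx));
                        sum += src[yy][xx];
                    }
                }
                // The threshold is local: the neighborhood mean minus c.
                double localThreshold = sum / (blockSize * blockSize) - c;
                dst[y][x] = src[y][x] > localThreshold ? maxVal : 0;
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        int[][] dark = {{10, 10, 40, 10, 10}};
        int[][] bright = {{200, 200, 240, 200, 200}};
        System.out.println(java.util.Arrays.deepToString(apply(dark, 255, 3, 0)));
        System.out.println(java.util.Arrays.deepToString(apply(bright, 255, 3, 0)));
    }
}
```

In both the dark and the brightly lit row, only the pixel that stands out from its neighbors is marked. A single global threshold of 100 would instead mark none of the dark row and all of the bright one.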
In this chapter, we have learnt how to get started with using OpenCV in your Android project. Then we looked at different filters in image processing, especially linear filters, and how they can be implemented on an Android device. These filters will later form the basis of any computer vision application that you try to build. In the following chapters, we will look at more complex image filters, and also see how to extract information from the images in the form of edges, corners, and the like.