In the last post about OpenCV, we demonstrated the library’s power by solving the problem of detecting road lines. While that was an impressive example, we’ll now expand our knowledge by exploring other fundamental functions useful in many computer vision applications. We’ll do this by developing an application that will be able to count the money we have in some of our leftover coins.
Like last time we should start by reading our image, resizing it, and converting it to grayscale for ease of further manipulation. For demonstration purposes we’ll start with a perfect scan of a single coin.
import cv2 import numpy as np img = cv2.imread('opencv_2/photos/cent.jpg') c_img = cv2.resize(img, (0, 0), fx=0.4, fy=0.4) img = cv2.cvtColor(c_img, cv2.COLOR_BGR2GRAY) |
As with most computer vision problems, we’ll want to start our main processing with a filter. Last time, we described Gaussian and bilateral filters, so this time we can discuss the median filter. It is a non-linear filter (unlike the Gaussian filter) that replaces each pixel’s value with the median value of its neighborhood (generally within a window we specify beforehand). It’s very effective at preserving edges but can be computationally demanding, especially with larger kernels for median sorting operations. In practice, it excels at handling “salt-and-pepper” noise, which involves random occurrences of white and black pixels within the image. While it may not seem particularly useful for our scan of a coin, we’ll find its value in later stages. As usual, its usage in OpenCV is quite straightforward.
m_blur = cv2.medianBlur(src=img, ksize=5) |
Now, we can try to retrieve a contour from our coin. The easiest way to approach this problem is to use thresholding on our image. Because we converted the image to grayscale, thresholding turns off pixels below our threshold (setting their value to 0) and sets the others to a chosen value (usually 255). As before, it might be useful to use OpenCV’s track bars to fine-tune the threshold value for each specific use case of this method.
def trackbar_callback(value): print(f'Current value: {value}') pass cv2.namedWindow('test_threshold') cv2.createTrackbar('threshold', 'test_threshold', 0, 255, trackbar_callback) while True: key_code = cv2.waitKey(10) if key_code == 27: # escape key break threshold = cv2.getTrackbarPos('threshold', 'test_threshold') _, test_threshold = cv2.threshold(m_blur, threshold, 255, cv2.THRESH_BINARY) cv2.imshow('test_threshold', test_threshold) cv2.destroyAllWindows() |
Image source: https://pixabay.com/photos/cent-50-euro-coin-currency-europe-400248/
Now with a chosen threshold value, we can utilize the findContours function. This function takes a binary image (like the one we created, but with the values inverted so that our object is bright) and connects continuous points with the same color or intensity along the boundary of an object. What makes it particularly useful for more complex shapes is the ability to provide information about the hierarchy of the contours. Additionally, contours themselves can be crucial for shape and object detection in computer vision tasks because they are scale and orientation-independent. This enables us to retrieve distinct inner contours or, in our case, identify the outermost contour and draw it on the original image.
_, thresh = cv2.threshold(m_blur, 220, 255, cv2.THRESH_BINARY) negative_thresh = abs(255-thresh) contours, hierarchy = cv2.findContours(negative_thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE) cv2.drawContours(c_img, contours, -1, (0, 255, 0), 2) |
While the methods used in this approach can be highly effective in controlled environments, their usefulness diminishes in less ideal conditions, such as with a phone image we’ll use from now on. Instead of binary thresholding, we could use a fine-tuned Canny algorithm similar to the line detection example. This would allow us to use contour information to manually determine which shapes are close to being circles using some clever maths. Fortunately, OpenCV provides a solution that addresses these challenges with a single function called HoughCircles. This function simplifies the process and enhances accuracy, making it ideal for more variable conditions.
circles = cv2.HoughCircles(m_blur, cv2.HOUGH_GRADIENT, 1, minDist=90, param1=50, param2=75, minRadius=50, maxRadius=200) |
As a first step, the method uses Canny edge detection anyway, which can be fine-tuned with param1 and param2 as the thresholding with hysteresis values. Next, it employs an extension of the Hough Transform that we used for line detection. In this case, it uses the equation (x−a)² + (y−b)² = r² to represent circles in the parameter space. The minDist parameter describes the minimum distance between two circles, and minRadius and maxRadius define their respective minimum and maximum radius lengths. These parameters are crucial and should be fine-tuned for each application.
With the detected circle’s information, we can ensure their coordinates are integers and then check their radius to determine the coin’s value. Finally, we sum up our savings and visualize the results on the initial color image.
circles = np.uint16(np.around(circles)) savings = 0 for i in circles[0, :]: if i[2] > 96: colour = (0, 0, 255) savings += 5 cv2.circle(c_img, (i[0], i[1]), i[2], colour, 2) elif i[2] > 88: colour = (0, 255, 0) savings += 2 cv2.circle(c_img, (i[0], i[1]), i[2], colour, 2) elif i[2] > 80: colour = (255, 0, 0) savings += 1 cv2.circle(c_img, (i[0], i[1]), i[2], colour, 2) cv2.circle(c_img, (i[0], i[1]), 2, colour, 3) print("Total amount: " + str(savings) + " grosz") cv2.putText(c_img, "Total amount: " + str(savings), (10, 875), fontFace=cv2.FONT_HERSHEY_SIMPLEX, fontScale=1, colour=(0, 0, 0), thickness=2) cv2.imshow('circles', c_img) |