OpenCV Python – Digit Recognition with KNN
Digit recognition is one of the most popular machine learning applications in computer vision. It involves identifying handwritten or printed digits from images.
OpenCV provides a built-in implementation of the K-Nearest Neighbors (KNN) algorithm, making it easy to build a digit recognition system.
In this tutorial, you'll learn how to recognize handwritten digits using OpenCV Python and KNN.
1. What is Digit Recognition?
Digit recognition is the process of automatically identifying numerical digits such as:
0 1 2 3 4 5 6 7 8 9
Applications include:
- Postal code recognition
- Bank cheque processing
- OCR systems
- Document digitization
- Number plate recognition
- Educational software
2. What is KNN?
K-Nearest Neighbors (KNN) is a supervised machine learning algorithm.
It classifies data by:
- Finding the nearest training samples
- Looking at their labels
- Predicting the most common label
Example:
Neighbors:
3, 3, 3, 8, 3
Prediction:
3
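The voting step above can be sketched in a few lines of plain Python. This is only an illustration of the idea; the function name `majority_vote` is a hypothetical helper and not part of OpenCV.

```python
from collections import Counter

def majority_vote(neighbor_labels):
    """Return the most common label among the nearest neighbors."""
    # most_common(1) returns a list with one (label, count) pair
    label, _count = Counter(neighbor_labels).most_common(1)[0]
    return label

neighbors = [3, 3, 3, 8, 3]   # labels of the 5 nearest training samples
prediction = majority_vote(neighbors)
print(prediction)  # 3
```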
3. Import Required Libraries
import cv2
import numpy as np
4. Understanding the Dataset
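OpenCV ships a sample image of handwritten digits, digits.png, in the samples/data directory of its source distribution.
The dataset image contains:
- 5000 digits, 500 samples of each digit 0–9
- A grid of 50 rows and 100 columns
- Five consecutive rows per digit, starting with 0
Each digit occupies a cell of:
20 × 20 pixels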
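The layout numbers can be checked with simple arithmetic. The sketch below assumes the digits are stored in order, with the same number of grid rows per digit, which is what the labels created later in this tutorial rely on. The constant names and `digit_for_row` are illustrative only.

```python
CELL = 20              # each digit cell is 20 x 20 pixels
ROWS, COLS = 50, 100   # grid of cells in the dataset image
DIGITS = 10

total_digits = ROWS * COLS                  # 5000 cells
image_height = ROWS * CELL                  # 1000 pixels
image_width = COLS * CELL                   # 2000 pixels
rows_per_digit = ROWS // DIGITS             # 5 grid rows per digit
samples_per_digit = rows_per_digit * COLS   # 500 samples per digit

def digit_for_row(row):
    """Digit stored in a given grid row (0-based), assuming ordered rows."""
    return row // rows_per_digit

print(total_digits, image_width, image_height, samples_per_digit)
print(digit_for_row(0), digit_for_row(4), digit_for_row(5), digit_for_row(49))
```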
5. Load the Digits Dataset
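# Load as grayscale (flag 0); imread returns None if the file cannot be read
img = cv2.imread('digits.png', 0)
if img is None:
    raise FileNotFoundError("digits.png not found in the working directory")
cv2.imshow("Digits Dataset", img)
cv2.waitKey(0)  # wait for a key press before closing the window
cv2.destroyAllWindows()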
6. Split Dataset into Individual Digits
cells = [np.hsplit(row,100) for row in np.vsplit(img,50)]
x = np.array(cells)
Check the resulting shape:
print(x.shape)
Output:
(50, 100, 20, 20)
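To see what the split does without needing digits.png, here is a small sketch on a synthetic array. The grid and cell sizes are made-up stand-ins (4 × 6 cells of 2 × 2 pixels), and the reshape-based variant is shown only as an equivalent alternative, not as the tutorial's method.

```python
import numpy as np

# Synthetic stand-in for the dataset image: 4 x 6 grid of 2 x 2 cells
rows, cols, cell = 4, 6, 2
img = np.arange(rows * cell * cols * cell, dtype=np.uint8)
img = img.reshape(rows * cell, cols * cell)      # shape (8, 12)

# Same pattern as the tutorial: split into rows, then each row into cells
cells = [np.hsplit(r, cols) for r in np.vsplit(img, rows)]
x = np.array(cells)
print(x.shape)  # (4, 6, 2, 2)

# Equivalent split using reshape and an axis swap
x2 = img.reshape(rows, cell, cols, cell).swapaxes(1, 2)
print(np.array_equal(x, x2))  # True
```

The first two axes index the grid position, and the last two index the pixels inside a cell, so `x[0, 0]` is the top-left cell of the image.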
7. Prepare Training and Testing Data
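Flatten each 20 × 20 image into a 400-element feature vector. The left 50 columns of the grid become the training set and the right 50 columns the test set, giving 2500 samples each.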
train = x[:,:50].reshape(-1,400).astype(np.float32)
test = x[:,50:100].reshape(-1,400).astype(np.float32)
Explanation:
20 × 20 = 400 pixels
Each digit becomes:
[400 features]
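The sketch below checks the resulting shapes and the sample order on a synthetic array in which every pixel of a cell holds that cell's digit. The synthetic `x` is an assumption standing in for the real dataset, with five grid rows per digit.

```python
import numpy as np

# Synthetic stand-in for x: each cell is filled with its digit (row // 5)
row_digit = np.arange(50) // 5
x = np.broadcast_to(row_digit[:, None, None, None], (50, 100, 20, 20))
x = x.astype(np.uint8)

train = x[:, :50].reshape(-1, 400).astype(np.float32)
test = x[:, 50:100].reshape(-1, 400).astype(np.float32)
print(train.shape, test.shape)  # (2500, 400) (2500, 400)

# The first pixel of every flattened sample reveals its digit
order = train[:, 0].astype(int)
print(np.array_equal(order, np.arange(2500) // 250))  # True
```

Because the reshape keeps row-major order, the first 250 flattened samples are all 0s, the next 250 are all 1s, and so on.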
8. Create Labels
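Generate labels for digits 0–9. Each half of the grid holds 250 samples per digit, stored in order, so each label is repeated 250 times.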
k = np.arange(10)
train_labels = np.repeat(k,250)[:,np.newaxis]
test_labels = np.repeat(k,250)[:,np.newaxis]
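A quick way to inspect the label array is shown below. The comparison with `np.tile` is added here only to illustrate why `np.repeat` is the right call for samples stored in digit order.

```python
import numpy as np

k = np.arange(10)
labels = np.repeat(k, 250)[:, np.newaxis]   # column vector, one label per sample

print(labels.shape)                          # (2500, 1)
print(labels[0, 0], labels[249, 0], labels[250, 0], labels[-1, 0])  # 0 0 1 9
print(np.bincount(labels.ravel()))           # 250 samples of every digit

# np.tile would cycle through 0..9 instead, which does not match the sample order
wrong = np.tile(k, 250)
print(wrong[:12])                            # [0 1 2 3 4 5 6 7 8 9 0 1]
```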
9. Create KNN Model
OpenCV provides the KNN classifier.
knn = cv2.ml.KNearest_create()
10. Train the KNN Model
knn.train(
train,
cv2.ml.ROW_SAMPLE,
train_labels
)
Training is now complete. For KNN, training essentially means storing the samples; the real work happens at prediction time.
11. Test the Model
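Use K = 5 neighbors. findNearest returns the predicted labels (result), the labels of the K nearest neighbors (neighbours), and their distances (dist).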
ret, result, neighbours, dist = knn.findNearest(
test,
k=5
)
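To make the prediction step less of a black box, here is a NumPy-only sketch of a brute-force nearest-neighbor search with majority voting. It is a conceptual illustration, not OpenCV's implementation; the function `knn_predict` and the tiny 2-D data set are made up for this example.

```python
import numpy as np

def knn_predict(train, train_labels, test, k=5):
    """Brute-force KNN: returns (predictions, neighbor labels, squared distances)."""
    # Squared Euclidean distance between every test and training sample
    diff = test[:, None, :] - train[None, :, :]
    d2 = np.sum(diff * diff, axis=2)                     # shape (n_test, n_train)
    idx = np.argsort(d2, axis=1, kind="stable")[:, :k]   # k nearest per test sample
    neighbours = train_labels.ravel()[idx]               # shape (n_test, k)
    dist = np.take_along_axis(d2, idx, axis=1)
    # Majority vote over each row of neighbor labels
    result = np.array([np.bincount(row).argmax() for row in neighbours])
    return result[:, None], neighbours, dist

# Tiny synthetic data: two clusters in 2-D (not digit images)
train = np.array([[0, 0], [0, 1], [1, 0], [9, 9], [9, 8], [8, 9]], dtype=np.float32)
train_labels = np.array([0, 0, 0, 1, 1, 1])[:, None]
test = np.array([[1, 1], [8, 8]], dtype=np.float32)

result, neighbours, dist = knn_predict(train, train_labels, test, k=3)
print(result.ravel())  # [0 1]
```

Each test point is compared with every training point, which is why prediction cost grows with the size of the training set.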
12. Calculate Accuracy
matches = result == test_labels
correct = np.count_nonzero(matches)
accuracy = correct * 100.0 / result.size
print("Accuracy:", accuracy)
Example Output:
Accuracy: 91.2
The exact figure depends on the value of K and on how the data is split; the official OpenCV tutorial reports about 91% for this setup.
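A single accuracy number hides which digits are confused most often. The sketch below computes per-digit accuracy with the same counting approach; the `result` array here is synthetic (25 zeros deliberately marked as 6) and only stands in for real classifier output.

```python
import numpy as np

# Synthetic stand-ins with the same shapes as `result` and `test_labels`
test_labels = np.repeat(np.arange(10), 250)[:, np.newaxis]
result = test_labels.astype(np.float32)
result[:25] = 6.0   # pretend 25 samples of digit 0 were misread as 6

matches = (result == test_labels).ravel()
labels = test_labels.ravel()

overall = np.count_nonzero(matches) * 100.0 / matches.size
per_digit = {}
for d in range(10):
    mask = labels == d
    per_digit[d] = np.count_nonzero(matches[mask]) * 100.0 / np.count_nonzero(mask)

print(overall)        # 99.0
print(per_digit[0])   # 90.0
print(per_digit[6])   # 100.0
```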
13. Recognize a New Digit
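Load a new handwritten digit image. The training digits are white on a black background, so a dark-on-light scan should be inverted first (for example with cv2.bitwise_not).
digit = cv2.imread("digit.png", 0)
if digit is None:
    raise FileNotFoundError("digit.png not found")
digit = cv2.resize(digit, (20,20))  # match the 20 × 20 training cells
sample = digit.reshape(-1,400).astype(np.float32)  # shape (1, 400)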
Predict:
ret, result, neighbours, dist = knn.findNearest(
sample,
k=5
)
print("Predicted Digit:", int(result[0][0]))
Example output (depends on the input image):
Predicted Digit: 7
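A new image only classifies well if it looks like the training cells. The sketch below shows one possible way to match polarity and shape using NumPy alone. The helper `to_feature_vector` and its brightness threshold are assumptions made for illustration; real scans usually also need cropping and centering.

```python
import numpy as np

def to_feature_vector(digit_20x20):
    """Hypothetical helper: match training polarity and flatten to (1, 400)."""
    digit = np.asarray(digit_20x20, dtype=np.uint8)
    if digit.shape != (20, 20):
        raise ValueError("expected a 20x20 grayscale image")
    # Heuristic assumption: a mostly bright image is dark ink on a light page
    if digit.mean() > 127:
        digit = 255 - digit
    return digit.reshape(1, 400).astype(np.float32)

# Synthetic "scan": white page with a dark vertical stroke
page = np.full((20, 20), 255, dtype=np.uint8)
page[4:16, 9:11] = 0

sample = to_feature_vector(page)
print(sample.shape, sample.max(), sample.min())
```

After the inversion, the stroke is bright and the background is dark, as in the training data.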
14. Understanding K Value
The K value determines how many neighbors vote on each prediction.
Typical behavior:
| K Value | Behavior |
|---|---|
| 1 | Uses only the closest sample; most sensitive to noise |
| 3 | Common choice |
| 5 | Balanced |
| 7+ | More stable, but can blur boundaries between similar digits |
Typical choice:
k = 3 or 5
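The effect of K on noise can be illustrated with a made-up neighbor list in which the single closest sample is mislabeled. The list and the `predict` helper are hypothetical and exist only for this sketch.

```python
from collections import Counter

# Labels of the nearest training samples, ordered from closest to farthest.
# The closest one is assumed to be a noisy or mislabeled sample.
sorted_neighbor_labels = [8, 3, 3, 3, 8, 3, 3]

def predict(labels_by_distance, k):
    """Majority vote over the k closest labels."""
    return Counter(labels_by_distance[:k]).most_common(1)[0][0]

for k in (1, 3, 5, 7):
    print(k, predict(sorted_neighbor_labels, k))
# 1 8
# 3 3
# 5 3
# 7 3
```

With K = 1 the noisy sample decides the outcome alone; from K = 3 upward it is outvoted.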
15. Advantages of KNN
- Simple to implement: there is no complex training process; the model stores the samples.
- Good accuracy: works well on small datasets.
- Easy to understand: ideal for beginners learning machine learning.
- Flexible: can classify any data that can be expressed as numeric feature vectors.
16. Limitations of KNN
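- Slow on large datasets: every prediction requires a distance calculation against the stored training samples.
- Memory intensive: all training samples must be kept in memory.
- Sensitive to noise: mislabeled or distorted samples can change predictions, especially for small K.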
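A rough cost estimate for this tutorial's data makes the first two points concrete. The numbers assume a brute-force search over float32 features; the variable names are illustrative.

```python
n_train, n_test, n_features = 2500, 2500, 400
bytes_per_float32 = 4

# Memory: every training sample is kept as 400 float32 values
stored_bytes = n_train * n_features * bytes_per_float32

# Time: each test sample is compared with every training sample
distances_per_query = n_train
total_distances = n_test * n_train
feature_ops = total_distances * n_features   # per-pixel differences overall

print(stored_bytes)         # 4000000
print(distances_per_query)  # 2500
print(total_distances)      # 6250000
print(feature_ops)          # 2500000000
```

Doubling the training set doubles both the stored bytes and the work per prediction.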
17. Real-World Applications
Digit recognition, whether with KNN or with more advanced models, is used in:
- OCR systems
- Bank cheque readers
- Handwritten form processing
- Educational apps
- Automated data entry
- Postal sorting systems
18. Complete Example
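import cv2
import numpy as np
# Load the dataset as grayscale; imread returns None if the file is missing
img = cv2.imread('digits.png',0)
if img is None:
    raise FileNotFoundError("digits.png not found")
# Split into a 50 x 100 grid of 20 x 20 cells
cells = [np.hsplit(row,100) for row in np.vsplit(img,50)]
x = np.array(cells)
# Left half for training, right half for testing, one 400-pixel row per digit
train = x[:,:50].reshape(-1,400).astype(np.float32)
test = x[:,50:100].reshape(-1,400).astype(np.float32)
# 250 labels per digit, in the same order as the samples
k = np.arange(10)
train_labels = np.repeat(k,250)[:,np.newaxis]
test_labels = np.repeat(k,250)[:,np.newaxis]
# Create and train the classifier
knn = cv2.ml.KNearest_create()
knn.train(
train,
cv2.ml.ROW_SAMPLE,
train_labels
)
# Classify the test set using the 5 nearest neighbors
ret, result, neighbours, dist = knn.findNearest(
test,
k=5
)
# Compare predictions with the true labels
matches = result == test_labels
correct = np.count_nonzero(matches)
accuracy = correct * 100.0 / result.size
print("Accuracy:", accuracy)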
19. Best Practices
- Normalize image sizes before training
- Use grayscale images
- Choose appropriate K values
- Remove noisy samples
- Evaluate accuracy using test data
- Use larger datasets for better performance
Conclusion
Digit recognition with KNN in OpenCV Python is a good introduction to machine learning and computer vision. By combining simple image preprocessing with the K-Nearest Neighbors algorithm, you can build a classifier that recognizes handwritten digits with about 91% accuracy on the sample dataset.
After mastering KNN digit recognition, you can explore more advanced techniques such as Support Vector Machines (SVM), Artificial Neural Networks (ANN), and Deep Learning models like CNNs for even higher accuracy.

