Object Detection Using OpenCV and Transfer Learning.
Environment setup/Project Requirements:
- Python
- The OpenCV and Matplotlib libraries, installed
- Jupyter Notebook, PyCharm, or Atom
Note: Run this project on a local machine. Google Colaboratory and other online Python interpreters run on remote servers, so cv2.VideoCapture cannot reach your webcam there, and Step 7 needs it.
Workflow:
- Importing the transfer learning model
- Importing dnn_DetectionModel
- Looping through the coco dataset
- Initializing the threshold
- Testing the model on an image
- Looping through the labels
- Implementing on video
Step 1: Importing the Transfer learning model
First, we assign the transfer learning model's configuration file and frozen graph (the trained weights) to variables.
config_file = "ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt"
frozen_model = "frozen_inference_graph.pb"
Step 2: Importing dnn_DetectionModel
OpenCV provides the dnn_DetectionModel class. After importing cv2, we construct the model from the frozen graph and the configuration file defined in Step 1.
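As a rough sketch of this step, the model could be built as below. The helper name load_detector and the file check are my own additions for illustration, not part of the original code. The only OpenCV call is cv2.dnn_DetectionModel, which takes the weights file first and the config file second.

```python
from pathlib import Path


def load_detector(frozen_model, config_file):
    """Build a cv2.dnn_DetectionModel after checking that both files exist.

    load_detector is a hypothetical helper name used only in this sketch.
    """
    missing = [str(p) for p in (frozen_model, config_file) if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError(f"Model file(s) not found: {', '.join(missing)}")
    # Imported here so that a missing file is reported before OpenCV is loaded.
    import cv2
    # Weights (frozen graph) come first, then the text graph description.
    return cv2.dnn_DetectionModel(frozen_model, config_file)
```

With the variables from Step 1, this would be called as model = load_detector(frozen_model, config_file).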
Step 3: Looping through the coco dataset
The COCO class labels we need to download come as a plain text file, so there is no need for the pandas library. Python's built-in file handling is enough to read the file and loop through its lines.
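To make the file-handling idea concrete, here is a small sketch that reads labels from a text file with one label per line. The helper read_labels and the three-line sample file are made up for this demo. The real coco.names file is much longer.

```python
import os
import tempfile


def read_labels(path):
    """Return one label per line of a plain-text names file."""
    with open(path, "rt") as f:
        return f.read().splitlines()


# Tiny stand-in for coco.names, written to a temporary file for the demo.
with tempfile.NamedTemporaryFile("wt", suffix=".names", delete=False) as tmp:
    tmp.write("person\nbicycle\ncar\n")
    sample_path = tmp.name

sample_labels = read_labels(sample_path)
os.remove(sample_path)  # clean up the temporary file
print(sample_labels)  # ['person', 'bicycle', 'car']
```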
Because we are using a pre-trained model, it expects a fixed input configuration, such as the size of the input frame. We set those parameters and have the model convert each BGR image to RGB.
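file_name = "coco.names"
# Read one class label per line from the plain-text names file
with open(file_name, 'rt') as fpt:
    classLabels = fpt.read().rstrip('\n').split('\n')
model.setInputSize(320, 320)               # input size the network expects
model.setInputScale(1 / 127.5)             # scale pixel values
model.setInputMean((127.5, 127.5, 127.5))  # assumed mean, so values land in roughly [-1, 1]
model.setInputSwapRB(True)                 # convert BGR (OpenCV's default) to RGB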
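To see what the scale of 1/127.5 does, the arithmetic can be worked through in plain Python. OpenCV's dnn preprocessing subtracts the mean and then multiplies by the scale. The mean of 127.5 is an assumption on my side (it is commonly paired with this scale), and normalize is a name invented for the demo.

```python
def normalize(pixel, scale=1 / 127.5, mean=127.5):
    """Apply (pixel - mean) * scale, as OpenCV's dnn preprocessing does."""
    return (pixel - mean) * scale


# The darkest, middle, and brightest 8-bit values map to about -1, 0 and 1.
for value in (0, 127.5, 255):
    print(value, "->", round(normalize(value), 3))
```

So the network receives inputs centred on zero rather than raw values between 0 and 255.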
Step 4: Initializing Threshold
To classify the objects in an image, we define a confidence threshold. A detection is kept and assigned to its class only if its confidence score is above that threshold.
In this case, I took a threshold of 0.5. Raising it gives fewer false detections, at the cost of possibly missing some objects.
classIndex, confidence, bbox = model.detect(img,confThreshold=0.5)
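The effect of the threshold can be illustrated without a model. The sketch below filters a made-up list of (label, score) pairs. It only shows the idea and is not how OpenCV implements confThreshold internally; keep_confident and the sample scores are invented for the demo.

```python
def keep_confident(detections, threshold=0.5):
    """Keep (label, score) pairs whose score exceeds the threshold."""
    return [(label, score) for label, score in detections if score > threshold]


sample = [("person", 0.91), ("dog", 0.48), ("car", 0.62)]
print(keep_confident(sample))                 # [('person', 0.91), ('car', 0.62)]
print(keep_confident(sample, threshold=0.7))  # [('person', 0.91)]
```

A higher threshold leaves only the detections the model is most sure about.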
Step 5: Testing the model on Image
We test the model on a single image to confirm that it detects objects correctly, and then draw the bounding boxes and class labels on that image.
Step 6: Looping through the Labels
After setting the threshold, we loop through the detections and look up the label for each detected class. The class indices returned by the model start at 1, so we subtract 1 when indexing the label list.
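font = cv2.FONT_HERSHEY_PLAIN  # example font, not specified in the original
font_scale = 3                 # example size, not specified in the original
if len(classIndex) != 0:       # detect() returns empty results when nothing is found
    for ClassInd, conf, boxes in zip(classIndex.flatten(), confidence.flatten(), bbox):
        cv2.rectangle(img, boxes, (225, 0, 0), 2)
        cv2.putText(img, classLabels[ClassInd - 1], (boxes[0] + 10, boxes[1] + 40), font,
                    fontScale=font_scale, color=(0, 255, 0), thickness=3)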
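One thing to watch in the lookup classLabels[ClassInd-1] is an index that falls outside the list. An id of 0 would silently pick the last label, and an id larger than the list would raise an IndexError. Whether such ids appear depends on the model and labels file you use, so take the guard below as a cautious sketch. label_for is a made-up helper name.

```python
def label_for(class_id, labels):
    """Map a 1-based class id to its label, or "unknown" if out of range."""
    index = int(class_id) - 1
    if 0 <= index < len(labels):
        return labels[index]
    return "unknown"


demo_labels = ["person", "bicycle", "car"]
print(label_for(1, demo_labels))  # person
print(label_for(3, demo_labels))  # car
print(label_for(4, demo_labels))  # unknown
print(label_for(0, demo_labels))  # unknown
```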
Step 7: Implementing on the video
After successfully applying the process to an image, we can apply the same technique to video. In a real-world application the model is deployed on some hardware and has to analyze a live video stream, so we use the webcam rather than a pre-recorded video.
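cap = cv2.VideoCapture(0)       # try the default camera first
if not cap.isOpened():
    cap = cv2.VideoCapture(1)   # fall back to a second camera
if not cap.isOpened():
    raise IOError("Cannot open webcam")
.
.
.
    if cv2.waitKey(2) & 0xFF == ord('q'):  # inside the frame loop: press q to quit
        break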
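The body of the loop is left out above, and I won't guess at it here. One part that any such loop needs, though, is reading frames until the stream ends and releasing the camera afterwards. Here is a minimal sketch of just that part. read_frames and FakeCapture are names I made up. FakeCapture only imitates the read() and release() methods of cv2.VideoCapture so the sketch runs without a webcam.

```python
def read_frames(cap):
    """Yield frames from a VideoCapture-like object until read() fails."""
    try:
        while True:
            ret, frame = cap.read()
            if not ret:  # camera unplugged or end of stream
                break
            yield frame
    finally:
        cap.release()  # always free the device


class FakeCapture:
    """Hypothetical stand-in for cv2.VideoCapture, used only for this demo."""

    def __init__(self, frames):
        self._frames = list(frames)
        self.released = False

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None

    def release(self):
        self.released = True


fake = FakeCapture(["frame-1", "frame-2"])
collected = list(read_frames(fake))
print(collected, fake.released)  # ['frame-1', 'frame-2'] True
```

With a real camera you would pass the cap object from the code above, and run detection and drawing on each yielded frame.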
You can access the complete code on my GitHub.
Note: Thanks to every open source contributor.
Sources:
1) Multiple Real-time object identification using Single-shot Multi-Box detection, International Conference on Computational Intelligence in Data Science, IEEE
Expected Viva:
1) What are the applications of this project?
A) This project can be used in autonomous robots, object tracking systems, and related fields. Because autonomous vehicles rely on fleet-level training, this project is not suitable for building them.
2) What is the transfer learning model used?
A) I used the "MobileNet SSD v3" (version 3) model, as it was the most recently released model at the time.
3) What other transfer learning models can be used?
A) Many pre-trained models are available for TensorFlow. You can use any of them, but I prefer the latest one.
4) Why did you zip the 3 values in the for loop? (In Step 6)
A) model.detect returns three parallel sequences rather than files: class indices, confidences, and bounding boxes. Zipping them lets a single for loop read the class, confidence, and box of each detection together. Three separate loops would take extra passes and would not keep the values paired.
5) What is the advantage of this project?
A) Object detection projects usually use the YOLO (You Only Look Once) framework because it is fast, but it is not compatible with all devices. OpenCV is compatible with most devices, although it processes video with some delay.
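For question 4, the zip pattern is easy to try on its own. The numbers below are made up and stand in for the three results of model.detect.

```python
class_ids = [1, 3]
confidences = [0.91, 0.62]
boxes = [(10, 20, 100, 200), (150, 40, 80, 60)]

# zip pairs the i-th item of each sequence, so one loop sees a whole detection.
paired = list(zip(class_ids, confidences, boxes))
for class_id, conf, box in paired:
    print(class_id, conf, box)
```

Each pass of the loop gets the class id, confidence, and box that belong to the same detection.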
made with 💓 by G. Sai Dheeraj