Showcasing My Object Detection and Counting Project
The link to this project's GitHub page can be found here.
I've been working on several projects to sharpen my machine learning and computer vision skills. One that particularly stands out is an object detection model that not only identifies objects in an image but also counts them and calculates their total value. Adding this counting and value-summing step on top of standard object detection gives the model direct real-world utility.
Exploration of YOLOv8 in PyTorch
For this project, I decided to experiment with the YOLOv8 (You Only Look Once, version 8) architecture implemented in PyTorch. While I have extensive experience using TensorFlow for deep learning tasks, I saw this as an opportunity to step out of my comfort zone and expand my skills by learning PyTorch. This decision allowed me to gain hands-on experience with a different neural network framework, offering me insight into its flexibility, debugging capabilities, and diverse ecosystem of tools.
The Core Idea
- Detect objects in an image: Training a robust object detection model to identify specific objects accurately.
- Count the detected objects: Parsing the model's output to compute a total for each detected object class.
- Calculate the value sum: Assigning predefined values to objects and summing them up based on detection counts.
- Overlay results: Displaying counts, values, and bounding boxes directly on the image for visual clarity.
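The counting and value-summing steps above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the project's actual script: it assumes the detector's output has already been reduced to a list of class-name strings, one per detected box. The function name `summarize_detections`, the `values` mapping, and the example labels are hypothetical and introduced only for illustration.

```python
from collections import Counter


def summarize_detections(labels, values):
    """Count detections per class and sum their predefined values.

    labels: list of class-name strings, one per detected box (assumed input).
    values: dict mapping class name -> numeric value (hypothetical mapping).
    """
    counts = Counter(labels)
    # Classes that are missing from `values` contribute nothing to the total.
    total = sum(values.get(name, 0) * n for name, n in counts.items())
    return dict(counts), total


if __name__ == "__main__":
    # Hypothetical labels, as if three boxes had been detected.
    labels = ["bolt", "bolt", "nut"]
    counts, total = summarize_detections(labels, {"bolt": 2, "nut": 5})
    print(counts, total)
```

With the Ultralytics package, a list like `labels` could plausibly be built from `results[0].boxes.cls` and `results[0].names`, although that detail is an assumption about how the detections are read out.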
Implementation Examples
I trained and evaluated this idea using two datasets:
Dice Value Detection
The first implementation involved detecting and summing the values shown on a set of dice in an image. After annotating a dataset of dice images with bounding boxes and corresponding values, I trained the YOLOv8 model to recognize each die's value (1 through 6). The resulting script outputs the detected dice values, their counts, and the total sum of all dice values.
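As a rough illustration of the summing step for dice, the sketch below assumes the model's six classes are indexed 0 to 5 in face-value order, so that class index 0 means a die showing 1. That ordering is an assumption; the real mapping depends on how the dataset was labeled. The `(class_index, confidence)` input format and the `min_conf` threshold are likewise hypothetical.

```python
from collections import Counter


def sum_dice(detections, min_conf=0.5):
    """Return (counts per face value, total) for detected dice.

    detections: list of (class_index, confidence) pairs (assumed format).
    Assumes class_index 0..5 corresponds to face values 1..6.
    """
    # Drop low-confidence boxes, then shift the class index to a face value.
    faces = [cls + 1 for cls, conf in detections if conf >= min_conf]
    counts = Counter(faces)
    return dict(sorted(counts.items())), sum(faces)


if __name__ == "__main__":
    # Hypothetical detections: a six, two threes, and one weak detection.
    dets = [(5, 0.91), (2, 0.88), (2, 0.76), (0, 0.30)]
    print(sum_dice(dets))
```

Making the index-to-face shift explicit in one place helps avoid an off-by-one error between class indices and the values printed on the dice.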
Canadian Coin Detection
For the second implementation, I created a dataset of Canadian coins, including pennies, nickels, dimes, quarters, loonies, and toonies. The model detects each coin type, counts the number of coins, and calculates their total monetary value in Canadian dollars. For example, if the model detects three quarters and two loonies, it outputs a total of 2.75 CAD.
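The coin total can be sketched the same way. The version below keeps values in integer cents so that repeated additions do not pick up floating-point rounding errors. The label strings (`"quarter"`, `"loonie"`, and so on) are assumptions about how the classes might be named, and `coin_total` and `format_cad` are hypothetical helper names.

```python
from collections import Counter

# Face values in cents; integer cents avoid floating-point rounding errors.
COIN_CENTS = {
    "penny": 1,
    "nickel": 5,
    "dime": 10,
    "quarter": 25,
    "loonie": 100,
    "toonie": 200,
}


def coin_total(labels):
    """Return (counts per coin type, total in cents) for detected coin labels."""
    counts = Counter(labels)
    unknown = set(counts) - set(COIN_CENTS)
    if unknown:
        raise ValueError(f"unknown coin labels: {sorted(unknown)}")
    cents = sum(COIN_CENTS[name] * n for name, n in counts.items())
    return dict(counts), cents


def format_cad(cents):
    """Format integer cents as a dollar amount, e.g. 275 -> '2.75 CAD'."""
    return f"{cents // 100}.{cents % 100:02d} CAD"


if __name__ == "__main__":
    # Three quarters and two loonies, as in the example above.
    counts, cents = coin_total(["quarter"] * 3 + ["loonie"] * 2)
    print(counts, format_cad(cents))
```

For three quarters and two loonies this gives 3 × 25 + 2 × 100 = 275 cents, which matches the 2.75 CAD in the example.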
Challenges and Learning Opportunities
- Dataset Preparation: Annotating images for object detection required careful labeling to ensure accuracy.
- Framework Transition: Shifting from TensorFlow to PyTorch helped me understand dynamic computation graphs.
- Post-Processing: Integrating detection results with logic for counting and value calculation.
- Result Visualization: Overlaying detection results clearly using libraries like OpenCV or Matplotlib.
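The visualization step might look roughly like the following with OpenCV. This is only a sketch: the `(box, label)` detection format, the colors, and the helper names `label_origin` and `draw_overlay` are assumptions made for illustration, and `draw_overlay` needs the `opencv-python` package installed.

```python
def label_origin(box, text_height=12, pad=4):
    """Pick a text origin (x, y) for a box label.

    box: (x1, y1, x2, y2) in pixels. The label goes above the box,
    or just inside its top edge when there is no room above.
    """
    x1, y1, _x2, _y2 = box
    y = y1 - pad
    if y - text_height < 0:  # label would be cut off at the top of the image
        y = y1 + text_height + pad
    return int(x1), int(y)


def draw_overlay(image, detections, summary):
    """Draw boxes, per-box labels, and a summary line on a BGR image array.

    detections: list of ((x1, y1, x2, y2), label) pairs (assumed format).
    summary: text such as the counts and total value.
    """
    import cv2  # imported lazily so label_origin works without OpenCV

    green = (0, 255, 0)
    for box, label in detections:
        x1, y1, x2, y2 = map(int, box)
        cv2.rectangle(image, (x1, y1), (x2, y2), green, 2)
        cv2.putText(image, label, label_origin(box),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, green, 1)
    # Summary text in the top-left corner of the image.
    cv2.putText(image, summary, (10, 25),
                cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
    return image
```

Keeping the label placement in a small pure function makes it easy to check separately from the drawing calls, for example that labels near the top of the image are moved inside the box.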
Applications and Next Steps
This project has broad real-world applications, including:
- Retail and Inventory Management: Automatically counting and valuing items for stocktaking.
- Gaming: Automating score calculations from game components like dice or cards.
- Currency Handling: Rapid identification and valuation of coins or banknotes.
Next steps include deploying the model on edge devices, expanding datasets, and exploring multi-object detection scenarios.
Conclusion
This project pairs YOLOv8 object detection with simple counting and value-summing logic, showing how a detection model can turn an image into a directly usable result such as a dice total or a coin value. I look forward to refining this model further and applying it to new challenges!