
ShapeSense – Real-Time 2D Object Recognition

Built a real-time 2D object recognition system in OpenCV that segments live video, computes custom shape descriptors, and classifies objects like mugs, gloves, watches, and power banks while allowing users to register new categories on the fly.



Introduction

ShapeSense is a real-time 2D object recognition system built as part of a computer vision course. The goal was to design a full pipeline from raw video frames to predictions using classic vision techniques rather than end-to-end deep networks.

The system focuses on dark objects (e.g., mug, glove, passport, watch) placed on a light workspace background. For each frame, we threshold the image, clean up the resulting binary mask, segment it into regions, and compute custom features that are invariant to translation, scale and in-plane rotation. These features are stored in a CSV database and used for nearest-neighbour classification in real time.
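
The threshold and segment steps described above can be sketched without any library. This is a minimal illustration rather than the project's OpenCV code: it assumes a grayscale image stored row-major, a fixed cutoff, and 4-connectivity, and the names `GrayImage`, `thresholdDark` and `largestRegionArea` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <queue>
#include <utility>
#include <vector>

// Grayscale image stored row-major; 0 = black, 255 = white.
struct GrayImage {
    int rows = 0;
    int cols = 0;
    std::vector<std::uint8_t> pixels;  // size rows * cols
};

// Inverted binary threshold: dark pixels (below `cutoff`) become foreground (1).
std::vector<std::uint8_t> thresholdDark(const GrayImage& img, std::uint8_t cutoff) {
    std::vector<std::uint8_t> mask(img.pixels.size(), 0);
    for (std::size_t i = 0; i < img.pixels.size(); ++i) {
        mask[i] = img.pixels[i] < cutoff ? 1 : 0;
    }
    return mask;
}

// Flood-fill every 4-connected foreground region and return the pixel
// count of the largest one (0 if the mask has no foreground).
int largestRegionArea(const std::vector<std::uint8_t>& mask, int rows, int cols) {
    std::vector<std::uint8_t> visited(mask.size(), 0);
    const int dr[4] = {1, -1, 0, 0};
    const int dc[4] = {0, 0, 1, -1};
    int best = 0;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            const std::size_t start = static_cast<std::size_t>(r) * cols + c;
            if (!mask[start] || visited[start]) continue;
            int area = 0;
            std::queue<std::pair<int, int>> frontier;
            frontier.push(std::make_pair(r, c));
            visited[start] = 1;
            while (!frontier.empty()) {
                const std::pair<int, int> cur = frontier.front();
                frontier.pop();
                ++area;
                for (int k = 0; k < 4; ++k) {
                    const int nr = cur.first + dr[k];
                    const int nc = cur.second + dc[k];
                    if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
                    const std::size_t idx = static_cast<std::size_t>(nr) * cols + nc;
                    if (mask[idx] && !visited[idx]) {
                        visited[idx] = 1;
                        frontier.push(std::make_pair(nr, nc));
                    }
                }
            }
            if (area > best) best = area;
        }
    }
    return best;
}
```

Keeping only the largest region is one simple way to ignore small specks of noise that survive thresholding; a morphological clean-up pass before labelling would serve the same purpose.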

Pipeline & Feature Design

The full pipeline is implemented in C++/OpenCV and runs per frame on live video. Each step is designed to make the final feature vector robust to translation, scaling and rotation.

Representative Code Snippet

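// Requires <opencv2/imgproc.hpp>, <algorithm>, <vector>; assumes `using namespace cv;`.
// regionMask: binary mask of the major region; regionPoints: its contour points.

// Compute moments of the major region's mask
Moments mu = moments(regionMask, true);

// Compute Hu moments (translation/scale/rotation invariant)
double hu[7];
HuMoments(mu, hu);

// Compute oriented bounding box
RotatedRect box = minAreaRect(regionPoints);

// minAreaRect can swap width and height as the object rotates, so use
// long side / short side to keep the ratio rotation invariant
float longSide  = std::max(box.size.width, box.size.height);
float shortSide = std::min(box.size.width, box.size.height);
float boxRatio  = shortSide > 0.0f ? longSide / shortSide : 0.0f;

// Fraction of the oriented box covered by the region (0 to 1)
float boxArea = box.size.width * box.size.height;
float fillPercent = boxArea > 0.0f
    ? static_cast<float>(contourArea(regionPoints)) / boxArea
    : 0.0f;

// Build 7D feature vector for this object
std::vector<double> features = {
    hu[0], hu[1], hu[2], hu[3], hu[4],
    boxRatio, fillPercent
};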

Classification & Evaluation

With the feature vectors in place, recognition uses simple, explainable nearest-neighbour matching against the stored feature database and is evaluated via confusion matrices.
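
Nearest-neighbour classification and confusion-matrix evaluation can be sketched as follows. The distance metric is an assumption: this version scales each feature by its standard deviation over the database before taking the Euclidean distance, which may differ from what the project uses. `Sample`, `featureStdDevs`, `classify` and `evaluate` are hypothetical names, and every feature vector is assumed to have the same length.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One labelled row of the feature database.
struct Sample {
    std::string label;
    std::vector<double> features;
};

// Per-feature standard deviation over the database, used to put
// features with different ranges on a comparable scale.
std::vector<double> featureStdDevs(const std::vector<Sample>& db) {
    if (db.empty()) return std::vector<double>();
    const std::size_t dims = db.front().features.size();
    const double n = static_cast<double>(db.size());
    std::vector<double> mean(dims, 0.0), stdev(dims, 0.0);
    for (const auto& s : db)
        for (std::size_t i = 0; i < dims; ++i) mean[i] += s.features[i] / n;
    for (const auto& s : db)
        for (std::size_t i = 0; i < dims; ++i) {
            const double d = s.features[i] - mean[i];
            stdev[i] += d * d / n;
        }
    for (double& v : stdev) {
        v = std::sqrt(v);
        if (v == 0.0) v = 1.0;  // constant feature: avoid dividing by zero
    }
    return stdev;
}

// Label of the database entry with the smallest scaled Euclidean
// distance to the query; "unknown" if the database is empty.
std::string classify(const std::vector<Sample>& db, const std::vector<double>& query) {
    const std::vector<double> stdev = featureStdDevs(db);
    std::string bestLabel = "unknown";
    double bestDist = std::numeric_limits<double>::infinity();
    for (const auto& s : db) {
        double dist = 0.0;
        for (std::size_t i = 0; i < query.size() && i < stdev.size(); ++i) {
            const double d = (query[i] - s.features[i]) / stdev[i];
            dist += d * d;
        }
        if (dist < bestDist) {
            bestDist = dist;
            bestLabel = s.label;
        }
    }
    return bestLabel;
}

// Confusion counts keyed by (true label, predicted label).
using ConfusionMatrix = std::map<std::pair<std::string, std::string>, int>;

ConfusionMatrix evaluate(const std::vector<Sample>& db, const std::vector<Sample>& testSet) {
    ConfusionMatrix counts;
    for (const auto& t : testSet) {
        ++counts[std::make_pair(t.label, classify(db, t.features))];
    }
    return counts;
}
```

Off-diagonal entries, where the true and predicted labels differ, show which object pairs the features fail to separate. Registering a new category on the fly then amounts to appending one more labelled `Sample` to the database.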

This project deepened my understanding of how a classical 2D object recognition system can be engineered end-to-end, and how it compares in practice to DNN-based approaches on small, controlled datasets.