Real-Time Hand Gesture Recognition with AI 🤖✋
📋 Project Background
Hand gestures can serve as intuitive and efficient remote commands for controlling various devices and systems, especially in environments where physical touch is impractical or impossible. By developing accurate gesture recognition models, this project aims to enhance accessibility and efficiency in controlling devices such as smart TVs, virtual reality interfaces, or industrial machinery. Solving this problem could help revolutionize user interfaces, making technology more accessible to individuals with disabilities and improving user experience across various interactive platforms.
📝 Description
Using TensorFlow for model inference, OpenCV for image preprocessing, and ngrok for public URL access, this application allows users to upload or capture hand gestures, which are classified into predefined categories like 'Dislike', 'Like', 'Mute', 'OK', and 'Stop'. The project demonstrates the integration of AI, web technologies, and real-time image processing for gesture-based applications.
🎥 Demo
Check out the demo of HandGesture in action below! [Demo Video Link]
In the demo, you can see the model accurately classifying hand gestures in real-time.
✨ Features
🖐 Real-time hand gesture recognition
🖼️ Image upload and webcam capture support
🌐 Integrated ngrok for public URL access
🖥️ Flask-based web interface
🎯 High accuracy with a pre-trained TensorFlow model
📊 Displays predictions with labeled images
⚙️ Installation
Follow the steps below to set up and run the project on your local machine.
📦 Prerequisites
Ensure you have Python installed (>= 3.8).
Install the necessary libraries by running:
pip install Flask tensorflow pyngrok flasgger flask-cors opencv-python
Download the HaGrid dataset: https://www.kaggle.com/datasets/innominate817/hagrid-classification-512p
🛠️ Cloning the Repository
Clone the repository to your local machine:
git clone https://github.com/Emeron16/UCSD.git
cd HandGesture
▶️ Running the Project
Once installed, you can run the project in two phases:
🛠️ Phase 1: Model Training and Setup
In this phase, you will train or load the pre-trained TensorFlow model, preprocess the data, and ensure it works as expected.
Step 1: Data Preparation
- Organize your gesture images into separate folders corresponding to each gesture class ('Dislike', 'Like', 'Mute', 'OK', 'Stop').
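TensorFlow's built-in directory utility can consume that layout directly. Here is a minimal sketch, assuming the images sit under a dataset/ root with one subfolder per class; the split ratio, batch size, and seed are illustrative choices:

```python
import tensorflow as tf

# Assumed layout: dataset/<class_name>/*.jpg, one subfolder per gesture.
CLASS_NAMES = ['Dislike', 'Like', 'Mute', 'OK', 'Stop']

def load_split(subset):
    return tf.keras.utils.image_dataset_from_directory(
        'dataset',
        label_mode='categorical',   # one-hot labels for a softmax head
        class_names=CLASS_NAMES,
        image_size=(224, 224),      # MobileNet's expected input size
        batch_size=32,
        validation_split=0.2,
        subset=subset,
        seed=42,                    # same seed so the two subsets don't overlap
    )

train_ds, val_ds = load_split('training'), load_split('validation')

# Apply MobileNet's preprocessing (scales pixels to [-1, 1]) up front,
# mirroring the preprocessing applied later at inference time.
preprocess = tf.keras.applications.mobilenet.preprocess_input
train_ds = train_ds.map(lambda x, y: (preprocess(x), y))
val_ds = val_ds.map(lambda x, y: (preprocess(x), y))
```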
Step 2: Model Training or Loading
If you have the pre-trained model, ensure it is in the project folder as HagridModel1.keras. Otherwise, train the model on the prepared dataset, using MobileNet for feature extraction and CNN layers on top for classification.
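A minimal training sketch under those assumptions follows; the head layers, dropout rate, and epoch count are illustrative, not the exact architecture used here:

```python
import tensorflow as tf

# train_ds / val_ds come from the data-preparation sketch above.
# Frozen MobileNet base for feature extraction, plus a small trainable head.
base = tf.keras.applications.MobileNet(
    weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(5, activation='softmax'),  # one unit per gesture class
])

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_ds, validation_data=val_ds, epochs=10)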
Step 3: Saving the Model
After training, save the model as:
model.save('HagridModel1.keras')
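The deployment script in Phase 2 can then restore it with the matching load call:

```python
import tensorflow as tf

# Restore the trained model for deployment.
model = tf.keras.models.load_model('HagridModel1.keras')
```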
🚀 Phase 2: Deploying the Model
Once the model is trained and saved, the deployment process can begin using the Flask web application.
Step 1: Set Up the Environment
Ensure that your project directory is structured as follows:
HandGestureRecognition/
├── templates/
│   └── handgestureIndex.html        # HTML template for uploading and viewing results
├── FlaskDeploymentHandGesture.py    # Flask app for model deployment
├── HagridModel1.keras               # Pre-trained model
└── static/                          # Optional: static files (e.g., CSS, JavaScript)
Step 2: Run the Flask Application
Start the Flask app with:
python FlaskDeploymentHandGesture.py
This will start the local server and open an ngrok tunnel, providing you with a public URL to access the app.
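For orientation, here is a condensed sketch of how such a script can wire Flask, the model, and pyngrok together. The /predict route, the 'file' form field, and the label order are illustrative assumptions, not the exact contents of FlaskDeploymentHandGesture.py:

```python
import cv2
import numpy as np
import tensorflow as tf
from flask import Flask, jsonify, request
from pyngrok import ngrok

app = Flask(__name__)
model = tf.keras.models.load_model('HagridModel1.keras')
CLASS_NAMES = ['Dislike', 'Like', 'Mute', 'OK', 'Stop']  # assumed label order

@app.route('/predict', methods=['POST'])  # illustrative route name
def predict():
    # Decode the uploaded file into an OpenCV image.
    raw = np.frombuffer(request.files['file'].read(), np.uint8)
    img = cv2.cvtColor(cv2.imdecode(raw, cv2.IMREAD_COLOR), cv2.COLOR_BGR2RGB)
    # Resize and scale exactly as during training.
    batch = tf.keras.applications.mobilenet.preprocess_input(
        cv2.resize(img, (224, 224)).astype(np.float32))[np.newaxis, ...]
    probs = model.predict(batch)[0]
    return jsonify(gesture=CLASS_NAMES[int(np.argmax(probs))],
                   confidence=float(np.max(probs)))

if __name__ == '__main__':
    tunnel = ngrok.connect(5001)  # open the public tunnel before serving
    print(f' * ngrok tunnel "{tunnel.public_url}" -> "http://127.0.0.1:5001"')
    app.run(port=5001)
```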
Step 3: Access the Application
After running the script, a public URL will be generated by ngrok and printed in the terminal:
* ngrok tunnel "http://<ngrok_url>" -> "http://127.0.0.1:5001"
Open this URL in your browser to interact with the web interface for uploading images or capturing gestures through your webcam.
📖 Usage
📸 Image Upload or Webcam Capture
Upload: Use the file upload feature to submit an image for prediction.
Webcam Capture: Capture real-time gestures via webcam and see the model's predictions.
🧠 How It Works
Preprocessing: Images are preprocessed by resizing them to 224x224 and applying MobileNet's preprocessing function.
Model Inference: The pre-trained model classifies the image into one of the five categories.
Visualization: The prediction is displayed along with the processed image on the web interface.
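Putting the three steps together, a standalone sketch of the pipeline might look like this (gesture.jpg and the label order are placeholders; the web app performs the same steps on uploaded images or captured frames):

```python
import cv2
import numpy as np
import tensorflow as tf

CLASS_NAMES = ['Dislike', 'Like', 'Mute', 'OK', 'Stop']  # assumed label order
model = tf.keras.models.load_model('HagridModel1.keras')

# 1. Preprocessing: resize to 224x224 and apply MobileNet's scaling.
bgr = cv2.imread('gesture.jpg')                  # placeholder input image
rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
batch = tf.keras.applications.mobilenet.preprocess_input(
    cv2.resize(rgb, (224, 224)).astype(np.float32))[np.newaxis, ...]

# 2. Inference: the highest-probability class wins.
probs = model.predict(batch)[0]
label = CLASS_NAMES[int(np.argmax(probs))]

# 3. Visualization: overlay the prediction on the original image.
cv2.putText(bgr, f'{label} ({probs.max():.2f})', (10, 30),
            cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
cv2.imwrite('gesture_labeled.jpg', bgr)
```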
🔮 Future Improvements
🧩 Expand gesture classes: Add more hand gesture types for broader recognition.
🎥 Real-time video processing: Extend the application to process continuous hand gestures in real-time video feeds.
🖥️ Enhanced UI: Improve the interface for a more seamless user experience.