Sign language detection with Python and Scikit Learn | Landmark detection | Computer vision tutorial
Introduction
In this tutorial, we will learn how to detect hand signs using Python along with key libraries such as Mediapipe, OpenCV, and Scikit-Learn. This project is an excellent introduction to computer vision and machine learning, particularly in the area of sign language detection. By the end of this guide, you'll have a working model to recognize specific hand signs.
Step 1: Data Collection
Before creating a model, we need to gather data.
- Use a camera to capture images of different hand signs.
- Be sure to include a variety of signs and camera angles to improve model accuracy.
- Aim for at least 1000 images for each sign to create a robust dataset.
Tip: Save images in separate folders named after the corresponding sign for easy categorization; a minimal capture loop is sketched below.
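To make the collection step concrete, here is a sketch of a capture script using OpenCV. The folder layout, sign names, and image count below are placeholder assumptions, not fixed by this tutorial; adjust them to your setup.

import os
import cv2

DATA_DIR = "./data"        # assumption: one subfolder per sign under ./data
SIGNS = ["A", "B", "L"]    # hypothetical example signs; replace with your own
IMAGES_PER_SIGN = 1000

cap = cv2.VideoCapture(0)  # default webcam
for sign in SIGNS:
    os.makedirs(os.path.join(DATA_DIR, sign), exist_ok=True)
    # Wait until the user is ready to show this sign.
    print(f"Ready to capture '{sign}'. Press 'q' to start.")
    while True:
        ret, frame = cap.read()
        cv2.imshow("frame", frame)
        if cv2.waitKey(25) == ord("q"):
            break
    # Capture the images for this sign, one frame at a time.
    for i in range(IMAGES_PER_SIGN):
        ret, frame = cap.read()
        cv2.imshow("frame", frame)
        cv2.waitKey(25)
        cv2.imwrite(os.path.join(DATA_DIR, sign, f"{i}.jpg"), frame)
cap.release()
cv2.destroyAllWindows()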
Step 2: Data Processing
Once you have collected your data, it's essential to process it before feeding it into the model.
- Use OpenCV to read images from your dataset.
- Resize images to a uniform size (e.g., 224x224 pixels) for consistency.
Here’s a sample code snippet to read and resize images:
import cv2
import os

def load_and_preprocess_images(folder_path):
    """Load every image under folder_path, resize it, and record its label."""
    images = []
    labels = []
    # Each subfolder of folder_path is named after the sign it contains.
    for label in os.listdir(folder_path):
        for image_name in os.listdir(os.path.join(folder_path, label)):
            image_path = os.path.join(folder_path, label, image_name)
            image = cv2.imread(image_path)
            if image is None:  # skip files OpenCV cannot read
                continue
            image = cv2.resize(image, (224, 224))
            images.append(image)
            labels.append(label)
    return images, labels
- Normalize pixel values to the range [0, 1] so that all features are on a consistent scale for training.
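A one-line NumPy sketch, assuming images and labels come from load_and_preprocess_images above ("./data" is the assumed dataset folder):

import numpy as np

images, labels = load_and_preprocess_images("./data")
# Scale 8-bit pixel values (0-255) down to floats in [0, 1].
images_normalized = np.asarray(images, dtype=np.float32) / 255.0

Keeping the unnormalized images around is handy, because Mediapipe in the next step expects 8-bit input.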
Step 3: Landmark Detection
Use Mediapipe for landmark detection to improve the model's ability to identify hand signs.
- Import the Mediapipe library and initialize the hands module.
- Process the images to detect hand landmarks, which will act as features for the model.
Here’s an example of how to detect landmarks:
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
hands = mp_hands.Hands(static_image_mode=True)

def detect_landmarks(image):
    """Return the detected hand landmarks as a flat list of (x, y) coordinates."""
    landmarks = []
    # Mediapipe expects RGB input, while OpenCV loads images as BGR.
    results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        for hand_landmarks in results.multi_hand_landmarks:
            # Each hand has 21 landmarks with normalized coordinates.
            for lm in hand_landmarks.landmark:
                landmarks.extend([lm.x, lm.y])
    return landmarks
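With detect_landmarks in place, you can turn the whole dataset into a feature matrix. A sketch, assuming images and labels come from load_and_preprocess_images above; feed the original 8-bit images here, since Mediapipe expects uint8 RGB frames, and note that images where no hand is detected are simply skipped:

features = []
feature_labels = []
for image, label in zip(images, labels):
    lm = detect_landmarks(image)
    # Keep only images where Mediapipe found a hand; assumes one hand per
    # image so every feature vector has the same length (42 values).
    if lm:
        features.append(lm)
        feature_labels.append(label)

Because Mediapipe returns normalized coordinates, these features are independent of the image resolution.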
Step 4: Train Model
With processed data and landmarks, it’s time to train your model.
- Split your dataset into training and testing sets (e.g., 80% train, 20% test).
- Use Scikit-Learn to create a classifier, such as a Random Forest or SVM.
Sample training code:
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

# Hold out 20% of the data for testing; stratifying keeps the class
# balance similar in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    features, feature_labels, test_size=0.2, stratify=feature_labels
)

model = RandomForestClassifier()
model.fit(X_train, y_train)
Step 5: Test Model
After training, you can evaluate your model's performance.
- Use the testing set to check the model's accuracy.
- Display the confusion matrix to understand misclassifications.
Here is how you can evaluate your model:
from sklearn.metrics import accuracy_score, confusion_matrix
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
Conclusion
In this tutorial, we've covered the essential steps needed to detect hand signs using Python and various libraries. You learned how to collect and process data, detect landmarks, train a model, and evaluate its performance.
Next steps could involve:
- Fine-tuning your model for better accuracy.
- Exploring deep learning models for potentially improved results.
- Implementing real-time sign language detection using a webcam.
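As a starting point for the real-time idea, here is a sketch of a webcam loop that reuses detect_landmarks and the trained model. It assumes a single hand in frame, so the live feature vector has the same length as the training vectors:

import cv2

cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    lm = detect_landmarks(frame)
    if lm:
        # predict expects a 2D array, hence the extra list around lm.
        prediction = model.predict([lm])[0]
        cv2.putText(frame, str(prediction), (10, 40),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 255, 0), 3)
    cv2.imshow("Sign detection", frame)
    if cv2.waitKey(1) == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()

For smoother video performance, you may want a separate Hands instance created with static_image_mode=False, which lets Mediapipe track the hand across frames instead of re-detecting it in every frame.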
For the full source code and additional details, you can check the project repository here.