Computer Vision
Computer Vision is an area of AI that deals with visual processing. Let's explore some of the possibilities that computer vision brings.
Most computer vision solutions are based on machine learning models that can be applied to visual input from cameras, videos, or images. The following information describes common computer vision tasks.
Image Classification
Image classification involves training a machine learning model to classify images based on their contents. For example, in a traffic monitoring solution you might use an image classification model to classify images based on the type of vehicle they contain, such as taxis, buses, cyclists, and so on.
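To make the idea concrete, here is a minimal sketch of the last step of a classifier: turning the raw scores a model produces into a single label. The class list, the `softmax` and `classify` helpers, and the scores are all assumptions made for illustration; in practice the scores would come from a trained model.

```python
import math

# Hypothetical class labels, borrowed from the traffic example.
CLASSES = ["taxi", "bus", "cyclist"]


def softmax(scores):
    """Convert raw model scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return the most likely label and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return CLASSES[best], probs[best]


# Made-up scores standing in for the output of a trained model.
label, confidence = classify([2.0, 0.5, -1.0])
print(label, round(confidence, 2))
```

With these made-up scores the highest score belongs to the first class, so the image is labeled as a taxi.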
Object Detection
Object detection machine learning models are trained to classify individual objects within an image and identify their location with a bounding box. For example, a traffic monitoring solution might use object detection to identify the location of different classes of vehicle.
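As a rough sketch, a detection can be represented as a label, a confidence score, and a bounding box. A common way to compare a predicted box with the true one is intersection over union (IoU), which is not mentioned above but is widely used to evaluate detectors. The `Detection` class, the `(x_min, y_min, x_max, y_max)` box layout, and the coordinates are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    score: float
    box: tuple  # assumed layout: (x_min, y_min, x_max, y_max) in pixels


def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (0.0 to 1.0)."""
    # Corners of the overlapping rectangle, if there is one.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Made-up prediction and ground-truth box for one vehicle.
predicted = Detection(label="bus", score=0.91, box=(10, 10, 50, 50))
ground_truth_box = (30, 30, 70, 70)
overlap = iou(predicted.box, ground_truth_box)
print(predicted.label, round(overlap, 3))
```

An IoU of 1.0 means the boxes match exactly, and 0.0 means they do not overlap at all.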
Semantic Segmentation
Semantic segmentation is an advanced machine learning technique in which individual pixels in the image are classified according to the object to which they belong. For example, a traffic monitoring solution might overlay traffic images with "mask" layers to highlight different vehicles using specific colors.
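The sketch below shows what a segmentation mask might look like as data: a grid with one class ID per pixel, which can then be mapped to overlay colors. The tiny mask, the class IDs, and the color palette are assumptions for illustration; a real mask would come from a trained model and match the image size.

```python
from collections import Counter

# Hypothetical class IDs, names, and overlay colors (RGB).
NAMES = {0: "background", 1: "car", 2: "bus"}
PALETTE = {0: (0, 0, 0), 1: (255, 0, 0), 2: (0, 0, 255)}

# A made-up 3x4 mask: one class ID per pixel.
mask = [
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [0, 2, 2, 0],
]


def colorize(mask):
    """Replace each class ID with its overlay color."""
    return [[PALETTE[class_id] for class_id in row] for row in mask]


def pixel_counts(mask):
    """Count how many pixels belong to each class."""
    counts = Counter(class_id for row in mask for class_id in row)
    return {NAMES[class_id]: n for class_id, n in counts.items()}


overlay = colorize(mask)
counts = pixel_counts(mask)
print(counts)
```

Unlike a bounding box, the mask follows the shape of each object pixel by pixel, which is why the overlay can highlight vehicles precisely.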
Image Analysis
You can create solutions that combine machine learning models with advanced image analysis techniques to extract information from images, including "tags" that could help catalog the image or even descriptive captions that summarize the scene shown in the image.
For example, a generated caption might read "A man with a dog", "A man sitting on a chair", or "A black monitor".
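As a small sketch of how tags and captions could be used for cataloging, the code below keeps only the tags a service is reasonably confident about. The `result` dictionary, its field names, and the confidence values are hypothetical; real image analysis services return their own formats.

```python
# Hypothetical analysis result; field names and values are made up.
result = {
    "caption": "A man with a dog",
    "tags": [
        {"name": "person", "confidence": 0.98},
        {"name": "dog", "confidence": 0.95},
        {"name": "grass", "confidence": 0.61},
        {"name": "frisbee", "confidence": 0.22},
    ],
}


def confident_tags(result, threshold=0.5):
    """Keep the names of tags whose confidence meets the threshold."""
    return [t["name"] for t in result["tags"] if t["confidence"] >= threshold]


# A catalog entry built from the caption and the filtered tags.
catalog_entry = {
    "caption": result["caption"],
    "tags": confident_tags(result),
}
print(catalog_entry)
```

The threshold of 0.5 is an arbitrary choice; a stricter value gives fewer but more reliable tags.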
Face Detection, Analysis, and Recognition
Face detection is a specialized form of object detection that locates human faces in an image. This can be combined with classification and facial geometry analysis techniques to recognize individuals based on their facial features.
Example: Suppose we have a photo of a group of people. Face detection locates the face of each person, and facial geometry analysis can then be used to tell the individuals apart.
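One simple way to picture the recognition step is to describe each face as a list of numbers (a feature vector) and match a new face to the closest known one. This is only a sketch under that assumption: the three-value vectors, the names, and the distance threshold are made up, and real systems use much longer vectors produced by a trained model.

```python
import math

# Hypothetical feature vectors for people we already know.
KNOWN_FACES = {
    "alice": [0.10, 0.80, 0.30],
    "bob": [0.90, 0.20, 0.50],
}


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(features, threshold=0.3):
    """Return the closest known name, or None if nobody is close enough."""
    name, best = min(
        ((n, distance(features, v)) for n, v in KNOWN_FACES.items()),
        key=lambda pair: pair[1],
    )
    return name if best <= threshold else None


# Made-up vectors for two faces detected in a group photo.
print(recognize([0.12, 0.78, 0.33]))
print(recognize([0.50, 0.50, 0.90]))
```

The first face is very close to a known vector and is matched; the second is far from everyone, so it is reported as unknown rather than forced onto the nearest name.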
Optical Character Recognition (OCR)
Optical character recognition is a technique used to detect and read text in images. You can use OCR to read text in photographs (for example, road signs or store fronts) or to extract information from scanned documents such as letters, invoices, or forms.
Example: Suppose a wall displays a company name, "something". When we run OCR on a photo of that wall, it extracts the text "something", the name of the company, from the image.
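Once OCR has turned an image into text, ordinary text processing can pull out the fields you need. The sketch below assumes the OCR step has already run and uses a hard-coded string in its place; the invoice layout, the field names, and the `extract_invoice_fields` helper are all hypothetical.

```python
import re

# Hypothetical text, as an OCR engine might return it for a scanned invoice.
ocr_text = """ACME Corp
Invoice No: INV-1042
Date: 2024-03-15
Total: $1,250.00
"""


def extract_invoice_fields(text):
    """Pull a few fields out of OCR text using regular expressions."""
    patterns = {
        "invoice_number": r"Invoice No:\s*(\S+)",
        "date": r"Date:\s*(\d{4}-\d{2}-\d{2})",
        "total": r"Total:\s*\$([\d,]+\.\d{2})",
    }
    fields = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        # Missing fields are recorded as None instead of raising an error.
        fields[name] = match.group(1) if match else None
    return fields


fields = extract_invoice_fields(ocr_text)
print(fields)
```

Real OCR output often contains recognition errors, so patterns like these usually need to be more forgiving than shown here.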
Written by Avanish Dubey