XII UNIT 3 HOW CAN MACHINES SEE?
The answer is: with the help of Computer Vision.
Computer Vision (CV) helps machines "see" and understand images, just as humans do with their eyes. But instead of retinas and optic nerves, machines use:
- Cameras 🎥 → To capture images.
- Data & Algorithms 💻 → To analyze and interpret the images.
- Deep Learning Models 🤖 → To recognize patterns, objects, and actions.
CV allows machines to make decisions based on images, like:
✅ Identifying objects (cars, faces, animals).
✅ Inspecting products for defects in industries.
✅ Detecting diseases from X-rays or MRIs.
✅ Assisting self-driving cars in understanding the road.
Since AI can analyze thousands of images faster than humans, it often performs better in tasks like facial recognition and object detection.
How Digital Images Work
- A digital image is just a picture stored on a computer, made up of tiny squares called pixels (short for "picture elements").
- Each pixel has a color assigned to it, helping to build the full image.
Interpreting Digital Images in Computers
🔹 When a computer processes an image, it doesn’t "see" like humans do; it reads pixels as numbers!
- A black-and-white (grayscale) image is stored using numbers ranging from 0 to 255:
  - 0 = Black
  - 255 = White
  - Values in between represent shades of gray.
Imagine a chessboard 🏁:
- The black squares = pixels with a value of 0 (black).
- The white squares = pixels with a value of 255 (white).
For color images, each pixel is stored using three values (Red, Green, Blue = RGB). This combination helps AI recognize different colors.
💡 The more pixels an image has, the clearer it looks (higher resolution). Fewer pixels make the image blurry or pixelated.
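The chessboard analogy and RGB pixels above can be sketched in a few lines of Python (NumPy is a common choice here; the values are purely illustrative):

```python
import numpy as np

# A tiny 4x4 grayscale "chessboard": 0 = black, 255 = white.
chessboard = np.array([
    [0, 255, 0, 255],
    [255, 0, 255, 0],
    [0, 255, 0, 255],
    [255, 0, 255, 0],
], dtype=np.uint8)

# A color pixel is three values: (Red, Green, Blue), each 0-255.
red_pixel = np.array([255, 0, 0], dtype=np.uint8)       # pure red
gray_pixel = np.array([128, 128, 128], dtype=np.uint8)  # mid gray

print(chessboard.shape)  # (4, 4): 16 pixels in total
print(chessboard[0, 0])  # 0   -> a black square
print(chessboard[0, 1])  # 255 -> a white square
```

A real photo works exactly the same way, just with millions of pixels instead of sixteen.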
5 Stages of Computer Vision
Computer vision helps computers understand images. It follows five main stages:
1. Image Acquisition (Capturing the Image)
This is the first stage, where digital images or videos are collected. These images can come from:
- Digital cameras 📷
- Scanners 📄
- Design software 🎨
In medicine, special devices like MRI and CT scans help capture detailed images of tissues inside the body.
2. Preprocessing (Cleaning the Image)
Before using an image for AI tasks, we clean and improve it.
Some common steps:
- Noise Reduction: Removes blurry spots and distractions.
- Normalization: Adjusts brightness and contrast so images look similar.
- Resizing/Cropping: Makes all images the same size for easy analysis.
- Histogram Equalization: Makes dark areas clearer and bright areas less intense.
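Two of the preprocessing steps above, normalization and histogram equalization, can be sketched with plain NumPy (the tiny synthetic image is invented for the demo; real code would typically use a library such as OpenCV):

```python
import numpy as np

# A synthetic dark, low-contrast 6x6 grayscale image (values 40..75).
img = np.arange(40, 76).reshape(6, 6).astype(np.uint8)

# Normalization: stretch pixel values to the full 0-255 range
# so images taken in different lighting look comparable.
lo, hi = int(img.min()), int(img.max())
normalized = ((img - lo) / (hi - lo) * 255).astype(np.uint8)

# Histogram equalization: build a lookup table from the cumulative
# histogram so frequent brightness values get spread apart.
hist = np.bincount(img.ravel(), minlength=256)
cdf = hist.cumsum()
cdf_masked = np.ma.masked_equal(cdf, 0)
cdf_scaled = (cdf_masked - cdf_masked.min()) * 255 / (cdf_masked.max() - cdf_masked.min())
lut = np.ma.filled(cdf_scaled, 0).astype(np.uint8)
equalized = lut[img]

print(normalized.min(), normalized.max())  # 0 255
```

After normalization the darkest pixel becomes 0 and the brightest becomes 255, which is exactly the "adjust brightness and contrast" idea described above.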
3. Feature Extraction (Finding Important Parts)
The AI picks out useful details from the cleaned image:
- Edge detection: Finds the outlines of objects.
- Corner detection: Spots sharp bends in shapes.
- Texture analysis: Checks for patterns like roughness or smoothness.
- Color-based features: Uses colors to separate different objects.
Instead of humans picking features manually, deep learning (CNNs) helps AI learn automatically which features matter.
4. Detection & Segmentation
There are two main tasks:
1. Single Object Tasks (Recognizing One Object)
Sometimes, an image has only one important object to analyze. AI focuses on that one object and can perform two kinds of tasks:
(i) Classification: AI identifies what the object is.
- Example: Looking at a picture of an animal and deciding it's a "dog."
- Methods used: K-Nearest Neighbors (KNN) (for labeled data), K-Means Clustering (for grouping objects without labels).
(ii) Classification + Localization: AI not only identifies the object but also marks its location within the image.
- Example: AI recognizes a "dog" and draws a box around it.
- The box is called a bounding box, which highlights where the object is.
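The K-Nearest Neighbors idea mentioned above can be shown with a toy classifier. The feature values and labels here are invented purely for illustration (imagine each row summarizing one image as two numbers):

```python
import numpy as np
from collections import Counter

# Made-up labeled features: (average brightness, edge count) per image.
features = np.array([[200, 10], [190, 12], [50, 80], [60, 75], [55, 90]])
labels = ["cat", "cat", "dog", "dog", "dog"]

def knn_classify(sample, features, labels, k=3):
    """Label a sample by majority vote among its k nearest neighbors."""
    dists = np.linalg.norm(features - sample, axis=1)  # distance to each example
    nearest = np.argsort(dists)[:k]                    # indices of k closest
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

print(knn_classify(np.array([58, 82]), features, labels))  # dog
print(knn_classify(np.array([195, 11]), features, labels))  # cat
```

The new sample is simply given the label that most of its nearest labeled neighbors share; no model training is needed, which is why KNN is often the first classifier taught.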
2. Multiple Object Tasks (Recognizing Many Objects)
This step helps AI find several objects inside an image instead of just one. There are two key techniques:
(i) Object Detection: AI scans the image to find multiple objects and classifies them. AI places bounding boxes around each detected object.
- Example: Looking at a street scene and identifying cars, pedestrians, and traffic signs.
- Some common AI methods for object detection:
- R-CNN (Region-Based Convolutional Neural Network) → Divides the image into sections and searches for objects.
- YOLO (You Only Look Once) → A fast method that looks at the whole image at once.
- SSD (Single Shot Detector) → Quickly detects objects in real-time.
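Detectors such as YOLO and SSD score their bounding boxes with a standard overlap measure called Intersection over Union (IoU). A minimal sketch, with boxes written as (x1, y1, x2, y2) corners:

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (if any).
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)  # 0 when boxes don't overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # two partially overlapping boxes
print(iou((0, 0, 2, 2), (0, 0, 2, 2)))      # identical boxes -> 1.0
```

An IoU of 1.0 means a perfect match and 0.0 means no overlap; detectors typically count a prediction as correct when its IoU with the true box passes a threshold such as 0.5.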
(ii) Image Segmentation: AI labels the image at the pixel level, outlining the exact shape of each object instead of just drawing a box around it.
- Example: A medical scan where AI highlights only the tumor itself instead of marking a whole area.
- Example: AI precisely separates buildings, cars, and trees in a city image.
Types of Image Segmentation
1. Semantic Segmentation: AI groups similar objects together; objects belonging to the same class are not differentiated. For example, the pixels of several animals may all be labeled under the class "animal" without identifying which animal each one is.
- Limitation: It doesn’t separate individual objects.
2. Instance Segmentation: AI separates each object individually; all the objects in the image are differentiated even if they belong to the same class. For example, each animal gets its own separate pixel mask even though they all belong to the same class.
- Used in advanced tasks where identifying individual objects matters.
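The difference between the two segmentation types can be shown with two tiny label masks (the scene is invented: two "animals" on a background, encoded as integer ids per pixel):

```python
import numpy as np

# Semantic segmentation: every animal pixel gets the SAME class id (1).
semantic = np.array([
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
])

# Instance segmentation: each animal gets its OWN id (1 and 2),
# even though both belong to the same "animal" class.
instance = np.array([
    [0, 1, 1, 0, 2, 2],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
])

print(np.unique(semantic))  # [0 1]   -> one class, animals not separated
print(np.unique(instance))  # [0 1 2] -> two distinct animal instances
```

Counting the distinct ids makes the limitation concrete: the semantic mask cannot tell you how many animals there are, while the instance mask can.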
Importance
- AI can recognize objects in real-world environments (self-driving cars, security cameras, medical scans).
- Helps in medical diagnosis (spotting diseases in X-rays).
5. High-Level Processing (Understanding the Image)
Now the AI interprets the image to make decisions, like:
- Recognizing objects 🚗
- Understanding scenes 🌆
- Analyzing images in medical scans, self-driving cars, and security cameras.
Challenges of Computer Vision
1. Understanding Images: Computer vision doesn’t just need to see objects; it needs to understand them. If it can’t understand what is happening in an image, it cannot give correct answers.
2. Taking Good Pictures: Computer vision needs clear images. But things like bad lighting, different angles, and objects blocking each other make it hard to get good pictures. Without good images, the computer makes mistakes.
3. Privacy Problems: Using cameras and facial recognition can invade people’s privacy. Many people worry about how their pictures are used, so these technologies need rules and careful use.
Computer vision has grown a lot over the years. It started with simple tasks like reading basic images, but now it can understand pictures and videos almost like humans. This big progress happened because of powerful deep learning technology and large amounts of training data.
In the future, computer vision will become even more important. It can help in many areas, such as:
- Healthcare: giving personalized and faster medical diagnoses
- AR (Augmented Reality): creating more realistic and interactive experiences
- Daily Life: improving safety, transportation, education, and more