Object Detection

Can you identify the actor in this image?

Yes, he is none other than Tom Cruise, who has acted in many movies, including the Mission Impossible series (does he ever age?!).

But how were you able to identify him from the image? And how can a computer identify him? While we identified him using our eyes and brain (and the fact that he doesn’t seem to age), a computer uses computer vision to identify the celebrity in the image.

How? You will get the answer in this lesson!

Topics Covered in the Lesson

  1. Human vision
  2. Computer vision and its applications
  3. How a self-driving car works
  4. Artificial Intelligence extension in PictoBlox
  5. Computer vision blocks in PictoBlox

Key Learning Outcomes

At the end of the lesson, you will be able to:

  1. Use Artificial Intelligence blocks in PictoBlox to identify the following in images:
    1. Brand
    2. Celebrity
    3. Objects
    4. Image Tags
    5. Image Description

The following example shows a project that identifies the object in the image (a person), who that person is (a celebrity, and his name), what he’s wearing, and the brand of his clothes.
Computer Vision

  2. Make AI projects in PictoBlox using computer vision.
  3. Understand how computer vision works.
  4. Understand the application of computer vision in various industries.

Let’s begin!

How Does Human Vision Work?

To understand computer vision, first, we must look at how human vision works.

Human Vision

  1. Capture image: Humans capture images using their eyes. The image is formed on the retina, much as a camera captures an image, but in a very raw format.
    Eye Structure
  2. Identify the objects and their features: The raw image is then transferred to the brain via the optic nerve for processing. The brain starts to identify different objects, such as a candle, a human, or a chair, along with their features, such as size, color, and shape.
  3. Extract information: In this step, our brain compares the features of the object to its past knowledge to gather information. E.g., it can differentiate between your father and your mother because you can distinguish their visual features.
  4. Act: Once you get the higher-level information, you can start acting on it. E.g., if you can identify that a ball is coming toward your face, then you can move aside to avoid being hit by it.

All these steps happen very quickly because the human eye and brain are so well coordinated.

Computer Vision

Computer vision follows an approach similar to human vision.

Computer Vision

Computer vision tasks include methods for acquiring, processing, analyzing, and understanding digital images.

Computer Vision

Example: Self-Driving Car Using Vision

A self-driving car is a vehicle that is capable of sensing its environment and moving safely with little or no human input.

For this example, let us consider that a self-driving car can go forward, turn left, right, or stop. Let’s see how the car would react if a pedestrian comes in front of it.

  1. Acquire: Self-driving cars use cameras to acquire images. They acquire and process images at a very high rate. Let’s consider that our camera has acquired this image:
    Girl Crossing Road
  2. Process: The computer starts to identify all the objects in the image and makes a list of the objects with their positions. In this case, there is something on the road. The computer still has no information about what the object is.
  3. Analyze: The computer then classifies each object into a category. In this case, it identifies the object as a girl. It also tags the object with information such as harmfulness, distance, and other parameters. These tags are the higher-level information used to make a decision.
    Girl Analyse
  4. Act: Based on the higher-level information, the computer can act. In this case, the car will stop.
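
The four stages above can be sketched in plain Python. This is only a toy illustration, not how a real self-driving car is programmed: the `RoadObject` class, the hard-coded camera frame, the `harm_if_hit` and `distance_m` tags, and the 10 m stopping distance are all assumptions made up for this example.

```python
from dataclasses import dataclass

STOP_DISTANCE_M = 10.0  # assumed threshold, chosen only for this example


@dataclass
class RoadObject:
    position: tuple            # where the object is in the image
    category: str = "unknown"  # set during the Analyze stage
    harm_if_hit: bool = False  # hypothetical tag
    distance_m: float = 0.0    # hypothetical tag


def acquire():
    """Acquire: return one camera frame (hard-coded stand-in data)."""
    return [{"position": (120, 80), "label": "girl", "distance_m": 6.0}]


def process(frame):
    """Process: list the objects and their positions, with no category yet."""
    return [RoadObject(position=item["position"]) for item in frame]


def analyze(frame, objects):
    """Analyze: classify each object and attach higher-level tags."""
    for item, obj in zip(frame, objects):
        obj.category = item["label"]
        obj.harm_if_hit = obj.category in {"girl", "pedestrian"}
        obj.distance_m = item["distance_m"]
    return objects


def act(objects):
    """Act: stop if something that could be harmed is close, else go forward."""
    for obj in objects:
        if obj.harm_if_hit and obj.distance_m <= STOP_DISTANCE_M:
            return "stop"
    return "forward"


frame = acquire()
objects = analyze(frame, process(frame))
decision = act(objects)
print(decision)
```

Note that after the Process stage every object still has the category "unknown"; only the Analyze stage gives the car enough information to decide.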

recognized () () () block

recognized () () ()

In feature recognition, you can get the location and other parameters of the recognized celebrity, brand, and object using the recognized () () () block. You can get the following parameters using this block:

recognised ()()()

  1. x position: Reports the x position of the identified object. 
  2. y position: Reports the y position of the identified object.
  3. width: Reports the width of the identified object.
  4. height: Reports the height of the identified object.
  5. confidence: Reports how confident the recognition is, as a value from 0 (less likely) to 1 (more likely).
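
To make these five parameters concrete, here is a small sketch in plain Python (not PictoBlox code). The `Recognized` class, the `confident_only` helper, the 0.5 threshold, and the sample values are all hypothetical; only the field names mirror the options listed above.

```python
from dataclasses import dataclass


@dataclass
class Recognized:
    name: str
    x: float
    y: float
    width: float
    height: float
    confidence: float  # 0 = less likely, 1 = more likely


def confident_only(results, threshold=0.5):
    """Keep only results whose confidence is at or above the threshold."""
    return [r for r in results if r.confidence >= threshold]


# Hypothetical sample values, not real PictoBlox output.
results = [
    Recognized("bus", x=40, y=10, width=180, height=120, confidence=0.92),
    Recognized("person", x=-100, y=-30, width=40, height=90, confidence=0.35),
]

kept = confident_only(results)
print([r.name for r in kept])
```

Filtering on confidence like this is one common way to ignore doubtful detections before drawing anything on screen.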

Activity: Locating Objects

Let’s make a script that draws a bounding box around each identified object.

Identify Location of the Object

We’ll follow this process:

  1. Identifying the objects from the image.
  2. Running the script in a loop for each object.
  3. Making the bounding box for each object based on its X and Y location.

Let’s begin! 

Setting Up the Stage

  1. Open a new project in PictoBlox.
  2. Select evive as your board from the Board tab on the menu bar.
    Choose board
  3. Click the Add Extension button in the bottom left corner.
    add extension button
  4. A modal will open with all the available extensions. Select the Artificial Intelligence extension from the library.
    AI Extension
  5. Download the image from here: https://learn.thestempedia.com/wp-content/uploads/2020/04/Kids-and-Bus.jpg
    Kids and Bus
  6. Upload the image as a backdrop.
    Upload Backdrop
  7. Add a new sprite named Box from the sprite library:
    New Sprite
  8. Delete Tobi, select the Box sprite, and switch to the Code tab.
  9. Add a when flag clicked block in the scripting area.
  10. Snap a hide block from the Looks palette.
  11. Add a recognize () in image from () block and select image features and stage as inputs.
    Object Location 1 

Locating Objects on the Stage

  1. We’ll continue with the same script. Make a variable named Object.
    Variable
  2. Add a set () to () block. Change the variable to Object and set the value as 0.
  3. Add a repeat until () block. Drop an () = () block in the condition.
  4. In the first input, add a recognized () count block and select object from the drop-down. In the second input, add the Object variable. 
    Object Location 2
  5. Inside the loop, add a change () by () block from the Variables palette. Change the variable to Object.
  6. Snap a create clone of () block from the Control palette and select myself from the drop-down. Selecting myself means that you want to clone the same sprite you are writing the script for. 
  7. Add a wait () seconds block in the loop. With this, our main script is ready.
    Object Location 3
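
The logic of this main script can be mirrored in plain Python to show how the loop counts through the objects. This is a sketch, not PictoBlox code: `recognized_object_count` and `create_clone` are hypothetical stand-ins for the recognized () count and create clone of () blocks, the count is fixed at 3, the variable is assumed to change by 1 each time, and the wait is shortened.

```python
import time

clones = []  # records which object number each clone was created for


def recognized_object_count():
    """Hypothetical stand-in for the 'recognized () count' block."""
    return 3


def create_clone(object_number):
    """Hypothetical stand-in for 'create clone of (myself)'."""
    clones.append(object_number)


object_number = 0                                  # set Object to 0
while object_number != recognized_object_count():  # repeat until count = Object
    object_number += 1                             # change Object by 1 (assumed step)
    create_clone(object_number)                    # create clone of myself
    time.sleep(0.01)                               # wait () seconds (shortened here)

print(clones)
```

Because the variable is increased before the clone is created, the clones are made for object numbers 1, 2, 3 and the loop ends as soon as the variable equals the object count.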

Drawing the Bounding Box

Follow the steps below for drawing the bounding box:

  1. Add a when I start as a clone block into the scripting area from the Control palette.
  2. Snap a set size to ()% block below the when I start as a clone block.
  3. Add a recognized () () () block as the input of the set size to ()% block. Change the type to object and option to width. Next, add the Object variable as the input of the object number.
    Object Location 4
  4. Next, add a set x to () block and a set y to () block. As in step 3, add a recognized () () () block as the input of each, choosing x position and y position as the options, as shown below:
    Object Location 6
  5. Snap a show block from the Looks palette.
  6. Add a say () block below the show block.
  7. Add two join () () blocks.
  8. Display the object name and object confidence using the say block as shown below.

Object Location 7
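
The two join () () blocks simply glue pieces of text together, one nested inside the other. A plain Python sketch of the same idea follows; the `join` helper, the " : " separator, and the sample name and confidence are assumptions for illustration, not values taken from the project.

```python
def join(a, b):
    """Mimic the join () () block: concatenate two values as text."""
    return str(a) + str(b)


# Hypothetical values standing in for the recognized object's name and confidence.
name = "bus"
confidence = 0.92

# One join nested inside another, as with the two blocks in the script.
label = join(join(name, " : "), confidence)
print(label)
```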

Your project is complete! Click the green flag to run the script.

Identify Location of the Object

evive Explore
Explore: Add other backdrops from the internet and try to identify objects.

Application of Computer Vision

Computer vision is used for various purposes in a wide range of industries, such as healthcare, agriculture, insurance, and automotive.

  • Self Driving Cars: We have already had a look at self-driving cars and how computer vision helps them navigate.

Self Driving Car

  • Food Industries: Computer vision is used to identify food items and segregate the items using robots. Below is one example.

    Industrial Application

  • Agriculture: Computer vision is used in agriculture for identifying crop diseases and estimating crop production. E.g., in this image, computer vision is used to estimate grape production.

     

    Grape Identification

Next Session

Session 5

Speech Recognition


In case of any difficulty with PictoBlox, please feel free to write to us at support@thestempedia.com

Copyright 2021 – Delhi Government All rights reserved | Managed by Valeur Fabtex Private Limited | Technology Partner – STEMpedia