Prof. M. Salameh/Vision

  1. Name three applications of Computer Vision you can recognize on your phone.

    Three common phone features that rely on computer vision are Face ID (facial recognition for unlocking), AR filters in social media apps, and automatic photo organization, which groups pictures by the objects and people it recognizes.

  2. Take a photo of your desk and ask AI to identify or name all the objects.

    Teacup and stacked saucers on a desk
    1. How many steps are involved in this identification process?

      Five steps are involved:

      1. Image Acquisition: The picture is taken by the phone's camera.
      2. Pre-processing: The image is normalized (for example, through noise removal and color correction).
      3. Object Detection: An algorithm (for example, a convolutional neural network) analyzes the image to locate objects and draw bounding boxes around them.
      4. Classification: Each detected object is assigned to a category (e.g., "cup," "plate"); the number of items and their arrangement can also be noted.
      5. Output: The findings are shown to the user (bounding boxes, text labels, etc.).
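
The five steps above can be sketched on a toy example. This is only an illustration, not how the AI platform actually works: the "photo" is a tiny synthetic grayscale grid, detection is done with thresholding and connected regions instead of a convolutional neural network, and the `classify` rule (wide, flat regions are plates) is a hypothetical stand-in for a trained classifier. All function names are made up for this sketch.

```python
from collections import deque

def preprocess(image, threshold=0.5):
    """Step 2: scale 0-255 pixel values to [0, 1], then binarize."""
    return [[1 if px / 255 >= threshold else 0 for px in row] for row in image]

def detect(mask):
    """Step 3: find 4-connected foreground regions and return their
    bounding boxes as (top, left, bottom, right), all inclusive."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if mask[r][c] == 0 or seen[r][c]:
                continue
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

def classify(box):
    """Step 4: hypothetical rule standing in for a trained classifier."""
    top, left, bottom, right = box
    height, width = bottom - top + 1, right - left + 1
    return "plate" if width >= 2 * height else "cup"

def run_pipeline(image):
    """Steps 2 to 5 chained; step 1 (acquisition) is the `image` argument."""
    mask = preprocess(image)
    return [(classify(box), box) for box in detect(mask)]

# Step 1: synthetic "photo" (0 = dark desk, 200+ = bright object)
image = [
    [0,   0,   0,   0,   0,   0,   0, 0],
    [0, 220, 220,   0,   0,   0,   0, 0],
    [0, 220, 220,   0,   0,   0,   0, 0],
    [0, 220, 220,   0,   0,   0,   0, 0],
    [0,   0,   0,   0,   0,   0,   0, 0],
    [0,   0, 210, 210, 210, 210, 210, 0],
    [0,   0,   0,   0,   0,   0,   0, 0],
]

# Step 5: show the findings
results = run_pipeline(image)
for label, box in results:
    print(label, box)
# cup (1, 1, 3, 2)
# plate (5, 2, 5, 6)
```
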
    2. Can you identify any challenges for the AI?

      Occlusion: The model may struggle to determine the precise count, especially when objects are stacked, as the saucers are in our image. It could perceive them as a single object.

      Lighting: Poor lighting could distort the visual features the model relies on.

      Similarity: If the model is not very accurate, small shared details (for example, the golden rim on both the cup and the saucers) could lead to incorrect classification, because the model may find it difficult to tell the objects apart.
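
One possible way the occlusion problem can arise is sketched below. Many object detectors apply non-maximum suppression (NMS), which discards boxes that overlap a higher-scoring box too much. Whether the AI used in this exercise does so is an assumption; the box coordinates, scores, and the 0.5 threshold are all made up for illustration. Boxes are given as (x1, y1, x2, y2).

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def non_max_suppression(detections, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop every remaining box
    whose IoU with it exceeds the threshold, then repeat."""
    remaining = sorted(detections, key=lambda d: d[1], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [d for d in remaining if iou(best[0], d[0]) <= iou_threshold]
    return kept

# Hypothetical detections as (box, confidence score)
stacked_saucers = [
    ((10, 50, 110, 70), 0.90),  # top saucer
    ((10, 53, 110, 73), 0.85),  # saucer beneath, shifted 3 px down
    ((10, 56, 110, 76), 0.80),  # lowest saucer, shifted 6 px down
]
teacup = [((150, 20, 210, 80), 0.95)]

kept = non_max_suppression(stacked_saucers + teacup)
print(len(kept))  # 2: the teacup and only one of the three saucers
```

Because the stacked saucers produce nearly identical boxes, two of the three are suppressed, which matches the miscount described under Occlusion.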

    3. Ask AI to highlight or circle a specific item on your desk. Describe the results and possible challenges.

      After the platform was instructed to do so, it drew a bounding box around the teacup. However, the following challenges arose:

      Precision: The circle is slightly misaligned around the handle.

      Shape: The highlighting is not entirely accurate, probably because the cup has an irregular shape.

      Ambiguity: The circle on the cup is drawn incorrectly, possibly because the platform did not fully understand the request.
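
The precision and shape issues can be put into numbers with a small sketch. It assumes a hypothetical 20 x 20 binary mask of a cup (a square body with a handle sticking out on the right) and a circular highlight centered on the body; none of these values come from the actual photo. It measures how much of the cup the circle covers and how much of the circle actually lies on the cup.

```python
def cup_mask(size=20):
    """Hypothetical object mask: 10 x 10 body plus a 4 x 4 handle."""
    mask = [[0] * size for _ in range(size)]
    for y in range(5, 15):
        for x in range(3, 13):   # cup body
            mask[y][x] = 1
    for y in range(8, 12):
        for x in range(13, 17):  # handle on the right side
            mask[y][x] = 1
    return mask

def circle_mask(size, cx, cy, radius):
    """Binary mask of a filled circle, standing in for the AI's highlight."""
    return [[1 if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2 else 0
             for x in range(size)] for y in range(size)]

def coverage(obj, highlight):
    """Return (recall, precision):
    recall    = share of object pixels inside the highlight,
    precision = share of highlight pixels that lie on the object."""
    overlap = sum(o & h for row_o, row_h in zip(obj, highlight)
                  for o, h in zip(row_o, row_h))
    obj_area = sum(map(sum, obj))
    highlight_area = sum(map(sum, highlight))
    return overlap / obj_area, overlap / highlight_area

obj = cup_mask()
highlight = circle_mask(20, cx=7.5, cy=9.5, radius=6)  # centered on the body

recall, precision = coverage(obj, highlight)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

Both values come out below 1: the circle misses most of the handle and the corners of the body, while also covering some of the desk. A highlight that followed the cup's outline exactly would score 1.0 on both.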

  3. Several applications involve decision-making processes that rely on Computer Vision.

    Name some applications that you believe are not trustworthy or safe.
