- Image acquisition: The camera captures the desk photo.
- Preprocessing: Adjust lighting, normalize colors, resize the image.
- Feature extraction: Detect shapes, edges, and textures of objects.
- Object detection / recognition: Locate and classify objects using trained models.
- Labeling / output generation: Assign names to the detected objects.
So the pipeline involves roughly five major steps.
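
The five steps above can be sketched as a chain of small functions. This is only a toy illustration, not a real vision system. The tiny synthetic "photo", the function names, the thresholds, and the label table are all assumptions made for the example. A real pipeline would read from an actual camera, would also resize and color-correct the image, and would use a trained model where this sketch uses a hand-written rule.

```python
from typing import Dict, List

Image = List[List[float]]  # grayscale image as rows of pixel values


def acquire_image() -> Image:
    """Step 1 (stand-in): return a synthetic 4x6 'desk photo' instead of reading a camera."""
    # Dark background (10) with one bright 2x2 object (200).
    return [
        [10, 10, 10, 10, 10, 10],
        [10, 200, 200, 10, 10, 10],
        [10, 200, 200, 10, 10, 10],
        [10, 10, 10, 10, 10, 10],
    ]


def preprocess(img: Image) -> Image:
    """Step 2: min-max normalize pixel values into the range [0, 1]."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    span = (hi - lo) or 1  # avoid division by zero on a flat image
    return [[(p - lo) / span for p in row] for row in img]


def extract_features(img: Image) -> Dict[str, float]:
    """Step 3: compute two crude features: bright-pixel fraction and horizontal edge count."""
    height, width = len(img), len(img[0])
    bright = sum(p > 0.5 for row in img for p in row)
    edges = sum(
        abs(row[x + 1] - row[x]) > 0.5 for row in img for x in range(width - 1)
    )
    return {"bright_fraction": bright / (height * width), "edge_count": edges}


def detect(features: Dict[str, float]) -> int:
    """Step 4 (hypothetical rule): a hand-written threshold standing in for a trained model."""
    has_object = features["bright_fraction"] > 0.1 and features["edge_count"] >= 2
    return 1 if has_object else 0


# Step 5: map class ids to human-readable names (illustrative labels only).
LABELS = {0: "empty desk", 1: "object present"}


def run_pipeline() -> str:
    """Run all five steps in order and return the final label."""
    img = acquire_image()
    img = preprocess(img)
    features = extract_features(img)
    class_id = detect(features)
    return LABELS[class_id]


if __name__ == "__main__":
    print(run_pipeline())
```

Each function takes the previous step's output as its input, which is the main point of the sketch: the stages are separate and could be swapped out one at a time, for example by replacing `detect` with a call to a trained classifier.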