Debidatta Dwibedi

I am a Master's student at the Robotics Institute, CMU, where I am advised by Martial Hebert. I completed my undergraduate degree at IIT Kanpur, where I was advised by Amitabha Mukerjee.

Email / CV / GitHub / LinkedIn / Twitter

Research

My research is in computer vision and machine learning. I am particularly interested in inferring 3D properties of objects from images. In the past, I have worked on object detection, pose estimation, reinforcement learning, game rule learning, and image segmentation.

Deep Cuboid Detection: Beyond 2D Bounding Boxes
Debidatta Dwibedi, Tomasz Malisiewicz, Vijay Badrinarayanan and Andrew Rabinovich
arXiv preprint, 2016

A deep cuboid detector that finds box-like objects in scenes and localizes their corners.

paper | abstract | bibtex

We present a Deep Cuboid Detector which takes a consumer-quality RGB image of a cluttered scene and localizes all 3D cuboids (box-like objects). Contrary to classical approaches which fit a 3D model from low-level cues like corners, edges, and vanishing points, we propose an end-to-end deep learning system to detect cuboids across many semantic categories (e.g., ovens, shipping boxes, and furniture). We localize cuboids with a 2D bounding box, and simultaneously localize the cuboid's corners, effectively producing a 3D interpretation of box-like objects. We refine keypoints by pooling convolutional features iteratively, improving the baseline method significantly. Our deep learning cuboid detector is trained in an end-to-end fashion and is suitable for real-time applications in augmented reality (AR) and robotics.

@article{dwibedi2016deep, title={Deep Cuboid Detection: Beyond 2D Bounding Boxes}, author={Dwibedi, Debidatta and Malisiewicz, Tomasz and Badrinarayanan, Vijay and Rabinovich, Andrew}, journal={arXiv preprint}, year={2016} }
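
For readers curious about the mechanics, below is a toy sketch of the kind of head described in the abstract above: a small network on top of RoI-pooled features that regresses a 2D box refinement together with the image coordinates of the cuboid's eight corners. The layer sizes, the CuboidHead name, and the simplified refinement loop are illustrative choices, not the exact architecture from the paper.

import torch
import torch.nn as nn

class CuboidHead(nn.Module):
    """Illustrative cuboid head: 2D box refinement plus 8 corner locations."""

    def __init__(self, in_dim=1024, hidden=512, num_iters=2):
        super().__init__()
        self.num_iters = num_iters  # number of refinement passes (illustrative)
        self.fc = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.box_delta = nn.Linear(hidden, 4)   # refinement of the 2D bounding box
        self.corner_xy = nn.Linear(hidden, 16)  # (x, y) for each of the 8 cuboid corners

    def forward(self, roi_feats):
        # roi_feats: (N, in_dim) features pooled from N region proposals.
        h = self.fc(roi_feats)
        box = self.box_delta(h)
        corners = self.corner_xy(h).view(-1, 8, 2)
        # The paper refines keypoints by iteratively re-pooling features around
        # the refined box; as a stand-in, this sketch simply reuses the same features.
        for _ in range(self.num_iters - 1):
            h = self.fc(roi_feats)
            corners = self.corner_xy(h).view(-1, 8, 2)
        return box, corners

if __name__ == "__main__":
    head = CuboidHead()
    feats = torch.randn(3, 1024)                  # features for 3 proposals
    box_deltas, corner_coords = head(feats)
    print(box_deltas.shape, corner_coords.shape)  # torch.Size([3, 4]) torch.Size([3, 8, 2])
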

Characterizing Predicate Arity and Spatial Structure for Inductive Learning of Game Rules
Debidatta Dwibedi and Amitabha Mukerjee
ECCV 2014 Workshop on Computer Vision + Ontology Applied Cross-Disciplinary Technologies, 2014

Inducing the rules of games by observing people playing games and solving puzzles in Kinect videos.

paper | abstract | bibtex | video

Where do the predicates in a game ontology come from? We use RGBD vision to learn a) the spatial structure of a board, and b) the number of parameters in a move or transition. These are used to define state-transition predicates for a logical description of each game state. Given a set of videos for a game, we use an improved 3D multi-object tracking to obtain the positions of each piece in games such as 4-peg solitaire or Towers of Hanoi. The spatial positions occupied by pieces over the entire game are clustered, revealing the structure of the board. Each frame is represented as a Semantic Graph with edges encoding spatial relations between pieces. Changes in the graphs between game states reveal the structure of a “move”. Knowledge from spatial structure and semantic graphs is mapped to FOL descriptions of the moves and used in an Inductive Logic framework to infer the valid moves and other rules of the game. Discovered predicate structures and induced rules are demonstrated for several games with varying board layouts and move structures.

@inproceedings{dwibedi2014characterizing, title={Characterizing Predicate Arity and Spatial Structure for Inductive Learning of Game Rules}, author={Dwibedi, Debidatta and Mukerjee, Amitabha}, booktitle={European Conference on Computer Vision}, pages={323--338}, year={2014}, organization={Springer} }
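
To make the move-extraction step in the abstract concrete, here is a toy Python sketch of the idea: reduce each frame to piece positions, then diff consecutive states to produce candidate move predicates, which an ILP system would then generalize into rules. The board coordinates and the tuple format are illustrative, not the representation used in the paper.

def extract_moves(states):
    """states: list of dicts mapping a piece id to its board cell (row, col)."""
    moves = []
    for prev, curr in zip(states, states[1:]):
        for piece, cell in curr.items():
            if piece in prev and prev[piece] != cell:
                # Ground fact for one observed transition: move(piece, from_cell, to_cell).
                moves.append(("move", piece, prev[piece], cell))
    return moves

if __name__ == "__main__":
    # Two observed states of a toy peg board: peg1 moves from (0, 0) to (0, 1).
    frames = [
        {"peg1": (0, 0), "peg2": (2, 2)},
        {"peg1": (0, 1), "peg2": (2, 2)},
    ]
    print(extract_moves(frames))  # [('move', 'peg1', (0, 0), (0, 1))]
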

Other Projects

Some other unpublished work:

Playing Games with Deep Reinforcement Learning

Towards Pose Estimation of 3D Objects in Monocular Images via Keypoint Detection

HandNet: Using Faster R-CNN to Detect Hands in Egocentric Videos

A Grounded Framework for Gestures and its Applications


this guy's webpage is awesome