Zhiding Yu   禹之鼎


Ph.D. Candidate

Dept. of Electrical and Computer Engineering, Carnegie Mellon University
B200 Wing, Hamerschlag Hall, Carnegie Mellon University

5000 Forbes Ave, Pittsburgh, PA 15213

Email: yzhiding AT andrew.cmu.edu

A pdf version of my CV is available here: [PDF]


I am currently a Ph.D. candidate in the Department of Electrical and Computer Engineering, Carnegie Mellon University. Before that, I obtained an M.Phil. degree from the Department of Electrical and Computer Engineering, The Hong Kong University of Science and Technology. I was a computer vision research intern at Adobe in the summer of 2013.

My current research interests focus on learning discriminative unary potentials in graphical models, formulating the relative relationships between unary and pairwise potentials, and incorporating high-level information for scene understanding. I'm also interested in problems related to graphical density estimation / mode seeking and image segmentation.

I am a recipient of the 2009-2010 and 2011-2012 HKTIIT Post-Graduate Excellence Scholarships and the Carnegie Institute of Technology Dean's Tuition Fellowship.

Educational Background

Work Experience

Selected Publications


Journal Papers

Conference Papers

Selected Projects

Multi-Class Object Semantic Segmentation

I am particularly interested in learning and inferring discriminative unary potentials, and in formulating them under more general MRF/CRF models.

Image Selection for PixelTone

During the summer internship in 2013 at Adobe, I developed several new features for the PixelTone Project.

The purpose of this project is mainly to strengthen the image selection and editing abilities of the PixelTone prototype. During the summer, several new features were incorporated:

  • An interactive graph cut and quick matting tool for general purpose image selection.

  • An intelligent content-aware scribbling tool that tolerates scribbling errors, e.g., scribbling over object boundaries.

  • A much more powerful natural language interaction system for natural color identification and indication (e.g., “What color is it? Make it a MILD SKY BLUE and INTENSE PEACH color…”). There are over 2000 terms to describe natural colors! This is handled by the Sedona NLP engine.

Automatic Clear Path Detection for General Motors Autonomous Vehicle

GM building driverless car with Carnegie Mellon

This project falls under the main project between CMU and GM, an effort to develop fully autonomous vehicles for the future.

The purpose of Clear Path Detection (CPD) is to provide useful cues for subsequent vehicle guidance and planning operations. It serves as an important complement to general object detectors and bypasses the complex problem of building detectors for every possible object that could appear.

CPD typically suffers from strong shadows. We propose a shadow-robust CPD scheme formulated under a generalized Markov Random Field framework, where additional parameters are introduced to model the relative relationships between the unary potentials and pairwise potentials. We apply the proposed method to the challenging problem of rear-view CPD with only a monocular, low-quality camera.
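To make the idea of weighting unary against pairwise potentials concrete, here is a minimal sketch of a grid MRF energy. It is an illustration only, not the formulation used in the project: the Potts pairwise term, the single scalar weight `lam`, and the function name `mrf_energy` are all assumptions introduced for this example.

```python
import numpy as np

def mrf_energy(labels, unary, lam=1.0):
    """Energy of a labeling on a 4-connected pixel grid.

    labels: (H, W) integer array with values in {0, ..., K-1}
    unary:  (H, W, K) array; unary[i, j, k] is the cost of label k at (i, j)
    lam:    hypothetical scalar weighting the pairwise (Potts) term
            relative to the unary term
    """
    h, w = labels.shape
    rows, cols = np.indices((h, w))
    # Sum the unary cost of the chosen label at every pixel.
    unary_cost = unary[rows, cols, labels].sum()
    # Potts penalty: count neighboring pixel pairs whose labels differ.
    vertical = (labels[1:, :] != labels[:-1, :]).sum()
    horizontal = (labels[:, 1:] != labels[:, :-1]).sum()
    return float(unary_cost + lam * (vertical + horizontal))
```

A larger `lam` favors smooth labelings, while a smaller `lam` lets the per-pixel unary evidence dominate. A generalized model could replace the single scalar with several learned parameters.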

A preliminary result video is available here.

Tree Embedded Mode Seeking for Manifold Structured Data Clustering and Image Segmentation

We present a novel framework for tree-structure embedded density estimation and its fast approximation for mode seeking. Given any undirected, connected and weighted graph, the density function is defined as a joint representation of the feature space and the distance domain on the graph’s spanning tree. Tree domain mode seeking cannot be directly conducted by traditional mean shift. We therefore address this problem by introducing node shifting with force competition and its fast approximation.
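For context, the traditional mean shift mentioned above can be sketched as follows. This is the standard Euclidean baseline with a Gaussian kernel, not the tree-embedded node shifting method. The function name `mean_shift_mode` and the parameters `bandwidth`, `max_iter` and `tol` are assumptions chosen for this example.

```python
import numpy as np

def mean_shift_mode(points, start, bandwidth=1.0, max_iter=100, tol=1e-6):
    """Follow mean shift updates from `start` to a nearby density mode.

    points: (N, D) array of samples
    start:  (D,) initial position
    """
    pts = np.asarray(points, dtype=float)
    x = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        # Gaussian kernel weights based on distance to the current position.
        sq_dist = ((pts - x) ** 2).sum(axis=1)
        weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
        # The mean shift update moves x to the weighted mean of the samples.
        new_x = (weights[:, None] * pts).sum(axis=0) / weights.sum()
        if np.linalg.norm(new_x - x) < tol:
            return new_x
        x = new_x
    return x
```

Each update needs a weighted average of samples, which is straightforward in a vector space but has no direct counterpart for nodes on a tree.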

This work appears in CVPR 2011.

Automatic Object Segmentation from Large Scale 3D Urban Point Clouds


We present a system that can automatically segment objects in large scale 3D point clouds obtained from urban ranging images. The system consists of three steps: The first one involves a ground detection process that can detect relatively complex terrain and separate it from other objects. The second step superpixelizes the remaining objects to speed up the segmentation process. In the final step, a manifold embedded mode seeking method is adopted to segment the point clouds. Even though the segmentation of urban objects is a challenging problem in terms of accuracy and problem scale, our system can efficiently generate very good segmentation results. The proposed manifold learning effectively improves the segmentation performance due to the fact that continuous artificial objects often have manifold-like structures.

This work appears in ACM-MM 2011.

ELEC 547 Convex Optimization Course Project: Mode Seeking with Convex Shift

To address the limitations of mode seeking image segmentation, we propose several improvements in this project:

1. We show that for mean shift with a linear kernel, each kernel shift can be formulated in a convex form. This can also be generalized to several convex metrics, e.g., KL divergence and Jeffreys divergence.

2. We propose an interactive segmentation algorithm based on mode seeking. We show this problem can be formulated as a convex problem.
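One standard way to see a single mean shift step as a convex problem is sketched below. It is not necessarily the formulation used in the project report. The symbols are assumptions for this illustration: x_i are the samples, and w_i^{(t)} are nonnegative kernel weights held fixed at iteration t, with a positive sum.

```latex
x^{(t+1)}
  = \arg\min_{y} \sum_{i=1}^{n} w_i^{(t)} \, \lVert y - x_i \rVert^2
  = \frac{\sum_{i=1}^{n} w_i^{(t)} x_i}{\sum_{i=1}^{n} w_i^{(t)}},
\qquad w_i^{(t)} \ge 0, \quad \sum_{i=1}^{n} w_i^{(t)} > 0 .
```

The objective is a convex quadratic in y, and setting its gradient to zero recovers the weighted mean. Swapping the squared Euclidean distance for another convex discrepancy keeps each step convex, although the minimizer then generally has no closed form.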

The project report is available here. An improved version of this work later appeared in CVPR 2012.

Honors and Awards


Friends & Collaborators

