Keshav Seshadri

Doctoral Research Projects

A Unified Framework for Pose, Expression, and Occlusion Tolerant Automatic Facial Landmark Localization

My most recent work involves developing a robust facial landmark localization (facial alignment) algorithm that can handle simultaneous variations in facial pose and expression while also dealing with varying levels of facial occlusion. My current algorithm not only performs more accurately than several state-of-the-art facial alignment algorithms on challenging real-world datasets, but also provides a measure of confidence in the fitted landmarks, i.e., it provides performance feedback in the form of occlusion/misalignment labels for each fitted landmark. Such an algorithm should be useful to any facial analysis system, and its design generalizes to future tasks that need to harness local textural information around key facial landmarks.

Fusing Shape and Texture Information for Improving Facial Recognition

Conventional face recognition algorithms simply crop out the facial region, extract features from the crop, and match based on these features. We took a different approach in order to harness more information from images and hence boost facial recognition rates. Using our dense landmarking scheme, we performed experiments in which the shape information contained purely in the landmark locations was harnessed, while shape-free texture information was harnessed separately by warping all faces to a common mean face. Purely shape-based features and purely texture-based features were then used for matching, and the results were subsequently fused. Our results demonstrated that 1) purely shape-based recognition is fairly discriminative, 2) significant performance improvement can be gained by better registration of texture using shape information, and 3) fusing shape-based and texture-based methods consistently boosts the performance of subspace-based approaches such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). To the best of our knowledge, this is the first large-scale study of the roles of shape and texture in facial recognition, as we carried out our tests on the large and challenging NIST Face Recognition Grand Challenge (FRGC) ver 2.0 database.
[Paper on IEEE Xplore] [Slides] [Poster]
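
The summary above does not say how the shape-based and texture-based results were fused, so the following is only a minimal sketch of one common option: score-level fusion with min-max normalization and a weighted sum. The function names, the weight, and the example scores are all hypothetical and are not taken from the paper.

```python
def min_max_normalize(scores):
    """Rescale similarity scores to [0, 1] so two matchers become comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores identical: no ranking information to preserve.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(shape_scores, texture_scores, shape_weight=0.5):
    """Weighted sum of normalized scores (assumed fusion rule, not the paper's)."""
    shape_n = min_max_normalize(shape_scores)
    texture_n = min_max_normalize(texture_scores)
    return [
        shape_weight * s + (1.0 - shape_weight) * t
        for s, t in zip(shape_n, texture_n)
    ]


def best_match(scores):
    """Index of the gallery subject with the highest similarity score."""
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical similarity scores of one probe against three gallery subjects.
shape_scores = [0.2, 0.9, 0.8]       # shape-only matcher
texture_scores = [10.0, 30.0, 50.0]  # texture-only matcher, different scale
fused = fuse_scores(shape_scores, texture_scores)
print(best_match(shape_scores), best_match(fused))
```

In this made-up example the shape-only matcher prefers subject 1, while the fused score prefers subject 2 because the texture matcher favors it strongly.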

Automatic Facial Landmark Tracking in Videos using Kalman Filter Assisted ASMs

A colleague of mine (Utsav Prabhu) and I applied ASMs to the task of tracking a dense set of 79 facial landmarks across the frames of video sequences in which the subject showed rapid in-plane rotation and out-of-plane pose variation. This task was accomplished by Kalman filtering the landmark coordinates. The predictive mechanism of the Kalman filter ensured accurate initialization of the ASM on the next frame (without the need for face detection), while its corrective mechanism treated the landmark locations produced by the ASM as noisy observations that were refined to produce more accurate results. We benchmarked our approach against naive methods that 1) did not harness any temporal information and treated each frame independently, or 2) initialized the ASM on frame n+1 using the fitting results on frame n but without Kalman filtering, and found that our approach produced far lower fitting error. The applications of this work include tracking and facial recognition in surveillance footage. Our system is not currently capable of operating in real time but can be used for post-processing.
[Paper on Springer] [Slides]
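
The predict/correct loop described above can be illustrated with a small sketch. It assumes a constant-velocity motion model and filters each landmark coordinate independently. The class name, the noise values, and the model itself are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class CoordinateKalman:
    """Constant-velocity Kalman filter for one landmark coordinate (x or y)."""
    pos: float        # estimated position in pixels
    vel: float = 0.0  # estimated velocity in pixels per frame
    p00: float = 1.0  # entries of the symmetric 2x2 state covariance
    p01: float = 0.0
    p11: float = 1.0
    q: float = 0.01   # process noise added to each diagonal entry (assumed)
    r: float = 4.0    # variance of the ASM fitting noise (assumed)

    def predict(self) -> float:
        """Advance one frame; the result could seed the ASM on the next frame."""
        # State transition F = [[1, 1], [0, 1]], covariance P = F P F^T + Q.
        self.pos += self.vel
        p00 = self.p00 + 2.0 * self.p01 + self.p11 + self.q
        p01 = self.p01 + self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p11 = p00, p01, p11
        return self.pos

    def correct(self, measured: float) -> float:
        """Refine the prediction using the ASM output as a noisy observation."""
        innovation = measured - self.pos
        s = self.p00 + self.r  # innovation variance for H = [1, 0]
        k0 = self.p00 / s      # Kalman gain for position
        k1 = self.p01 / s      # Kalman gain for velocity
        self.pos += k0 * innovation
        self.vel += k1 * innovation
        # Covariance update P = (I - K H) P.
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p11 = p00, p01, p11
        return self.pos


def track(kf, measurements):
    """Run predict then correct for every frame and return the refined positions."""
    refined = []
    for measured in measurements:
        kf.predict()  # in a tracker, the ASM would be initialized here
        refined.append(kf.correct(measured))
    return refined


# Hypothetical landmark coordinate drifting by 2 pixels per frame.
observations = [10.0 + 2.0 * t for t in range(200)]
kf = CoordinateKalman(pos=10.0)
refined = track(kf, observations)
```

A full tracker would run one such filter (or a joint filter) over all landmark coordinates and pass each prediction to the ASM as its starting shape.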

Automatic Facial Landmarking using Active Shape Models (ASMs)

Active Shape Models (ASMs) are deformable templates that can be trained to automatically position a pre-defined set of landmarks along the contours of an object of interest. They are most commonly used for facial landmarking and, if initialized well using a suitable face detector, produce fairly accurate results. My initial work involved developing a Modified Active Shape Model (MASM) implementation to automatically localize a dense set of 79 landmarks on frontal faces. MASM formed the bedrock for many projects and prototypes developed at the CMU CyLab Biometrics Center, including a long-range iris recognition system, facial recognition algorithms, periocular-based recognition algorithms, 3D facial modeling using Generic Elastic Models (3D-GEM), single-image-based super-resolution techniques, gender and ethnicity classifiers, age estimation algorithms, and facial hair segmentation algorithms.
[Paper on IEEE Xplore] [Slides]
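
To show what "deformable template" means in practice, here is a minimal sketch of the shape constraint step from the classic ASM formulation: a candidate shape is projected onto the PCA modes of the training shapes, and each coefficient is clamped to a few standard deviations. This is the textbook step, not the MASM modifications. The function name and the toy model below are hypothetical.

```python
import math


def constrain_shape(shape, mean_shape, modes, eigenvalues, limit=3.0):
    """Project a shape onto PCA modes and clamp each coefficient.

    shape, mean_shape: flattened landmark coordinates [x1, y1, x2, y2, ...]
    modes: orthonormal PCA eigenvectors, one list per mode
    eigenvalues: variance captured by each mode
    limit: allowed deviation in standard deviations (3 is a common choice)
    """
    diff = [s - m for s, m in zip(shape, mean_shape)]
    coeffs = []
    for mode, eigenvalue in zip(modes, eigenvalues):
        b = sum(p * d for p, d in zip(mode, diff))  # projection onto the mode
        bound = limit * math.sqrt(eigenvalue)
        coeffs.append(max(-bound, min(bound, b)))   # keep the shape plausible
    constrained = list(mean_shape)
    for mode, b in zip(modes, coeffs):
        for i, p in enumerate(mode):
            constrained[i] += b * p                 # rebuild from the modes
    return constrained, coeffs


# Toy model: two landmarks (four coordinates) and two modes of variation.
mean_shape = [0.0, 0.0, 0.0, 0.0]
modes = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
eigenvalues = [4.0, 1.0]
candidate = [10.0, 1.0, 5.0, 0.0]
constrained, coeffs = constrain_shape(candidate, mean_shape, modes, eigenvalues)
print(constrained, coeffs)
```

In the toy example the first coefficient is clamped from 10 to 6 (three standard deviations of a mode with variance 4), and the third coordinate is discarded because no mode spans it.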

Graduate Course Related Projects

A Local Approach to Face Recognition - Machine Learning (10-701) - Fall 2009 Course Project

As part of a machine learning course, Utsav and I worked on a local approach to face recognition. Small local patches were built around key facial landmarks, and features were extracted from these patches using Gabor filters. A multi-class SVM was then used to classify images of the various people in our test datasets (we used subsets of the Multiple Biometrics Grand Challenge (MBGC) still face challenge database and the CMU Multi-PIE (MPIE) database). Our results were fairly promising, but more work is needed to determine the optimal patch sizes and the best Gabor filters, and to investigate alternative feature extraction and classification methods.
[Paper] [Poster]
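
As a rough illustration of extracting Gabor features from a patch around a landmark, the sketch below builds the real part of a Gabor kernel and records the magnitude of the patch response at several orientations. The patch size, wavelength, sigma, aspect ratio, and number of orientations are assumed values, since the summary above does not list the parameters used in the project. The SVM classification stage is omitted.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Real part of a Gabor kernel as a size x size grid (size should be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the sinusoid varies along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(
                -(xr * xr + (gamma * yr) ** 2) / (2.0 * sigma * sigma)
            )
            row.append(envelope * math.cos(2.0 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


def patch_response(patch, kernel):
    """Inner product of a patch with a kernel of the same size."""
    return sum(
        p * k
        for patch_row, kernel_row in zip(patch, kernel)
        for p, k in zip(patch_row, kernel_row)
    )


def gabor_features(patch, wavelength=4.0, sigma=2.0, orientations=4):
    """Response magnitude at evenly spaced orientations (assumed settings)."""
    size = len(patch)
    return [
        abs(patch_response(
            patch,
            gabor_kernel(size, wavelength, i * math.pi / orientations, sigma),
        ))
        for i in range(orientations)
    ]


# Synthetic 9x9 patch of vertical stripes with a 4-pixel period.
stripes = [
    [math.cos(2.0 * math.pi * x / 4.0) for x in range(-4, 5)]
    for _ in range(9)
]
features = gabor_features(stripes)
```

For the striped patch the strongest response comes from the orientation whose sinusoid varies horizontally (index 0). In a pipeline like the one described, feature vectors of this kind from every landmark patch would be concatenated and passed to the classifier.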

Performance of Face Recognition Algorithms on Blurred and Partially Occluded Images - Pattern Recognition Theory (18-794) - Spring 2008 Course Project

I was part of a group that compared the effectiveness of several face recognition algorithms when applied to partially occluded or blurred images. We blurred several images from the PIE database using disk blurring, Gaussian blurring, motion blurring, etc., occluded (blacked out) different portions of the images, and used these corrupted images for facial recognition. Our findings showed that Minimum Average Correlation Energy (MACE) filters were best able to deal with occluded data but did not perform as well with blurred data. LDA showed reasonable performance under both conditions and could also be used for such tasks with suitable enhancements such as Kernel LDA (KLDA).
[Paper] [Slides]

Streaming Video over Wireless Networks for Eye Movement Monitoring - Wireless Networks Course (18-759) - Spring 2008 Course Project

I was part of a team that put together a system that could wirelessly transmit images captured by a camera at 30 fps to a receiver at distances of up to 500 metres with a throughput of around 700 kbps. Our project used a Point Grey Research (PGR) Firefly camera and the Real-time Transport Protocol (RTP) for this purpose.
[Paper] [Slides]