I finished my PhD in August 2015. I am currently a Research Scientist at Intel Labs. Previously, I was a PhD student in the Electrical and Computer Engineering Department at Carnegie Mellon University, advised by James Hoe and Franz Franchetti. I was affiliated with the CALCM and SPIRAL labs.
My research interests include computer architecture, memory subsystems, 3D-stacked DRAM, architecture modeling and simulation, high-performance and energy-efficient computing systems, hardware acceleration for data-intensive applications, FPGA-based computing, algorithm/architecture co-design, hardware design and implementation, and design automation.
Here are some of the projects I worked on during my PhD:
Near Data Processing Using 3D-stacked DRAM, 2013-ongoing: The key observation is that the internal resources of 3D-stacked DRAM, such as abundant fine-grain parallelism and bandwidth, can be exploited efficiently at high throughput by specialized processing units while staying within the limited power and thermal constraints. In this project, I focus on highly concurrent data-intensive operations, such as data reorganization, layout transformation, reduction, and search, performed in memory using specialized low-power processing elements integrated within the 3D-stacked DRAM. Our initial results, presented at the IEEE High Performance Extreme Computing conference (HPEC'14), demonstrate major improvements over the state of the art and received the Best Paper Presentation award. We are also working on system/software integration solutions that make near-memory computing easily accessible to programmers (WoNDP'14). More details on this project can be found in our follow-up work (ISCA'15 and MICRO'15).
Energy-Efficient and High-Performance Computing Systems (DARPA, PERFECT), 2012-ongoing: Our goal is to enable extremely energy-efficient, high-performance systems that can reach 75 GFLOPS/W. In particular, my efforts include developing experimental computer architectures and application-specific accelerators, hardware/RTL design, performance and power modeling, simulator development, hardware prototyping on FPGAs, and developing tools for rapid design space exploration and automated hardware generation. Our contributions have been published at several conferences (3DIC'13, ASAP'14, ICASSP'14, HPEC'14).
Algorithm/Architecture Co-design for DSP (SPIRAL), 2010-ongoing: This project leverages the SPIRAL formalism to develop efficient architectures and algorithms for DSP applications via design space exploration and automated algorithm/architecture co-optimization. Outcomes of my research include novel FFT algorithms targeting large datasets (ICASSP'14, JSPS'15), detailed hardware design space exploration (ASAP'14), and FPGA-based implementations of the developed DRAM-optimized architectures (FPGA'12, FCCM'12).
Internship at Oracle, Santa Clara, CA, May-Aug 2012: Worked in the architecture and performance modeling team. Designed and evaluated various mechanisms for the memory subsystem of next-generation SPARC processors, focusing on improving DRAM performance as well as fairness among multiple applications running in parallel.