I am a PhD student in the Electrical and Computer Engineering Department at Carnegie Mellon University. My advisors are James Hoe and Franz Franchetti. I am currently affiliated with the CALCM and SPIRAL labs.
My research interests include computer architecture, memory subsystems, 3D-stacked DRAM, architecture modeling and simulation, high-performance/energy-efficient computing systems, hardware acceleration for data-intensive applications, FPGA-based computing, algorithm/architecture co-design, hardware design/implementation (RTL/Verilog), and hardware design automation.
Here are a few ongoing and completed projects:
Near Data Processing Using 3D-stacked DRAM, 2013-ongoing: The key observation is that specialized processing units can exploit the internal resources of 3D-stacked DRAM, such as abundant fine-grain parallelism and bandwidth, efficiently and at high throughput while staying within the limited power/thermal constraints. In this project, I focus on highly concurrent data-intensive operations, such as data reorganization, layout transformation, reduction, and search, performed in memory using specialized low-power processing elements integrated within the 3D-stacked DRAM. Our initial results, presented at the IEEE High Performance Extreme Computing conference (HPEC'14), demonstrate major improvements over the state of the art and received the Best Paper Presentation award. We are also working on system/software integration solutions to make near-memory computing easily accessible to programmers (WoNDP'14).
Energy-Efficient and High-Performance Computing Systems (DARPA, PERFECT), 2012-ongoing: Our goal is to enable extremely energy-efficient, high-performance systems that can reach 75 GFLOPS/W. In particular, my efforts include developing experimental computer architectures and application-specific accelerators, hardware/RTL design, performance/power modeling, simulator development, hardware prototyping on FPGAs, and developing tools for rapid design space exploration and automated hardware generation. Our contributions have been published at several international conferences (3DIC'13, ASAP'14, ICASSP'14, HPEC'14).
Algorithm/Architecture Co-design for DSP (SPIRAL), 2010-ongoing: This project leverages the SPIRAL formalism to develop efficient architectures and algorithms for DSP applications via design space exploration and automated algorithm/architecture co-optimization. Outcomes of my research include novel FFT algorithms targeting large datasets (ICASSP'14, JSPS'15), detailed hardware design space exploration (ASAP'14), and FPGA-based implementations of the developed DRAM-optimized architectures (FPGA'12, FCCM'12).
Internship at Oracle, Santa Clara, CA, May-Aug 2012: Worked in the architecture and performance modeling team. Designed and evaluated various mechanisms for the memory subsystem of the next-generation SPARC processors, focusing on improving DRAM performance as well as fairness among multiple applications running in parallel.