• Hi!
    I'm Ching-Yi

    Hi, I'm Ching-Yi Lin. I'm a third-year Ph.D. student in the ICBio Lab, advised by Professor Marc Dandin.

    My research interests lie at the intersection of Machine Learning and Circuit Design. I am currently in the Electrical and Computer Engineering Department at Carnegie Mellon University.

    Full CV 1-page CV

Education

Ph.D. at Carnegie Mellon University
2018-Present
Electrical and Computer Engineering
Advised by Radu Marculescu (2018-2019), Daniel Bankman (2020), Marc Dandin (2020-Present)

M.S. at National Tsing Hua University (Incomplete)
2017-2018
Electrical Engineering
Advised by Jing-Jia Liou

B.S. at National Tsing Hua University
2013-2017
Electrical Engineering
Class Representative

Publications

Switched-Capacitor SRAM-Based In-Memory Computing for 2-bit Quantized Neural Networks

Ching-Yi Lin, Daniel Bankman
Asilomar Conference on Signals, Systems, and Computers 2020 (Asilomar 2020)
Abstract accepted

Model Personalization for Human Activity Recognition

Ching-Yi Lin, Radu Marculescu
The Fourth International Workshop on Smart Edge Computing and Networking (SmartEdge 2020)
[Paper]

Memory- and Communication-Aware Model Compression for Distributed Deep Learning Inference on IoT

Kartikeya Bhardwaj, Ching-Yi Lin, Anderson Sartor, Radu Marculescu
ACM Transactions on Embedded Computing Systems (TECS 2019)
[Paper]

Experience

Mixed-signal quantized neural network (MSQNN)

Keywords: Mixed-signal circuit, Circuit non-ideality, Deep learning (Quantized NN)
MSQNN is a project, led by Professor Daniel Bankman, to build a mixed-signal neural network chip in TSMC 28-nm CMOS. We replaced the multi-stage N-to-1 digital adders with switched capacitors operating in the analog domain. Our design also accounts for circuit non-idealities, including capacitor mismatch, thermal noise, and comparator offset. To match the analog datapath, we also trained quantized neural networks so that inference uses only integer operations.
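As a rough illustration of the idea only (not the chip's actual circuit or parameters), the Python sketch below models a switched-capacitor sum of one-bit inputs followed by a comparator. The function name `sc_sum`, the Gaussian mismatch and noise models, and all default values are assumptions made for this example.

```python
import random

def sc_sum(bits, sigma_mismatch=0.0, sigma_noise=0.0, offset=0.0, vdd=1.0, rng=None):
    """Toy model: charge-sharing sum of one-bit inputs, then a comparator.

    bits            list of 0/1 values; each one charges a unit capacitor to 0 or vdd
    sigma_mismatch  relative std. dev. of each unit capacitor (assumed Gaussian)
    sigma_noise     std. dev. of additive voltage noise (assumed Gaussian)
    offset          comparator input offset in volts
    """
    if rng is None:
        rng = random.Random(0)
    # Each unit capacitor deviates from its nominal value of 1.0.
    caps = [1.0 + rng.gauss(0.0, sigma_mismatch) for _ in bits]
    # After charge sharing, the node settles to the capacitance-weighted mean.
    v_shared = vdd * sum(c * b for c, b in zip(caps, bits)) / sum(caps)
    # Thermal noise is lumped into one additive voltage term.
    v_shared += rng.gauss(0.0, sigma_noise)
    # The comparator asks whether more than half of the inputs were 1.
    decision = 1 if v_shared > vdd / 2 + offset else 0
    return v_shared, decision
```

With all non-idealities set to zero, the shared voltage is `vdd` times the fraction of inputs equal to 1, so the comparator reproduces the digital majority decision. Raising `sigma_mismatch`, `sigma_noise`, or `offset` shows how decisions near the threshold can flip.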

I majored in computer architecture during my master's studies and served as a TA for Advanced Computer Architecture.

Verilog

85%

SimpleScalar [Site]

75%

RISC-V rocketchip [Site]

75%

Chisel3 [Site]

75%

gem5 [Site]

70%

Open Virtual Platform [Site]

70%

Vivado HLS [Site]

65%

Eurobot 2016

An international robotics competition
June, 2016

IEEE SPCUP 2017

TA of Embedded System Lab

C

95%

C++

85%

Python

85%

RISC-V Assembly

85%

HTML

65%

Course Project

Hardware-aware Distributed Learning Algorithm

Course: 18-755 Network in the Real World

We built a heterogeneous system from 8 Raspberry Pis and ran distributed machine learning on this platform. Specifically, neighboring Raspberry Pis communicate over a wired network, while remote nodes communicate wirelessly.
Paper link: Here

Fine-grain Data Selection in Semi-supervised Learning

Course: 10-701 Introduction to Machine Learning

We modified an existing self-training algorithm and used clustering to improve the accuracy of its newly assigned labels.
Paper link: Here

FitFeet Smart Shoe System

Course: 18-651 Networked Cyber-Physical Systems

We outline an algorithm for a smart shoe sole. The application is intended as a health monitoring system that generates valuable data on athletic activities such as walking and running. Because of its location on the body, the device can capture walking patterns better than current market solutions.
Paper link: Here

Group-aware Cache Coherence Protocol

Course: 18-742 Computer Architecture & Systems

Most cache coherence protocols track only global state. With the G-flag, which indicates a reduced coherence domain, and a modified hierarchical interconnect, we reduce the scalability cost of cache miss latency by 22% in PARSEC-3.0's blackscholes program. This group-aware technique can alleviate the effect of coherence communication overhead by describing the sharing status in more detail.
Paper link: Download here
Course presentation: slides

Triple-DES Optimization on a CPU

Course: 18-645 How to Write Fast Code

We optimized a decryption algorithm (Triple DES) for the parameters of our target processor, Intel Broadwell, including cache access time, cache size, and the number of functional units. With table merging and operation parallelization, this work improves execution time by 10%. (It is still a single-threaded program.)
Paper link: Here
Course presentation: 6-page slides
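To give a flavor of what table merging means in general, here is a minimal Python sketch, not the project's code. It folds a substitution step and a bit permutation into precomputed lookup tables, so the per-byte work drops to two table reads and one OR. The tables `SBOX0`, `SBOX1`, and `PERM` are made-up toy values, not the real DES tables, and all function names are hypothetical.

```python
# Toy 4-bit substitution tables and an 8-bit permutation (illustrative only).
SBOX0 = [(7 * x + 3) % 16 for x in range(16)]
SBOX1 = [(5 * x + 9) % 16 for x in range(16)]
PERM = [3, 7, 0, 4, 6, 1, 5, 2]  # output bit i takes input bit PERM[i]

def permute(x):
    """Apply the bit permutation PERM to an 8-bit value."""
    out = 0
    for i, src in enumerate(PERM):
        out |= ((x >> src) & 1) << i
    return out

def round_naive(byte):
    """Substitute each nibble, then permute the combined byte bit by bit."""
    hi, lo = byte >> 4, byte & 0xF
    return permute((SBOX0[hi] << 4) | SBOX1[lo])

# Merged tables: the permutation is applied once, ahead of time, to every
# possible S-box output already shifted into its final position.
SP0 = [permute(SBOX0[x] << 4) for x in range(16)]
SP1 = [permute(SBOX1[x]) for x in range(16)]

def round_merged(byte):
    """Same result as round_naive, using two lookups and one OR."""
    return SP0[byte >> 4] | SP1[byte & 0xF]
```

The merge is valid because a bit permutation distributes over OR, so permuting each shifted S-box output separately and OR-ing the results equals permuting the combined value.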

Sports

Tennis

I'm a tennis player. Not a powerful one, but patient and smart. I was the team leader of the NTHUEE tennis team in 2014.

Darts

I was a member of the NTHU Darts club.
Dartslive2 profile
- 01: 65.3
- Cricket: 2.53