Princeton University

School of Engineering & Applied Science

Data Access Optimization in Accelerator-Oriented Heterogeneous Architecture through Decoupling and Memory Hierarchy Specialization

Speaker: 
Tae Jun Ham
Advisor: 
Prof. Martonosi
Location: 
Engineering Quadrangle J401
Date/Time: 
Monday, May 21, 2018 - 10:00am to 11:30am

Abstract
For the past fifty years, Moore's Law and Dennard Scaling have driven improvements in both the performance and the energy efficiency of computer systems. Unfortunately, neither is likely to continue, and computers no longer benefit from technology scaling as much as they did in the past. Recently, specialized hardware accelerators have emerged as a promising alternative to general-purpose computing because of their potential to deliver orders-of-magnitude improvements in speed and energy efficiency on compute-intensive applications. However, achieving the full potential of accelerators on data-intensive applications remains a challenge, since the bottleneck of such applications lies not in computation but in data movement. This is particularly problematic because data accesses now account for a large part of today's important workloads in data analytics and scientific computing.
 
To address this limitation, this thesis presents hardware and software techniques for designing a system that effectively accelerates data-intensive workloads. Specifically, it addresses the two most important aspects of accelerating such workloads: hiding memory latency and reducing memory bandwidth consumption. The first part of the thesis attacks the memory latency challenge in accelerator-oriented systems by proposing a novel framework that provides latency tolerance to accelerators without requiring programmer effort. The second part attacks the memory bandwidth challenge for accelerators through a customized memory hierarchy and data access optimizations.
 
The data access optimization techniques presented here enable data-intensive workloads to benefit from specialized, heterogeneous systems without being limited by data accesses. Given the exponentially growing demand for data-intensive computing, these techniques will serve as useful tools for accelerating such important workloads.