Princeton University

School of Engineering & Applied Science

Exploring Data Compression and Random-Access Reduction to Mitigate the Bandwidth Wall for Manycore Architectures

Tri Nguyen
David Wentzlaff
Engineering Quadrangle B327
Tuesday, September 4, 2018 - 3:00pm

The growing gap between processor performance and memory bandwidth limits the throughput and potential of modern multicore and manycore architectures. Commercial processors such as the Intel Xeon Phi and NVIDIA and AMD GPUs have had to adopt expensive memory solutions such as high-bandwidth memory (HBM) and 3D-stacked memory to satisfy the bandwidth demand of core counts that grow with each product generation. Unless the memory bandwidth problem is solved, computational throughput cannot continue to scale.
Data compression and random-access reduction are promising approaches to increasing effective bandwidth without raising cost. This thesis makes three specific contributions to the state of the art. First, to reduce cache misses, we propose an on-chip cache compression method that substantially improves compression performance over prior work. Second, we propose a novel link compression framework that exploits the on-chip caches themselves as a massive and scalable compression dictionary. Third, to overcome the poor random-access performance of nonvolatile memory (NVM) and make it more attractive as a DRAM replacement, we propose a multi-undo logging scheme that seamlessly logs memory writes sequentially and maximizes NVM I/O operations per second (IOPS).
As a unifying principle, this thesis seeks to overcome the bandwidth wall for manycore architectures not with expensive memory technologies but by assessing and exploiting workload behavior, and not by burdening programmers with specialized semantics but by implementing software-transparent architectural improvements.