The two papers present work on improving efficiency for developers working with hardware accelerators and improving training performance of deep recommendation systems.
Thirteen researchers affiliated with the University of Michigan’s Applications Driving Architectures Lab were selected to present their work at this year’s Design Automation Conference (DAC) taking place July 10-14 in San Francisco. DAC is the oldest and largest conference devoted to the design and automation of electronic systems, embedded systems and software, and intellectual property.
Read more about the projects:
Authors: Valeria Bertacco, ADA Center Director (University of Michigan), Todd Austin, ADA Principal Investigator (University of Michigan), David Brooks, ADA Principal Investigator (Harvard University), Sharad Malik, ADA Principal Investigator (Princeton University), Zachary Tatlock, ADA Principal Investigator (University of Washington), Gu-Yeon Wei, ADA Principal Investigator (Harvard University), Thomas Wenisch, ADA Principal Investigator (University of Michigan)
Abstract: The ADA Center set out in 2018 to reignite innovation in computing and to enable the design of computing systems for the 2030 decade. It planned to do so by developing novel hardware architecture solutions, embracing emerging devices at the system scale, and crafting new design flows that are accessible to a much larger engineering population than today. This work provides a retrospective of ADA’s accomplishments and the lessons learned along the way. It also discusses future research directions informed by the ADA Center’s learning.
Authors: Nicholas Wendt, Ph.D. Student (University of Michigan), Todd Austin, ADA Principal Investigator (University of Michigan), Valeria Bertacco, ADA Center Director (University of Michigan)
Abstract: Domain-specific languages (DSLs) improve developers’ productivity by abstracting away low-level details of an algorithm’s implementation. These languages often provide powerful primitives to describe complex operations, potentially granting flexibility during compilation for hardware acceleration.
This work proposes PriMax, a general methodology for effectively mapping DSL applications to hardware accelerators. Using benchmark results, it constructs a decision tree that selects between multiple accelerated primitive implementations to maximize a target performance metric. In our graph analytics case study with two accelerators, PriMax produces a geomean speedup of 1.57x over CPU, higher than either target accelerator alone and close to the ideal 1.58x speedup.
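To make the selection idea concrete, here is a minimal sketch, not PriMax itself. It assumes a single input feature (`num_edges`), three hypothetical implementations (`cpu`, `accel_a`, `accel_b`), and made-up runtimes that are not results from the paper. It fits a depth-1 decision tree from benchmark records and reports the geometric-mean speedup over the CPU baseline.

```python
import math

# Made-up benchmark records for illustration only (runtimes in seconds).
# The feature name and implementation names are assumptions, not from the paper.
BENCHMARKS = [
    {"num_edges": 1_000, "runtime": {"cpu": 1.0, "accel_a": 2.0, "accel_b": 0.8}},
    {"num_edges": 5_000, "runtime": {"cpu": 2.0, "accel_a": 2.5, "accel_b": 1.5}},
    {"num_edges": 100_000, "runtime": {"cpu": 10.0, "accel_a": 4.0, "accel_b": 6.0}},
    {"num_edges": 500_000, "runtime": {"cpu": 40.0, "accel_a": 10.0, "accel_b": 25.0}},
]


def best_impl(records):
    """Return the implementation with the lowest total runtime over records."""
    impls = records[0]["runtime"].keys()
    return min(impls, key=lambda name: sum(r["runtime"][name] for r in records))


def fit_stump(records, feature):
    """Fit a depth-1 decision tree: one threshold, one implementation per side."""
    values = sorted({r[feature] for r in records})
    best = None
    # Candidate thresholds are midpoints between consecutive feature values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r for r in records if r[feature] <= threshold]
        right = [r for r in records if r[feature] > threshold]
        left_impl, right_impl = best_impl(left), best_impl(right)
        cost = (sum(r["runtime"][left_impl] for r in left)
                + sum(r["runtime"][right_impl] for r in right))
        if best is None or cost < best[0]:
            best = (cost, threshold, left_impl, right_impl)
    _, threshold, left_impl, right_impl = best
    return threshold, left_impl, right_impl


def select(stump, feature_value):
    """Pick an implementation for a new input using the fitted stump."""
    threshold, left_impl, right_impl = stump
    return left_impl if feature_value <= threshold else right_impl


def geomean_speedup(records, chooser, baseline="cpu"):
    """Geometric mean of baseline_runtime / chosen_runtime across records."""
    logs = [math.log(r["runtime"][baseline] / r["runtime"][chooser(r)])
            for r in records]
    return math.exp(sum(logs) / len(logs))


stump = fit_stump(BENCHMARKS, "num_edges")
selected = geomean_speedup(BENCHMARKS, lambda r: select(stump, r["num_edges"]))
only_a = geomean_speedup(BENCHMARKS, lambda r: "accel_a")
only_b = geomean_speedup(BENCHMARKS, lambda r: "accel_b")
print(stump, round(selected, 2), round(only_a, 2), round(only_b, 2))
```

On this toy data, per-input selection beats committing to either accelerator alone, which mirrors the kind of comparison the abstract reports. A real decision tree would use more features and deeper splits, and the choice of metric to optimize is left open here.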
Authors: Chun-Feng Wu, Post-doctoral Fellow (Harvard University), Carole-Jean Wu (Meta AI), Gu-Yeon Wei, ADA Principal Investigator (Harvard University), David Brooks, ADA Principal Investigator (Harvard University)
Abstract: As the size and variety of training data continue to grow, data preprocessing is gradually becoming a performance bottleneck for training deep recommendation systems. This challenge becomes more serious when training data is stored on Solid-State Drives (SSDs). Due to the access behavior gap between recommendation systems and SSDs, unused training data may be read and then filtered out during preprocessing. We advocate a joint management middleware that bridges this access behavior gap to avoid reading unused data. The evaluation results show that our middleware can effectively improve the performance of the data preprocessing phase and thereby boost training performance.