28th HPCA 2022: Seoul, South Korea
- IEEE International Symposium on High-Performance Computer Architecture, HPCA 2022, Seoul, South Korea, April 2-6, 2022. IEEE 2022, ISBN 978-1-6654-2027-3
- Matthew Denton, Herman Schmit:
Direct Spatial Implementation of Sparse Matrix Multipliers for Reservoir Computing. 1-11
- Di Wu, Joshua San Miguel:
uSystolic: Byte-Crawling Unary Systolic Array. 12-24
- Yi Huang, Zhiyu Chen, Dai Li, Kaiyuan Yang:
CAMA: Energy and Memory Efficient Automata Processing in Content-Addressable Memories. 25-37
- Yuji Chai, Glenn G. Ko, Wei-Te Mark Ting, Luke Bailey, David Brooks, Gu-Yeon Wei:
CoopMC: Algorithm-Architecture Co-Optimization for Markov Chain Monte Carlo Accelerators. 38-52
- Shuwen Deng, Bowen Huang, Jakub Szefer:
Leaky Frontends: Security Vulnerabilities in Processor Frontends. 53-66
- Sowoong Kim, Myeonggyun Han, Woongki Baek:
DPrime+DAbort: A High-Precision and Timer-Free Directory-Based Side-Channel Attack in Non-Inclusive Cache Hierarchies using Intel TSX. 67-81
- Yujie Cui, Chun Yang, Xu Cheng:
Abusing Cache Line Dirty States to Leak Information in Commercial Processors. 82-97
- Mengming Li, Chenlu Miao, Yilong Yang, Kai Bu:
unXpec: Breaking Undo-based Safe Speculation. 98-112
- Liang Zhou, Laxmi N. Bhuyan, K. K. Ramakrishnan:
Cottage: Coordinated Time Budget Assignment for Latency, Quality and Power Optimization in Web Search. 113-125
- Zixuan Wang, Joonseop Sim, Euicheol Lim, Jishen Zhao:
Enabling Efficient Large-Scale Deep Learning Training with Cache Coherent Disaggregated Memory Systems. 126-140
- Liu Ke, Udit Gupta, Mark Hempstead, Carole-Jean Wu, Hsien-Hsin S. Lee, Xuan Zhang:
Hercules: Heterogeneity-Aware Inference Serving for At-Scale Personalized Recommendation. 141-154
- Shuang Chen, Angela Jin, Christina Delimitrou, José F. Martínez:
ReTail: Opting for Learning Simplicity to Enable QoS-Aware Power Management in the Cloud. 155-168
- Yejin Lee, Hyunji Choi, Sunhong Min, Hyunseung Lee, Sangwon Beak, Dawoon Jeong, Jae W. Lee, Tae Jun Ham:
ANNA: Specialized Architecture for Approximate Nearest Neighbor Search. 169-183
- Qinggang Wang, Long Zheng, Jingrui Yuan, Yu Huang, Pengcheng Yao, Chuangyi Gui, Ao Hu, Xiaofei Liao, Hai Jin:
Hardware-Accelerated Hypergraph Processing with Chain-Driven Scheduling. 184-198
- Pengcheng Yao, Long Zheng, Yu Huang, Qinggang Wang, Chuangyi Gui, Zhen Zeng, Xiaofei Liao, Hai Jin, Jingling Xue:
ScalaGraph: A Scalable Accelerator for Massively Parallel Graph Processing. 199-212
- Shougang Yuan, Amro Awad, Ardhi Wiratama Baskara Yudha, Yan Solihin, Huiyang Zhou:
Adaptive Security Support for Heterogeneous Memory on GPUs. 213-228
- Sunho Lee, Jungwoo Kim, Seonjin Na, Jongse Park, Jaehyuk Huh:
TNPU: Supporting Trusted Execution with Tree-less Integrity Protection for Neural Processing Unit. 229-243
- Wenjie Xiong, Liu Ke, Dimitrije Jankov, Michael Kounavis, Xiaochen Wang, Eric Northup, Jie Amy Yang, Bilge Acun, Carole-Jean Wu, Ping Tak Peter Tang, G. Edward Suh, Xuan Zhang, Hsien-Hsin S. Lee:
SecNDP: Secure Near-Data Processing with Untrusted Memory. 244-258
- Poulami Das, Christopher A. Pattison, Srilatha Manne, Douglas M. Carmean, Krysta M. Svore, Moinuddin K. Qureshi, Nicolas Delfosse:
AFS: Accurate, Fast, and Scalable Error-Decoding for Fault-Tolerant Quantum Computers. 259-273
- Yosuke Ueno, Masaaki Kondo, Masamitsu Tanaka, Yasunari Suzuki, Yutaka Tabuchi:
QULATIS: A Quantum Error Correction Methodology toward Lattice Surgery. 274-287
- Gokul Subramanian Ravi, Kaitlin N. Smith, Pranav Gokhale, Andrea Mari, Nathan Earnest, Ali Javadi-Abhari, Frederic T. Chong:
VAQEM: A Variational Approach to Quantum Error Mitigation. 288-303
- Cheng Tan, Nicolas Bohm Agostini, Tong Geng, Chenhao Xie, Jiajia Li, Ang Li, Kevin J. Barker, Antonino Tumeo:
DRIPS: Dynamic Rebalancing of Pipelined Streaming Applications on CGRAs. 304-316
- Jeong-Jun Lee, Wenrui Zhang, Peng Li:
Parallel Time Batching: Systolic-Array Acceleration of Sparse Spiking Neural Computation. 317-330
- Zhengrong Wang, Jian Weng, Sihao Liu, Tony Nowatzki:
Near-Stream Computing: General and Transparent Near-Cache Acceleration. 331-345
- Lutan Zhao, Peinan Li, Rui Hou, Michael C. Huang, Xuehai Qian, Lixin Zhang, Dan Meng:
HyBP: Hybrid Isolation-Randomization Secure Branch Predictor. 346-359
- Mehrnoosh Raoufi, Youtao Zhang, Jun Yang:
IR-ORAM: Path Access Type Based Memory Intensity Reduction for Path-ORAM. 360-372
- Ali Fakhrzadehgan, Yale N. Patt, Prashant J. Nair, Moinuddin K. Qureshi:
SafeGuard: Reducing the Security Risk from Row-Hammer via Low-Cost Integrity Protection. 373-386
- Andrii Maksymov, Jason Nguyen, Vandiver Chaplin, Yun Seong Nam, Igor L. Markov:
Detecting Qubit-coupling Faults in Ion-trap Quantum Computers. 387-399
- Mohammad Reza Jokar, Richard Rines, Ghasem Pasandi, Haolin Cong, Adam Holmes, Yunong Shi, Massoud Pedram, Frederic T. Chong:
DigiQ: A Scalable Digital Controller for Quantum Computers Using SFQ Logic. 400-414
- Haipeng Zha, Naveen Kumar Katam, Massoud Pedram, Murali Annavaram:
HiPerRF: A Dual-Bit Dense Storage SFQ Register File. 415-428
- Cen Chen, Kenli Li, Yangfan Li, Xiaofeng Zou:
ReGNN: A Redundancy-Eliminated Graph Neural Networks Accelerator. 429-443
- Zhaoying Li, Dan Wu, Dhananjaya Wijerathne, Tulika Mitra:
LISA: Graph Neural Network based Portable Mapping on Spatial Accelerators. 444-459
- Haoran You, Tong Geng, Yongan Zhang, Ang Li, Yingyan Lin:
GCoD: Graph Convolutional Network Acceleration via Dedicated Algorithm and Accelerator Co-Design. 460-474
- Shixuan Zheng, Xianjue Zhang, Leibo Liu, Shaojun Wei, Shouyi Yin:
Atomic Dataflow based Graph-Level Workload Orchestration for Scalable DNN Accelerators. 475-489
- Kazi Abu Zubair, David Mohaisen, Amro Awad:
Filesystem Encryption or Direct-Access for NVM Filesystems? Let's Have Both! 490-502
- Jui-Nan Yen, Yao-Ching Hsieh, Cheng-Yu Chen, Tseng-Yi Chen, Chia-Lin Yang, Hsiang-Yun Cheng, Yixin Luo:
Efficient Bad Block Management with Cluster Similarity. 503-513
- Xueliang Li, Shicong Hong, Junyang Chen, Guihai Yan, Kaishun Wu:
Using Psychophysics to Guide Power Adaptation for Input Methods on Mobile Architectures. 514-527
- Mohsin Shan, Omer Khan:
HD-CPS: Hardware-assisted Drift-aware Concurrent Priority Scheduler for Shared Memory Multicores. 528-542
- Vignesh Balaji, Brandon Lucia:
Improving Locality of Irregular Updates with Hardware Assisted Propagation Blocking. 543-557
- Ishan Shah, Akanksha Jain, Calvin Lin:
Effective Mimicry of Belady's MIN Policy. 558-572
- Zhi Gang Liu, Paul N. Whatmough, Yuhao Zhu, Matthew Mattina:
S2TA: Exploiting Structured Sparsity for Energy-Efficient Mobile CNN Acceleration. 573-586
- Teague Tomesh, Pranav Gokhale, Victory Omole, Gokul Subramanian Ravi, Kaitlin N. Smith, Joshua Viszlai, Xin-Chuan Wu, Nikos Hardavellas, Margaret Martonosi, Frederic T. Chong:
SupermarQ: A Scalable Quantum Benchmark Suite. 587-603
- Alen Sabu, Harish Patil, Wim Heirman, Trevor E. Carlson:
LoopPoint: Checkpoint-driven Sampled Simulation for Multi-threaded Applications. 604-618
- Zhijing Li, Yuwei Ye, Stephen Neuendorffer, Adrian Sampson:
Compiler-Driven Simulation of Reconfigurable Hardware Accelerators. 619-632
- Hunjun Lee, Chanmyeong Kim, Minseop Kim, Yujin Chung, Jangwoo Kim:
NeuroSync: A Scalable and Accurate Brain Simulator Using Safe and Efficient Speculation. 633-647
- Majid Jalili, Mattan Erez:
Reducing Load Latency with Cache Level Prediction. 648-661
- Diya Joseph, Juan L. Aragón, Joan-Manuel Parcerisa, Antonio González:
TCOR: A Tile Cache with Optimal Replacement. 662-675
- Preyesh Dalmia, Rohan Mahapatra, Matthew D. Sinclair:
Only Buffer When You Need To: Reducing On-chip GPU Traffic with Reconfigurable Local Atomic Buffers. 676-691
- Hanrui Wang, Yongshan Ding, Jiaqi Gu, Yujun Lin, David Z. Pan, Frederic T. Chong, Song Han:
QuantumNAS: Noise-Adaptive Search for Robust Quantum Circuits. 692-708
- Ji Liu, Peiyi Li, Huiyang Zhou:
Not All SWAPs Have the Same Cost: A Case for Optimization-Aware Qubit Routing. 709-725
- Yilun Zhao, Yanan Guo, Yuan Yao, Amanda Dumi, Devin M. Mulvey, Shiv Upadhyay, Youtao Zhang, Kenneth D. Jordan, Jun Yang, Xulong Tang:
Q-GPU: A Recipe of Optimizations for Quantum Circuit Simulation Using GPUs. 726-740
- Hanchen Ye, Cong Hao, Jianyi Cheng, Hyunmin Jeong, Jack Huang, Stephen Neuendorffer, Deming Chen:
ScaleHLS: A New Scalable High-Level Synthesis Framework on Multi-Level Intermediate Representation. 741-755
- Nicolai Oswald, Vijay Nagarajan, Daniel J. Sorin, Vasilis Gavrielatos, Theo Olausson, Reece Carr:
HeteroGen: Automatic Synthesis of Heterogeneous Cache Coherence Protocols. 756-771
- Ajeya Naithani, Lieven Eeckhout:
Reliability-Aware Runahead. 772-785
- Cristóbal Ramírez Lazo, Enrico Reggiani, Carlos Rojas Morales, Roger Figueras Bagué, Luis A. Villa Vargas, Marco Antonio Ramírez Salinas, Mateo Valero Cortés, Osman Sabri Unsal, Adrián Cristal:
Adaptable Register File Organization for Vector Processors. 786-799
- Han Zhao, Weihao Cui, Quan Chen, Youtao Zhang, Yanchao Lu, Chao Li, Jingwen Leng, Minyi Guo:
Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS. 800-813
- Sheng-Chun Kao, Tushar Krishna:
MAGMA: An Optimization Framework for Mapping Multiple DNNs on Multiple Accelerator Cores. 814-830
- Yuan Li, Ahmed Louri, Avinash Karanth:
SPACX: Silicon Photonics-based Scalable Chiplet Accelerator for DNN Inference. 831-845
- Sai Qian Zhang, Bradley McDanel, H. T. Kung:
FAST: DNN Training Under Variable Precision Block Floating Point with Stochastic Rounding. 846-860
- Jong Hoon Shin, Ali Shafiee, Ardavan Pedram, Hamzah Abdel-Aziz, Ling Li, Joseph Hassoun:
Griffin: Rethinking Sparse Optimization for Deep Learning Architectures. 861-875
- Sumanth Gudaparthi, Sarabjeet Singh, Surya Narayanan, Rajeev Balasubramonian, Visvesh Sathe:
CANDLES: Channel-Aware Novel Dataflow-Microarchitecture Co-Design for Low Energy Sparse Neural Network Acceleration. 876-891
- Sujay Yadalam, Nisarg Shah, Xiangyao Yu, Michael M. Swift:
ASAP: A Speculative Approach to Persistence. 892-907
- Yuanchao Xu, Chencheng Ye, Xipeng Shen, Yan Solihin:
Temporal Exposure Reduction Protection for Persistent Memory. 908-924
- Adnan Maruf, Ashikee Ghosh, Janki Bhimani, Daniel Campello, Andy Rudoff, Raju Rangaswami:
MULTI-CLOCK: Dynamic Tiering for Hybrid Memory Systems. 925-937
- Lillian Pentecost, Alexander Hankin, Marco Donato, Mark Hempstead, Gu-Yeon Wei, David Brooks:
NVMExplorer: A Framework for Cross-Stack Comparisons of Embedded Non-Volatile Memories. 938-956
- Hossein Farrokhbakht, Paul V. Gratz, Tushar Krishna, Joshua San Miguel, Natalie D. Enright Jerger:
Stay in your Lane: A NoC with Low-overhead Multi-packet Bypassing. 957-970
- Ahsen Ejaz, Ioannis Sourdis:
FastTrackNoC: A NoC with FastTrack Router Datapaths. 971-985
- Yibo Wu, Liang Wang, Xiaohang Wang, Jie Han, Jianfeng Zhu, Honglan Jiang, Shouyi Yin, Shaojun Wei, Leibo Liu:
Upward Packet Popup for Deadlock Freedom in Modular Chiplet-Based Systems. 986-1000
- Mike O'Connor, Donghyuk Lee, Niladrish Chatterjee, Michael B. Sullivan, Stephen W. Keckler:
Saving PAM4 Bus Energy with SMOREs: Sparse Multi-level Opportunistic Restricted Encodings. 1001-1013
- Xia Zhao, Lieven Eeckhout, Magnus Jahre:
Delegated Replies: Alleviating Network Clogging in Heterogeneous Architectures. 1014-1028
- Yu Huang, Long Zheng, Pengcheng Yao, Qinggang Wang, Xiaofei Liao, Hai Jin, Jingling Xue:
Accelerating Graph Convolutional Networks Using Crossbar-based Processing-In-Memory Architectures. 1029-1042
- Xingchen Li, Bingzhe Wu, Guangyu Sun, Zhe Zhang, Zhihang Yuan, Runsheng Wang, Ru Huang, Dimin Niu, Hongzhong Zheng, Zhichao Lu, Liang Zhao, Meng-Fan Marvin Chang, Tianchan Guan, Xin Si:
Enabling High-Quality Uncertainty Quantification in a PIM Designed for Bayesian Neural Network. 1043-1055
- Xuan Sun, Hu Wan, Qiao Li, Chia-Lin Yang, Tei-Wei Kuo, Chun Jason Xue:
RM-SSD: In-Storage Computing for Large-Scale Recommendation Inference. 1056-1070
- Minxuan Zhou, Weihong Xu, Jaeyoung Kang, Tajana Rosing:
TransPIM: A Memory-based Acceleration via Software-Hardware Co-Design for Transformer. 1071-1085
- Shuang Chen, Yi Jiang, Christina Delimitrou, José F. Martínez:
PIMCloud: QoS-Aware Resource Management of Latency-Critical Applications in Clouds with Processing-in-Memory. 1086-1099
- Jinkwon Kim, Mincheol Kang, Jeongkyu Hong, Soontae Kim:
Exploiting Inter-block Entropy to Enhance the Compressibility of Blocks with Diverse Data. 1100-1114
- Alexandra Angerd, Angelos Arelakis, Vasilis Spiliopoulos, Erik Sintorn, Per Stenström:
GBDI: Going Beyond Base-Delta-Immediate Compression with Global Bases. 1115-1127
- Stephen Longofono, Seyed Mohammad Seyedzadeh, Alex K. Jones:
Virtual Coset Coding for Encrypted Non-Volatile Memories with Multi-Level Cells. 1128-1140
- F. Nisa Bostanci, Ataberk Olgun, Lois Orosa, Abdullah Giray Yaglikçi, Jeremie S. Kim, Hasan Hassan, Oguz Ergin, Onur Mutlu:
DR-STRaNGe: End-to-End System Design for DRAM-based True Random Number Generators. 1141-1155
- Michael Jaemin Kim, Jaehyun Park, Yeonhong Park, Wanju Doh, Namhoon Kim, Tae Jun Ham, Jae W. Lee, Jung Ho Ahn:
Mithril: Cooperative Row Hammer Protection on Commodity DRAM Leveraging Managed Refresh. 1156-1169
- Jawad Haj-Yahya, Jeremie S. Kim, Abdullah Giray Yaglikçi, Jisung Park, Efraim Rotem, Yanos Sazeides, Onur Mutlu:
DarkGates: A Hybrid Power-Gating Architecture to Mitigate the Performance Impact of Dark-Silicon in High Performance Processors. 1170-1183
- Sana Damani, Mark Stephenson, Ram Rangan, Daniel R. Johnson, Rishkul Kulkami, Stephen W. Keckler:
GPU Subwarp Interleaving. 1184-1197
- Tianqi Wang, Fan Feng, Shaolin Xiang, Qi Li, Jing Xia:
Application Defined On-chip Networks for Heterogeneous Chiplets: An Implementation Perspective. 1198-1210
- Keun Sup Shim, Brian Greskamp, Brian Towles, Bruce Edwards, J. P. Grossman, David E. Shaw:
The Specialized High-Performance Network on Anton 3. 1211-1223
- Baolin Li, Rohin Arora, Siddharth Samsi, Tirthak Patel, William Arcand, David Bestor, Chansup Byun, Rohan Basu Roy, Bill Bergeron, John T. Holodnak, Michael Houle, Matthew Hubbell, Michael Jones, Jeremy Kepner, Anna Klein, Peter Michaleas, Joseph McDonald, Lauren Milechin, Julie Mullen, Andrew Prout, Benjamin Price, Albert Reuther, Antonio Rosa, Matthew L. Weiss, Charles Yee, Daniel Edelman, Allan Vanterpool, Anson Cheng, Vijay Gadepally, Devesh Tiwari:
AI-Enabling Workloads on Large-Scale GPU-Accelerated System: Characterization, Opportunities, and Implications. 1224-1237