32nd IPDPS 2018: Vancouver, BC, Canada - Workshops
- 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPS Workshops 2018, Vancouver, BC, Canada, May 21-25, 2018. IEEE Computer Society 2018, ISBN 978-1-5386-5555-9
HCW: Heterogeneity in Computing Workshop
- Alexey L. Lastovetsky, Sudeep Pasricha:
Introduction to HCW 2018. 1 - Behrooz A. Shirazi:
Message from the HCW Steering Committee Chair. 2 - Alexey L. Lastovetsky:
Message from the HCW General Chair. 3 - Sudeep Pasricha:
Message from the HCW Program Committee Chair. 4 - Manish Parashar:
HCW 2018 Keynote Talk 1. 5 - Ümit V. Çatalyürek:
HCW 2018 Keynote Talk 2. 6
Session 1: Reconfigurable and Cloud Systems
- Leslie Barron, Tarek S. Abdelrahman:
User-Transparent Translation of Machine Instructions to Programmable Hardware. 7-14 - Yves Caniou, Eddy Caron, Aurélie Kong Win Chang, Yves Robert:
Budget-Aware Scheduling Algorithms for Scientific Workflows with Stochastic Task Weights on Heterogeneous IaaS Cloud Platforms. 15-26 - Zheming Jin, Hal Finkel:
Optimizing Parallel Reduction on OpenCL FPGA Platform - A Case Study of Frequent Pattern Compression. 27-35
Session 2: Workload Scheduling and Architecture Analysis
- Massinissa Ait Aba, Lilia Zaourar, Alix Munier:
Approximation Algorithm for Scheduling Applications on Hybrid Multi-core Machines with Communications Delays. 36-45 - Sean Pennefather, Karen L. Bradshaw, Barry Irwin:
Exploration and Design of a Synchronous Message Passing Framework for a CPU-NPU Heterogeneous Architecture. 46-56 - Fei Lei, Lei Yu, Bing Shao, Fei Teng, Bo Zhou:
Large Scale Data Centers Simulation Based on Baseline Test Model. 57-68 - Anke Kreuzer, Norbert Eicker, Jorge Amaya, Estela Suarez:
Application Performance on a Cluster-Booster System. 69-78
RAW: Reconfigurable Architectures Workshop
- Marco D. Santambrogio, Diana Goehringer, Dirk Stroobandt, Ken Eguro:
Introduction to RAW 2018. 79-80 - Jürgen Becker, Viktor K. Prasanna, Markus Weimer, Wayne Luk, Kaveh Aasaraai, Derek Chiou:
RAW 2018 Invited Talks. 81-82
Session 1: Platforms and Memory
- Pekka Jääskeläinen, Aleksi Tervo, Guillermo Payá Vayá, Timo Viitanen, Nicolai Behmann, Jarmo Takala, Holger Blume:
Transport-Triggered Soft Cores. 83-90 - Francesco Peverelli, Marco Rabozzi, Emanuele Del Sozzo, Marco D. Santambrogio:
OXiGen: A Tool for Automatic Acceleration of C Functions Into Dataflow FPGA-Based Kernels. 91-98 - William E. Allcock, Bennett Bernardoni, Colleen Bertoni, Neil Getty, Joseph A. Insley, Michael E. Papka, Silvio Rizzi, Brian R. Toonen:
RAM as a Network Managed Resource. 99-106 - Catalin Bogdan Ciobanu, Giulio Stramondo, Cees de Laat, Ana Lucia Varbanescu:
MAX-PolyMem: High-Bandwidth Polymorphic Parallel Memories for DFEs. 107-114
Session 2: Applications
- Enrico Reggiani, Giuseppe Natale, Carlo Moroni, Marco D. Santambrogio:
An FPGA-Based Acceleration Methodology and Performance Model for Iterative Stencils. 115-122 - Hamid Reza Zohouri, Artur Podobas, Satoshi Matsuoka:
High-Performance High-Order Stencil Computation on FPGAs Using OpenCL. 123-130 - Alessandro Comodi, Davide Conficconi, Alberto Scolari, Marco D. Santambrogio:
TiReX: Tiled Regular eXpression Matching Architecture. 131-137 - Artur Podobas, Satoshi Matsuoka:
Hardware Implementation of POSITs and Their Application in FPGAs. 138-145
Session 3: Machine Learning 1
- Luca Cerina, Giuseppe Franco, Pierandrea Cancian, Marco D. Santambrogio:
Robustness of Surface EMG Classifiers with Fixed-Point Decomposition on Reconfigurable Architecture. 146-153 - Florian Kastner, Benedikt Janßen, Frederik Kautz, Michael Hübner, Giulio Corradi:
Hardware/Software Codesign for Convolutional Neural Networks Exploiting Dynamic Partial Reconfiguration on PYNQ. 154-161 - Chaim Baskin, Natan Liss, Evgenii Zheltonozhskii, Alexander M. Bronstein, Avi Mendelson:
Streaming Architecture for Large-Scale Quantized Neural Networks on an FPGA-Based Dataflow Platform. 162-169
Session 4: Machine Learning 2
- Niccolo Raspa, Giuseppe Natale, Marco Bacis, Marco D. Santambrogio:
A Framework with Cloud Integration for CNN Acceleration on FPGA Devices. 170-177 - Daniel Holanda Noronha, Philip Heng Wai Leong, Steven J. E. Wilton:
Kibo: An Open-Source Fixed-Point Tool-kit for Training and Inference in FPGA-Based Deep Learning Networks. 178-185 - Menbere Kina Tekleyohannes, Christian Weis, Norbert Wehn, Martin Klein, Michael Siegrist:
A Reconfigurable Accelerator for Morphological Operations. 186-193
Session 5: Short Papers 1
- Syed Waqar Nabi, Wim Vanderbauwhede:
MP-STREAM: A Memory Performance Benchmark for Design Space Exploration on Heterogeneous HPC Devices. 194-197 - Luca Stornaiuolo, Alberto Parravicini, Donatella Sciuto, Marco D. Santambrogio:
FIDA: A Framework to Automatically Integrate FPGA Kernels Within Data-Science Applications. 198-201 - Tien Thanh Nguyen, Mathieu Thevenin, Anthony Mouraud, Gwenolé Corre, Olivier Pasquier, Sébastien Pillement:
High-Level Reliability Evaluation of Reconfiguration-Based Fault Tolerance Techniques. 202-205 - Florian Oszwald, Jürgen Becker, Philipp Obergfell, Matthias Traub:
Dynamic Reconfiguration for Real-Time Automotive Embedded Systems in Fail-Operational Context. 206-209
Session 6: Short Papers 2
- Peter Rouget, Benoît Badrignans, Pascal Benoit, Lionel Torres:
FPGA Implementation of Pattern Matching for Industrial Control Systems. 210-213 - Lorenzo Di Tucci, Davide Conficconi, Alessandro Comodi, Steven A. Hofmeyr, David Donofrio, Marco D. Santambrogio:
A Parallel, Energy Efficient Hardware Architecture for the merAligner on FPGA Using Chisel HCL. 214-217 - Ayan Palchaudhuri, Anindya Sundar Dhar:
Redundant Binary to Two's Complement Converter on FPGAs Through Fabric Aware Scan Based Encoding Approach for Fault Localization Support. 218-221 - Matthias Goebel, Ilja Behnke, Ahmed Elhossini, Ben H. H. Juurlink:
An Application-Specific Memory Management Unit for FPGA-SoCs. 222-225
HiCOMB: High Performance Computational Biology
- Srinivas Aluru, David A. Bader, Paul Medvedev:
Introduction to HiCOMB 2018. 226 - James Taylor:
HiCOMB Keynote 1. 227 - Onur Mutlu:
HiCOMB Keynote 2. 228 - Golnar Sheikhshab, Elizabeth Starks, Aly Karsan, Readman Chiu, Anoop Sarkar, Inanç Birol:
GraphNER: Using Corpus Level Similarities and Graph Propagation for Named Entity Recognition. 229-238 - William Arndt:
Modifying HMMER3 to Run Efficiently on the Cori Supercomputer Using OpenMP Tasking. 239-246 - Daniel L. Ayres, Michael P. Cummings:
Rerooting Trees Increases Opportunities for Concurrent Computation and Results in Markedly Improved Performance for Phylogenetic Inference. 247-256 - Raja Appuswamy, Jacques Fellay, Nimisha Chaturvedi:
Sequence Alignment Through the Looking Glass. 257-266
GABB: Graph Algorithms Building Blocks
- Tim Mattson:
Introduction to GABB 2018. 267
Keynote Session
- John R. Gilbert:
Graph Algorithms in the Language of Linear Algebra: How Did We Get Here, and Where Do We Go Next? 268 - Shad Kirmani, Kamesh Madduri:
Spectral Graph Drawing: Building Blocks and Performance Analysis. 269-277
Session 1: Generating Graphs with Known Properties
- Anil Kumar S. Vullikanti:
Parallel Generation of Large-Scale Random Graphs. 278 - Jeremy Kepner, Siddharth Samsi, William Arcand, David Bestor, Bill Bergeron, Tim Davis, Vijay Gadepally, Michael Houle, Matthew Hubbell, Hayden Jananthan, Michael Jones, Anna Klein, Peter Michaleas, Roger Pearce, Lauren Milechin, Julie Mullen, Andrew Prout, Antonio Rosa, Geoffrey Sanders, Charles Yee, Albert Reuther:
Design, Generation, and Validation of Extreme Scale Power-Law Graphs. 279-286 - Geoffrey Sanders, Roger Pearce, Timothy La Fond, Jeremy Kepner:
On Large-Scale Graph Generation with Validation of Diverse Triangle Statistics at Edges and Vertices. 287-296
Session 2: GraphBLAS Implementations
- Scott McMillan:
Patterns of GraphBLAS Algorithms: Tales from the Trenches. 297 - José E. Moreira, Manoj Kumar, William P. Horn:
Implementing the GraphBLAS C API. 298-309 - Jesse Chamberlin, Marcin Zalewski, Scott McMillan, Andrew Lumsdaine:
PyGB: GraphBLAS DSL in Python with Dynamic Compilation Into Efficient C++. 310-319
Session 3: Graph Building Blocks Community Meeting
- Chris Long:
A Survey of Modern Analysis on Graphs: Open Problems. 320
EduPar: NSF/TCPP Workshop on Parallel and Distributed Computing Education
- Martina Barnas, Sushil K. Prasad, Satish Puri:
Introduction to EduPar 2018. 321-322 - Alexandru Iosup:
EduPar 2018 Keynote. 323
EduPar Session 1
- Marin Abernethy, Oliver Sinnen, Joel C. Adams, Giuseppe De Ruvo, Nasser Giacaman:
ParallelAR: An Augmented Reality App and Instructional Approach for Learning Parallel Programming Scheduling Concepts. 324-331 - Devangi N. Parikh, Jianyu Huang, Margaret E. Myers, Robert A. van de Geijn:
Learning from Optimizing Matrix-Matrix Multiplication. 332-339 - Emanuel Buzek, Martin Krulis:
An Entertaining Approach to Parallel Programming Education. 340-346 - Sunny Raj, Sumit Kumar Jha:
Predicting Success in Undergraduate Parallel Programming via Probabilistic Causality Analysis. 347-352
EduPar Session 2
- Jawwad Ahmed Shamsi, Syed Zain ul Hassan, Narmeen Zakaria Bawany, Nausheen Shoaib:
A Comprehensive Course on Big Data for Undergraduate Students. 353-360 - Erik Saule:
Experiences on Teaching Parallel and Distributed Computing for Undergraduates. 361-368 - Mohammad Amin Kuhail, Spencer Cook, Joshua W. Neustrom, Praveen Rao:
Teaching Parallel Programming with Active Learning. 369-376 - Debzani Deb, Sebastian Cousins, M. Muztaba Fuad:
Teaching Big Data and Cloud Computing: A Modular Approach. 377-383
HIPS: High Level Programming Models and Supportive Environments
- Karl Fuerlinger, Philip C. Roth:
Introduction to HIPS 2018. 384-385 - Christian Trott:
HIPS 2018 Keynote. 386
Session 1: Tool Support for Parallel Programming Environments
- Hartmut Mix, Christian Herold, Matthias Weber:
Visualization of Multi-layer I/O Performance in Vampir. 387-394 - Simone Atzeni, Ganesh Gopalakrishnan:
An Operational Semantic Basis for Building an OpenMP Data Race Checker. 395-404 - Mostafa Mehrabi, Nasser Giacaman, Oliver Sinnen:
Unobtrusive Support for Asynchronous GUI Operations with Java Annotations. 405-414
Session 2: Distributed Memory and Task-Based Programming
- Hongbo Li, Zizhong Chen, Rajiv Gupta, Min Xie:
Non-intrusively Avoiding Scaling Problems in and out of MPI Collectives. 415-424 - Bernie van Veen, Sung-Shik Jongmans:
Modular Programming of Synchronization and Communication Among Tasks in Parallel Programs. 425-435 - Matthew Whitlock, Hemanth Kolla, Sean Treichler, Philippe P. Pébay, Janine C. Bennett:
Scalable Collectives for Distributed Asynchronous Many-Task Runtimes. 436-445
HPBDC: High-Performance Big Data, Deep Learning, and Cloud Computing
- Xiaoyi Lu, Jianfeng Zhan, Dhabaleswar K. Panda:
Introduction to HPBDC 2018. 446 - Geoffrey C. Fox:
HPBDC 2018 Keynote. 447
Regular Paper Session 1: High-Performance Data Processing Systems
- Felix Seibert, Mathias Peters, Florian Schintke:
Improving I/O Performance Through Colocating Interrelated Input Data and Near-Optimal Load Balancing. 448-457 - Tanuj kr Aasawat, Tahsin Reza, Matei Ripeanu:
How Well do CPU, GPU and Hybrid Graph Processing Frameworks Perform? 458-466 - Can Wu, Xiaoning Wang, Haili Xiao, Rongqiang Cao, Yining Zhao, Xuebin Chi:
EASIS: An Optimized Information Service for High Performance Computing Environment. 467-476
Regular Paper Session 2: High-Performance Data Processing Applications
- Michael Gowanlock, Ben Karsin:
GPU Accelerated Self-Join for the Distance Similarity Metric. 477-486 - Jun Chen, Peigang Zou:
Implementing a Parallel Graph Clustering Algorithm with Sparse Matrix Computation. 487-496 - Christopher Harrison, Sündüz Keles, Rebecca Hudson, Sunyoung Shin, Inês Dutra:
atSNPInfrastructure, a Case Study for Searching Billions of Records While Providing Significant Cost Savings over Cloud Providers. 497-506
Short Paper Session 1: Data Processing on HPC and Cloud Environments
- Yining Zhao, Xiaodong Wang, Haili Xiao, Xuebin Chi:
Improvement of the Log Pattern Extracting Algorithm Using Text Similarity. 507-514 - Xu Chang, Li Zha:
The Performance Analysis of Cache Architecture Based on Alluxio over Virtualized Infrastructure. 515-519
AsHES: Accelerators and Hybrid Exascale Systems
- Sunita Chandrasekaran, Antonio J. Peña, Min Si:
Introduction to AsHES 2018. 520 - Michael Wolfe:
AsHES 2018 Keynote. 521
Session 1: Runtime Scheduling and Performance Analytics
- Stefano Markidis, Steven Wei Der Chien, Erwin Laure, Ivy Bo Peng, Jeffrey S. Vetter:
NVIDIA Tensor Core Programmability, Performance & Precision. 522-531 - Zheming Jin, Hal Finkel:
Optimizing an Atomics-Based Reduction Kernel on OpenCL FPGA Platform. 532-539 - Osman Seckin Simsek, Andi Drebes, Antoniu Pop:
Leveraging Data-Flow Task Parallelism for Locality-Aware Dynamic Scheduling on Heterogeneous Platforms. 540-549
Session 2: Algorithms and Applications
- Kyungjoo Kim, H. Carter Edwards, Sivasankaran Rajamanickam:
Tacho: Memory-Scalable Task Parallel Sparse Cholesky Factorization. 550-559 - Michael Gowanlock, Ben Karsin:
Sorting Large Datasets with Heterogeneous CPU/GPU Architectures. 560-569 - Shaolong Chen, Miquel A. Senar:
Improving Performance of Genomic Aligners on Intel Xeon Phi-Based Architectures. 570-578
Session 3: Emerging Accelerator Architectures
- Eric R. Hein, Tom Conte, Jeffrey Young, Srinivas Eswar, Jiajia Li, Patrick Lavin, Richard W. Vuduc, E. Jason Riedy:
An Initial Characterization of the Emu Chick. 579-588 - Sergio Rivas-Gomez, Antonio J. Peña, David Moloney, Erwin Laure, Stefano Markidis:
Exploring the Vision Processing Unit as Co-Processor for Inference. 589-598
PDCO: Parallel / Distributed Computing and Optimization
- Grégoire Danoy, Didier El Baz, Vincent Boyer, Bernabé Dorronsoro:
Introduction to PDCO 2018. 599-600
Session 1: Scheduling, Parallel Genetic Algorithms, Genetic Programming
- Jheisson López, Danny Munera, Daniel Diaz, Salvador Abreu:
On Integrating Population-Based Metaheuristics with Cooperative Parallelism. 601-608 - Emmanuel Kieffer, Grégoire Danoy, Pascal Bouvry, Anass Nagih:
A Competitive Approach for Bi-Level Co-Evolution. 609-618 - Yuanzhe Li, Laleh Ghalami, Loren Schwiebert, Daniel Grosu:
A GPU Parallel Approximation Algorithm for Scheduling Parallel Identical Machines to Minimize Makespan. 619-628 - Jia Luo, Didier El Baz:
A Survey on Parallel Genetic Algorithms for Shop Scheduling Problems. 629-636
Session 2: Parallel Distributed Computing Systems and Optimization, Applications
- Md. Naim, Fredrik Manne:
Scalable b-Matching on GPUs. 637-646 - Richard Neill, Andi Drebes, Antoniu Pop:
Automated Analysis of Task-Parallel Execution Behavior Via Artificial Neural Networks. 647-656 - Thanasis Loukopoulos, Nikos Tziritas, Maria G. Koziri, George I. Stamoulis, Samee U. Khan, Cheng-Zhong Xu, Albert Y. Zomaya:
Data Stream Processing at Network Edges. 657-665 - Andrei Tchernykh, Mikhail G. Babenko, Vanessa Miranda-López, Alexander Yu. Drozdov, Arutyun Avetisyan:
WA-RRNS: Reliable Data Storage System Based on Multi-cloud. 666-673
HPPAC: High-Performance, Power-Aware Computing
- Shuaiwen Leon Song, Natalie J. Bates, Ang Li:
Introduction to HPPAC 2018. 674 - Gregory A. Koenig:
HPPAC 2018 Keynote. 675 - Rolando Brondolin, Tommaso Sardelli, Marco D. Santambrogio:
DEEP-Mon: Dynamic and Energy Efficient Power Monitoring for Container-Based Infrastructures. 676-684 - Matthias Maiterth, Gregory A. Koenig, Kevin T. Pedretti, Siddhartha Jana, Natalie J. Bates, Andrea Borghesi, Dave Montoya, Andrea Bartolini, Milos Puzovic:
Energy and Power Aware Job Scheduling and Resource Management: Global Survey - Initial Analysis. 685-693 - Sean Rea, Ehsan Atoofian:
Mitigating Critical Path Decompression Latency in Compressed L1 Data Caches Via Prefetching. 694-701 - Satyabrata Sen, Neena Imam, Chung-Hsing Hsu:
Quality Assessment of GPU Power Profiling Mechanisms. 702-711 - Thomas Ilsche, Robert Schöne, Philipp Joram, Mario Bielert, Andreas Gocht:
System Monitoring with lo2s: Power and Runtime Impact of C-State Transitions. 712-715 - Zheming Jin, Hal Finkel:
Power and Performance Tradeoff of a Floating-Point Intensive Kernel on OpenCL FPGA Platform. 716-720 - Vignesh Adhinarayanan, Bishwajit Dutta, Wu-chun Feng:
Making a Case for Green High-Performance Visualization Via Embedded Graphics Processors. 721-724 - Kevin T. Pedretti, Ryan E. Grant, James H. Laros III, Michael J. Levenhagen, Stephen L. Olivier, Lee Ward, Andrew J. Younge:
A Comparison of Power Management Mechanisms: P-States vs. Node-Level Power Cap Control. 725-729
APDCM: Advances in Parallel and Distributed Computational Models
- Oscar H. Ibarra, Koji Nakano, Akihiro Fujiwara, Susumu Matsumae:
Introduction to APDCM 2018. 730-731 - Yuji Shinano:
APDCM 2018 Keynote. 732
Session 1: Parallel Computing Models
- Yan Gu:
Survey: Computational Models for Asymmetric Read and Write Costs. 733-743 - Martti Forsell, Jussi Roivainen, Ville Leppänen, Jesper Larsson Träff:
Implementation of Multioperations in Thick Control Flow Processors. 744-752 - Anup Zope, Edward Luke:
A Block Streaming Model for Irregular Applications. 753-762 - Yutaro Emoto, Shunji Funasaka, Hiroki Tokura, Takumi Honda, Koji Nakano, Yasuaki Ito:
An Optimal Parallel Algorithm for Computing the Summed Area Table on the GPU. 763-772
Session 2: Concurrency Models
- Alex Aravind:
Barrier Synchronization: Simplified, Generalized, and Solved Without Mutual Exclusion. 773-782 - Daniel Dauwe, Sudeep Pasricha, Anthony A. Maciejewski, Howard Jay Siegel:
An Analysis of Multilevel Checkpoint Performance Models. 783-792 - Anne Benoit, Aurélien Cavelan, Florina M. Ciorba, Valentin Le Fèvre, Yves Robert:
Combining Checkpointing and Replication for Reliable Execution of Linear Workflows. 793-802 - Thomas Hérault, Yves Robert, Aurélien Bouteiller, Dorian C. Arnold, Kurt B. Ferreira, George Bosilca, Jack J. Dongarra:
Optimal Cooperative Checkpointing for Shared High-Performance Computing Platforms. 803-812
Session 3: Distributed Computing and Wireless Networks
- Hiroto Yasumi, Naoki Kitamura, Fukuhito Ooshita, Taisuke Izumi, Michiko Inoue:
A Population Protocol for Uniform k-Partition Under Global Fairness. 813-819 - Satoshi Fujita:
On the Cost of Cloud-Assistance in Tree-Structured P2P Live Streaming. 820-828 - Gokarna Sharma:
Mutual Visibility for Robots with Lights Tolerating Light Faults. 829-836 - Wei Chen, Liang Hong, Sudeep Bhattarai, Tony Sanchez, Ebholo Ijieh, Stacie Severyn, Leonard E. Lightfoot:
Joint Cooperative Protocols and Distributed Beamforming Design with Efficient Secondary User Selection for Multi-hop Cognitive Radio Networks. 837-844 - Chaofan Duan, Jing Feng, Haotian Chang, Bin Song, Zhikang Xu:
A Novel Handover Control Strategy Combined with Multi-hop Routing in LEO Satellite Networks. 845-851
ParLearning: Parallel and Distributed Computing for Large-Scale Machine Learning and Big Data Analytics
- Henri E. Bal, Arindam Pal, Azalia Mirhoseini, Thomas P. Parnell:
Introduction to ParLearning 2018. 852-853 - Abhinav Vishnu:
ParLearning 2018 Invited Talk 1. 854 - Azalia Mirhoseini:
ParLearning 2018 Invited Talk 2. 855 - Thomas P. Parnell:
ParLearning 2018 Invited Talk 3. 856 - Songze Li, Seyed Mohammadreza Mousavi Kalan, Amir Salman Avestimehr, Mahdi Soltanolkotabi:
Near-Optimal Straggler Mitigation for Distributed Gradient Methods. 857-866 - Nesma M. Rezk, Madhura Purnaprajna, Zain Ul-Abdin:
Streaming Tiles: Flexible Implementation of Convolution Neural Networks Inference on Manycore Architectures. 867-876 - Seungyo Ryu, Dongseung Kim:
Parallel Huge Matrix Multiplication on a Cluster with GPGPU Accelerators. 877-882 - Elizaveta Rebrova, Gustavo Chavez, Yang Liu, Pieter Ghysels, Xiaoye Sherry Li:
A Study of Clustering Techniques and Hierarchical Matrix Formats for Kernel Ridge Regression. 883-892
CHIUW: Chapel Implementers and Users Workshop
- Michael Ferguson, Nikhil Padmanabhan, Brad Chamberlain:
Introduction to CHIUW 2018. 893-894 - Katherine A. Yelick:
CHIUW 2018 Keynote. 895
Session 1: Introduction
Session 2: Applications of Chapel
- Thomas B. Rolinger, Tyler A. Simon, Christopher D. Krieger:
Parallel Sparse Tensor Decomposition in Chapel. 896-905 - Daniel A. Feshbach, Mary Glaser, Michelle Strout, David G. Wonnacott:
Iterator-Based Optimization of Imperfectly-Nested Loops. 906-914 - Apan Qasem, Ashwin M. Aji, Michael L. Chu:
Investigating Data Layout Transformations in Chapel. 915-924
Session 3: Chapel Design and Evolution
- Louis Jenkins:
RCUArray: An RCU-Like Parallel-Safe Distributed Resizable Array. 925-933
Session 4: Chapel Performance
Session 5: Tools
- Richard B. Johnson, Jeffrey K. Hollingsworth:
Purity: An Integrated, Fine-Grain, Data-Centric, Communication Profiler for the Chapel Language. 934-942
PDSEC: Parallel and Distributed Scientific and Engineering Computing
- Peter Strazdins, Keita Teranishi, Raphaël Couturier, Joseph Antony, Thomas Rauber, Gudula Rünger, Laurence T. Yang:
Introduction to PDSEC 2018 and Keynotes. 943-946
Session 1: Best Paper
- Matthias Noack, Alexander Reinefeld, Tobias Kramer, Thomas Steinke:
DM-HEOM: A Portable and Scalable Solver-Framework for the Hierarchical Equations of Motion. 947-956
Session 2: Parallel Computational Techniques
- Haruka Yamada, Akira Imakura, Toshiyuki Imamura, Tetsuya Sakurai:
Optimization of Reordering Procedures in HOTRG for Distributed Parallel Computing. 957-966 - Thomas Rauber, Gudula Rünger:
Energy and Performance Improvement of Parallel ODE Solvers by Application-Specific Program Transformations. 967-976 - Phillip M. Dickens, Christopher Dufour, James Fastook:
The Scalability of Embedded Structured Grids and Unstructured Grids in Large Scale Ice Sheet Modeling on Distributed Memory Parallel Computers. 977-986 - Joseph M. Myre, Erich Frahm, David J. Lilja, Martin O. Saar:
TNT: A Solver for Large Dense Least-Squares Problems that Takes Conjugate Gradient from Bad in Theory, to Good in Practice. 987-995
Session 3: Invited Talks/Systems
- Chung Lee, Peter Strazdins:
An Energy-Efficient Asymmetric Multi-Processor for HPC Virtualization. 996-1005
Session 4: Accelerators
- Zhang Yang, Damodar Sahasrabudhe, Alan Humphrey, Martin Berzins:
A Preliminary Port and Evaluation of the Uintah AMT Runtime on Sunway TaihuLight. 1006-1015 - Pacôme Eberhart, Baptiste Landreau, Julien Brajard, Pierre Fortin, Fabienne Jézéquel:
Improving CADNA Performance on GPUs. 1016-1025 - Zheming Jin, Hal Finkel:
Evaluation of MD5Hash Kernel on OpenCL FPGA Platform. 1026-1032 - Albert Farrés, Claudia Rosas, Mauricio Hanzich, Alejandro Duran, Charles Yount:
Performance Optimization of Fully Anisotropic Elastic Wave Propagation on 2nd Generation Intel® Xeon Phi(TM) Processors. 1033-1042
JSSPP: Job Scheduling Strategies for Parallel Processing
- Walfredo Cirne, Narayan Desai, Dalibor Klusácek:
Introduction to JSSPP 2018. 1043-1044 - John Wilkes:
JSSPP 2018 Keynote. 1045-1046
iWAPT: International Workshop on Automatic Performance Tuning
- Osni Marques, Reiji Suda, Jakub Kurzak, Akihiro Fujii:
Introduction to iWAPT 2018. 1047
Session 1: Machine Learning
- Sarah Knepper:
iWAPT 2018 Invited Speaker 1. 1048 - Yuki Kawarabatake, Mulya Agung, Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa:
Use of Code Structural Features for Machine Learning to Predict Effective Optimizations. 1049-1055 - Israt Nisa, Charles Siegel, Aravind Sukumaran-Rajam, Abhinav Vishnu, P. Sadayappan:
Effective Machine Learning Based Format Selection and Performance Modeling for SpMV on GPUs. 1056-1065
Session 2: AT Techniques
- David E. Tanner:
Tensile: Auto-Tuning GEMM GPU Assembly for All Problem Sizes. 1066-1075 - Timothy M. Platt, Zhiliu Yang, Chen Liu:
GreedyTalents: An Energy-Aware Auto-Tuning Method for Many-Core Processor. 1076-1083 - Takahiro Katagiri:
Auto-Tuning for the Era of Relatively High Bandwidth Memory Architectures: A Discussion Based on an FDM Application. 1084-1092 - Shuntaro Ichimura, Takahiro Katagiri, Katsuhisa Ozaki, Takeshi Ogita, Toru Nagai:
Threaded Accurate Matrix-Matrix Multiplications with Sparse Matrix-Vector Multiplications. 1093-1102
Session 3: Linear Algebra
- David E. Tanner:
iWAPT 2018 Invited Speaker 2. 1103 - Naoya Nomura, Akihiro Fujii, Teruo Tanaka, Osni Marques, Kengo Nakajima:
Algebraic Multigrid Solver Using Coarse Grid Aggregation with Independent Aggregation. 1104-1112 - Takeshi Fukaya, Toshiyuki Imamura, Yusaku Yamamoto:
A Case Study on Modeling the Performance of Dense Matrix Computation: Tridiagonalization in the EigenExa Eigensolver on the K Computer. 1113-1122
Session 4: AT Methodologies
- David Pfander, Malte Brunn, Dirk Pflüger:
AutoTuneTMP: Auto-Tuning in C++ With Runtime Template Metaprogramming. 1123-1132 - Bibek Wagle, Samuel Kellar, Adrian Serio, Hartmut Kaiser:
Methodology for Adaptive Active Message Coalescing in Task Based Runtime Systems. 1133-1140
ParSocial: Parallel and Distributed Processing for Computational Social Systems
- Eunice E. Santos, John Korah:
Introduction to ParSocial 2018. 1141 - V. S. Subrahmanian:
ParSocial 2018 Keynote. 1142 - Anamitra Pal, Pavan Rangudu, S. S. Ravi, Anil Kumar S. Vullikanti:
Using Activity Patterns to Place Electric Vehicle Charging Stations in Urban Regions. 1143-1152 - Eunice E. Santos, John Korah, Vairavan Murugappan:
Handling Vertex Deletions in Memory Scalable Anytime Anywhere Algorithms for Large and Dynamic Social Networks. 1153-1162 - Bhavani Thuraisingham, Murat Kantarcioglu, Latifur Khan:
Integrating Cyber Security and Data Science for Social Media: A Position Paper. 1163-1165
GraML: Graph Algorithms and Machine Learning
- Antonino Tumeo, Mahantesh Halappanavar, John Feo, Assefaw Hadish Gebremedhin, Abhinav Vishnu:
Introduction to GraML 2018. 1166-1167 - Nesreen K. Ahmed:
GraML 2018 Keynote. 1168 - Ronald D. Hagan, Charles A. Phillips, Michael A. Langston, Bradley J. Rhodes:
Classification and Anomaly Detection in Traffic Patterns of New York City Taxis: A Case Study in Compound Analytics. 1169-1174 - Trong Duc Nguyen, Srikanta Tirthapura:
V2V: Vector Embedding of a Graph and Applications. 1175-1183 - Keyvan Sasani, Mohammad Hossein Namaki, Assefaw Hadish Gebremedhin:
Network Similarity Prediction in Time-Evolving Graphs: A Machine Learning Approach. 1184-1193 - Kathleen E. Hamilton, Catherine D. Schuman, Steven R. Young, Neena Imam, Travis S. Humble:
Neural Networks and Graph Algorithms with Next-Generation Processors. 1194-1203
CEBDA: Convergence of Extreme Scale Computing and Big Data Analysis
- Shadi Ibrahim, Manish Parashar, Anna Queralt, Domenico Talia:
Introduction to CEBDA 2018. 1204 - Franck Cappello:
CEBDA 2018 Keynote. 1205 - Olivier Beaumont, Thomas Lambert, Loris Marchal, Bastien Thomas:
Data-Locality Aware Dynamic Schedulers for Independent Tasks with Replicated Inputs. 1206-1213 - Thomas Marrinan, Silvio Rizzi, Joseph A. Insley, Brian R. Toonen, William E. Allcock, Michael E. Papka:
Transferring Data from High-Performance Simulations to Extreme Scale Analysis Applications in Real-Time. 1214-1220 - Fotios Nikolaidis, Nick Kossifidis, Thomas Leibovici, Soraya Zertal:
Towards a TRansparent I/O Solution. 1221-1228
MPP: Parallel Programming Model: Special Edition on Edge/Fog/In-Situ Computing
- Leandro A. J. Marzulo, Felipe Maia Galvão França, Cristiana Bentes, Gabriele Mencagli:
Introduction to MPP 2018. 1229-1230 - Vladimir Castro Alves, Jae Young Do:
MPP 2018 Keynote. 1231 - Yanik Ngoko, Nicolas Saintherant, Christophe Cérin, Denis Trystram:
Invited Paper: How Future Buildings Could Redefine Distributed Computing. 1232-1240
Session 1: Applications
- Victor da Cruz Ferreira, Alexandre Solon Nery, Felipe Maia Galvão França:
A Smart Disk for In-Situ Face Recognition. 1241-1249 - Rodolfo Pereira Araujo, Igor Machado Coelho, Leandro A. J. Marzulo:
A DVND Local Search Implemented on a Dataflow Architecture for the Minimum Latency Problem. 1250-1259
Session 2: Platforms and Tools
- Mahdi Torabzadehkashi, Siavash Rezaei, Vladimir Castro Alves, Nader Bagherzadeh:
CompStor: An In-storage Computation Platform for Scalable Distributed Processing. 1260-1267 - Vanderson Martins do Rosario, Flavia Pisani, Alexandre Rodrigues Gomes, Edson Borin:
Fog-Assisted Translation: Towards Efficient Software Emulation on Heterogeneous IoT Devices. 1268-1277
PMAW: Programming Models and Algorithms Workshop
- Martin Kong, Zoran Budimlic:
Introduction to PMAW 2018. 1278
ROME: Runtime and Operating Systems for the Many-Core Era
- Stefan Lankes, Carsten Clauss, Jens Breitbart:
Introduction to ROME 2018. 1279-1280 - Sang-Hoon Kim:
ROME 2018 Keynote. 1281 - Karl Fuerlinger:
ROME 2018 Invited Talk. 1282 - Brice Goglin:
Memory Footprint of Locality Information on Many-Core Platforms. 1283-1292 - Soramichi Akiyama, Takahiro Hirofuchi, Ryousei Takano:
Diagnosing Performance Fluctuations of High-Throughput Software for Multi-core CPUs. 1293-1302 - Surabhi Jain, Gengbin Zheng, Maria Garzaran, James H. Cownie, Taru Doodi, Terry L. Wilmarth:
Parallelizing MPI Using Tasks for Hybrid Programming Models. 1303-1312 - Lee Savoie, David K. Lowenthal, Bronis R. de Supinski, Kathryn M. Mohror:
A Study of Network Quality of Service in Many-Core MPI Applications. 1313-1322 - Di Wu, Zhanrui Sun, Yongxin Zhu, Li Tian, Hanlin Zhu, Peng Xiong, Zihao Cao, Menglin Wang, Yu Zheng, Chao Xiong, Hao Jiang, Kuen Hung Tsoi, Xinyu Niu, Wei Mao, Can Feng, Xiaowen Zha, Guobao Deng, Wayne Luk:
Custom machine learning architectures: towards realtime anomaly detection for flight testing. 1323-1330 - François Galea, Sergiu Carpov, Lilia Zaourar:
Multi-start simulated annealing for partially-reconfigurable FPGA floorplanning. 1335-1338