iBet uBet web content aggregator. Adding the entire web to your favor.
iBet uBet web content aggregator. Adding the entire web to your favor.



Link to original content: https://api.crossref.org/works/10.1177/1094342012452893
{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,10]],"date-time":"2024-09-10T04:10:41Z","timestamp":1725941441095},"reference-count":34,"publisher":"SAGE Publications","issue":"4","license":[{"start":{"date-parts":[[2012,8,9]],"date-time":"2012-08-09T00:00:00Z","timestamp":1344470400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2012,11]]},"abstract":" Graph matching is a prototypical combinatorial problem with many applications in high-performance scientific computing. Optimal algorithms for computing matchings are challenging to parallelize. Approximation algorithms are amenable to parallelization and are therefore important to compute matchings for large-scale problems. Approximation algorithms also generate nearly optimal solutions that are sufficient for many applications. In this paper we present multithreaded algorithms for computing half-approximate weighted matching on state-of-the-art multicore (Intel Nehalem and AMD Magny-Cours), manycore (Nvidia Tesla and Nvidia Fermi), and massively multithreaded (Cray XMT) platforms. We provide two implementations: the first uses shared work queues and is suited for all platforms; and the second implementation, based on dataflow principles, exploits special features available on the Cray XMT. Using a carefully chosen dataset that exhibits characteristics from a wide range of applications, we show scalable performance across different platforms. In particular, for one instance of the input, an R-MAT graph (RMAT-G), we show speedups of about [Formula: see text] on [Formula: see text] cores of an AMD Magny-Cours, [Formula: see text] on [Formula: see text] cores of Intel Nehalem, [Formula: see text] on Nvidia Tesla and [Formula: see text] on Nvidia Fermi relative to one core of Intel Nehalem, and [Formula: see text] on [Formula: see text] processors of Cray XMT. We demonstrate strong as well as weak scaling for graphs with up to a billion edges using up to 12,800 threads. We avoid excessive fine-tuning for each platform and retain the basic structure of the algorithm uniformly across platforms. An exception is the dataflow algorithm designed specifically for the Cray XMT. To the best of the authors' knowledge, this is the first such large-scale study of the half-approximate weighted matching problem on multithreaded platforms. Driven by the critical enabling role of combinatorial algorithms such as matching in scientific computing and the emergence of informatics applications, there is a growing demand to support irregular computations on current and future computing platforms. In this context, we evaluate the capability of emerging multithreaded platforms to tolerate latency induced by irregular memory access patterns, and to support fine-grained parallelism via light-weight synchronization mechanisms. By contrasting the architectural features of these platforms against the Cray XMT, which is specifically designed to support irregular memory-intensive applications, we delineate the impact of these choices on performance. <\/jats:p>","DOI":"10.1177\/1094342012452893","type":"journal-article","created":{"date-parts":[[2012,8,12]],"date-time":"2012-08-12T00:45:26Z","timestamp":1344732326000},"page":"413-430","update-policy":"http:\/\/dx.doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":23,"title":["Approximate weighted matching on emerging manycore and multithreaded architectures"],"prefix":"10.1177","volume":"26","author":[{"given":"Mahantesh","family":"Halappanavar","sequence":"first","affiliation":[{"name":"Pacific Northwest National Laboratory, Richland, WA, USA"}]},{"given":"John","family":"Feo","sequence":"additional","affiliation":[{"name":"Pacific Northwest National Laboratory, Richland, WA, USA"}]},{"given":"Oreste","family":"Villa","sequence":"additional","affiliation":[{"name":"Pacific Northwest National Laboratory, Richland, WA, USA"}]},{"given":"Antonino","family":"Tumeo","sequence":"additional","affiliation":[{"name":"Pacific Northwest National Laboratory, Richland, WA, USA"}]},{"given":"Alex","family":"Pothen","sequence":"additional","affiliation":[{"name":"Purdue University, West Lafayette, IN, USA"}]}],"member":"179","published-online":{"date-parts":[[2012,8,9]]},"reference":[{"key":"bibr1-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2006.1639360"},{"key":"bibr2-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2010.46"},{"volume-title":"The Landscape of Parallel Computing Research: A View from Berkeley","year":"2006","author":"Asanovic K","key":"bibr3-1094342012452893"},{"key":"bibr4-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1002\/net.3230130404"},{"key":"bibr5-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.2006.34"},{"key":"bibr6-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2008.4536261"},{"key":"bibr7-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1109\/34.993558"},{"journal-title":"SIAM Conference on Computational Science and Engineering (CSE11)","year":"2011","author":"Catalyurek U","key":"bibr8-1094342012452893"},{"key":"bibr9-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1145\/1132952.1132954"},{"key":"bibr10-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-74742-0_15"},{"key":"bibr11-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1016\/S0020-0190(02)00393-9"},{"key":"bibr12-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1137\/S0895479897317661"},{"key":"bibr13-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1145\/1062261.1062268"},{"volume-title":"Algorithms for Vertex-weighted Matching in Graphs","year":"2009","author":"Halappanavar M","key":"bibr14-1094342012452893"},{"key":"bibr15-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-77220-0_21"},{"key":"bibr16-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1109\/MCSE.2008.56"},{"journal-title":"Computing Research Repository (CoRR)","year":"2004","author":"Hoepman J-H","key":"bibr17-1094342012452893"},{"key":"bibr18-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1145\/1941553.1941590"},{"key":"bibr19-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1145\/224170.224229"},{"first-page":"47","volume-title":"Proceedings of the 23\u00a0rd ACM SIGGRAPH\/EUROGRAPHICS Symposium on Graphics Hardware (GH '08)","year":"2008","author":"Katz GJ","key":"bibr20-1094342012452893"},{"first-page":"1","volume-title":"Supercomputing '98: Proceedings of the 1998 ACM\/IEEE conference on Supercomputing (CDROM)","year":"1998","author":"Li XS","key":"bibr21-1094342012452893"},{"volume-title":"Matching Theory (North-Holland Mathematics Studies)","year":"1986","author":"Lovasz L","key":"bibr22-1094342012452893"},{"key":"bibr23-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1142\/S0129626407002843"},{"key":"bibr24-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1145\/1837274.1837289"},{"first-page":"708","volume-title":"The Seventh International Conference on Parallel Processing and Applied Mathematics","year":"2007","author":"Manne F","key":"bibr25-1094342012452893"},{"key":"bibr26-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1109\/PACT.2009.22"},{"key":"bibr27-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-8191(00)00049-1"},{"key":"bibr28-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1145\/1242531.1242541"},{"key":"bibr29-1094342012452893","first-page":"122","volume":"22","author":"Pinar A","year":"2006","journal-title":"Electronic Transactions on Numerical Analysis"},{"key":"bibr30-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1145\/98267.98287"},{"key":"bibr31-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-49116-3_24"},{"key":"bibr32-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1007\/s10589-006-9003-y"},{"volume-title":"Combinatorial Optimization - Polyhedra and Efficiency","year":"2003","author":"Schrijver A","key":"bibr33-1094342012452893"},{"key":"bibr34-1094342012452893","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2005.4"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342012452893","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342012452893","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342012452893","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,10]],"date-time":"2024-09-10T03:27:32Z","timestamp":1725938852000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342012452893"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,8,9]]},"references-count":34,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2012,11]]}},"alternative-id":["10.1177\/1094342012452893"],"URL":"http:\/\/dx.doi.org\/10.1177\/1094342012452893","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"type":"print","value":"1094-3420"},{"type":"electronic","value":"1741-2846"}],"subject":[],"published":{"date-parts":[[2012,8,9]]}}}