Towards an Efficient Process Placement Policy for MPI Applications in Multicore Environments

Mercier, Guillaume; Clet-Ortega, Jérôme

doi:10.1007/978-3-642-03770-2_17

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 5759)

Included in the following conference series:

European Parallel Virtual Machine / Message Passing Interface Users’ Group Meeting

Abstract

This paper presents a method to efficiently place MPI processes on multicore machines. Since MPI implementations often feature efficient support for both shared-memory and network communication, an adequate placement policy is a crucial step in improving application performance. As a case study, we show the results obtained for several NAS computing kernels and explain how the policy influences overall performance. In particular, we found that a policy that merely increases the intranode communication ratio is not enough, and that cache utilization is also an influential factor. A more sophisticated policy (e.g., one taking into account the architecture's memory structure) is required to observe performance improvements.
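To make the notion of an intranode communication ratio concrete, here is a minimal sketch of how such a ratio could be computed for a given placement. It is an illustration only and is not taken from the paper: the function name `intranode_ratio`, the 4-rank communication matrix, and the two rank-to-node mappings are all hypothetical.

```python
from typing import Sequence


def intranode_ratio(comm: Sequence[Sequence[float]], node_of: Sequence[int]) -> float:
    """Return the fraction of total communication volume exchanged
    between ranks that are placed on the same node.

    comm[i][j] is the (assumed) volume sent from rank i to rank j;
    node_of[i] is the node that rank i is placed on.
    """
    total = 0.0
    local = 0.0
    n = len(comm)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a rank does not communicate with itself
            total += comm[i][j]
            if node_of[i] == node_of[j]:
                local += comm[i][j]
    return local / total if total else 0.0


# Hypothetical pattern: ranks 0-1 and ranks 2-3 exchange most of the data.
comm = [
    [0, 10, 1, 1],
    [10, 0, 1, 1],
    [1, 1, 0, 10],
    [1, 1, 10, 0],
]

round_robin = [0, 1, 0, 1]  # rank i placed on node i % 2
packed = [0, 0, 1, 1]       # heavy communicators share a node

print(f"round-robin: {intranode_ratio(comm, round_robin):.3f}")
print(f"packed:      {intranode_ratio(comm, packed):.3f}")
```

Under these assumed numbers the packed mapping keeps far more traffic inside a node than the round-robin one. As the abstract notes, this ratio alone does not capture cache utilization, so a higher value does not by itself guarantee better performance.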

Author information

Authors and Affiliations

Université de Bordeaux - INRIA - LaBRI, 351, cours de la Libération, F-33405, Talence cedex, France
Guillaume Mercier & Jérôme Clet-Ortega

Editor information

Editors and Affiliations

Department of Information Technology, Åbo Akademi, 20500, Turku, Finland
Matti Ropo & Jan Westerholm
Department of Electrical Engineering and Computer Science, University of Tennessee, 37996-3450, Knoxville, TN, USA
Jack Dongarra

About this paper

Cite this paper

Mercier, G., Clet-Ortega, J. (2009). Towards an Efficient Process Placement Policy for MPI Applications in Multicore Environments. In: Ropo, M., Westerholm, J., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2009. Lecture Notes in Computer Science, vol 5759. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03770-2_17

DOI: https://doi.org/10.1007/978-3-642-03770-2_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03769-6
Online ISBN: 978-3-642-03770-2
eBook Packages: Computer Science, Computer Science (R0)
