Abstract
This paper presents a method to efficiently place MPI processes on multicore machines. Since MPI implementations often feature efficient supports for both shared-memory and network communication, an adequate placement policy is a crucial step to improve applications performance. As a case study, we show the results obtained for several NAS computing kernels and explain how the policy influences overall performance. In particular, we found out that a policy merely increasing the intranode communication ratio is not enough and that cache utilization is also an influential factor. A more sophisticated policy (eg. one taking into account the architecture’s memory structure) is required to observe performance improvements.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Message Passing Interface Forum: MPI-2: Extensions to the message-passing interface (1997), http://www.mpi-forum.org/docs/mpi-20.ps
Gabriel, E., Fagg, G.E., Bosilca, G., Angskun, T., Dongarra, J., Squyres, J.M., Sahay, V., Kambadur, P., Barrett, B.W., Lumsdaine, A., Castain, R.H., Daniel, D.J., Graham, R.L., Woodall, T.S.: Open MPI: Goals, concept, and design of a next generation MPI implementation. In: Kranzlmüller, D., Kacsuk, P., Dongarra, J. (eds.) EuroPVM/MPI 2004. LNCS, vol. 3241, pp. 97–104. Springer, Heidelberg (2004)
Argonne National Laboratory: MPICH2 (2004), http://www.mcs.anl.gov/mpi/
Squyres, J., et al.: Portable Linux Processor Affinity (2008), http://www.open-mpi.org/projects/plpa/
Namyst, R., Denneulin, Y., Geib, J.-M., Méhaut, J.-F.: Utilisation des processus légers pour le calcul parallèle distribué: l’approche PM2. Calculateurs Parallèles, Réseaux et Systèmes répartis 10, 237–258 (1998)
Thibault, S., Namyst, R., Wacrenier, P.-A.: Building Portable Thread Schedulers for Hierarchical Multiprocessors: the BubbleSched Framework. In: EuroPar, Rennes, France. ACM, New York (2007)
Pellegrini, F.: Scotch and LibScotch 5.1 User’s Guide. ScAlApplix project, INRIA Bordeaux – Sud-Ouest, ENSEIRB & LaBRI, UMR CNRS 5800 (2008), http://www.labri.fr/perso/pelegrin/scotch/
Pellegrini, F.: Static Mapping by Dual Recursive Bipartitioning of Process and Architecture Graphs. In: Proceedings of SHPCC 1994, Knoxville, pp. 486–493. IEEE, Los Alamitos (1994)
Bolze, R., Cappello, F., Caron, E., Daydé, M., Desprez, F., Jeannot, E., Jégou, Y., Lantri, S., Leduc, J., Melab, N., Mornet, G., Namyst, R., Primet, P., Quetier, B., Richard, O., Talbi, E.-G., Touche, I.: Grid 5000: a large scale and highly reconfigurable experimental Grid testbed. International Journal of High Performance Computing Applications 20, 481–494 (2006)
Buntinas, D., Mercier, G., Gropp, W.: Implementation and Evaluation of Shared-Memory Communication and Synchronization Operations in MPICH2 using the Nemesis Communication Subsystem. In: Parallel Computing, Selected Papers from EuroPVM/MPI 2006, vol. 33, pp. 634–644 (2007)
Myricom: MPICH2-MX (2009), http://www.myri.com/scs/download-mpichmx.html
Träff, J.L.: Implementing the MPI process topology mechanism. In: Supercomputing 2002: Proceedings of the 2002 ACM/IEEE conference on Supercomputing, pp. 1–14. IEEE Computer Society Press, Los Alamitos (2002)
Solt, D.: A profile based approach for topology aware MPI rank placement (2007), http://www.tlc2.uh.edu/hpcc07/Schedule/speakers/hpcc_hp-mpi_solt.ppt
Duesterwald, E., Wisniewski, R.W., Sweeney, P.F., Cascaval, G., Smith, S.E.: Method and System for Optimizing Communication in MPI Programs for an Execution Environment (2008), http://www.faqs.org/patents/app/20080288957
Rabenseifner, R., Hager, G., Jost, G.: Hybrid MPI/OpenMP Parallel Programming on Clusters of Multi-Core SMP Nodes. In: Proceedings of the 17th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP 2009), Weimar, Germany, pp. 427–436 (2009)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mercier, G., Clet-Ortega, J. (2009). Towards an Efficient Process Placement Policy for MPI Applications in Multicore Environments. In: Ropo, M., Westerholm, J., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2009. Lecture Notes in Computer Science, vol 5759. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03770-2_17
Download citation
DOI: https://doi.org/10.1007/978-3-642-03770-2_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03769-6
Online ISBN: 978-3-642-03770-2
eBook Packages: Computer ScienceComputer Science (R0)