Abstract
Heap-based priority queues are very common dynamical data structures used in several fields, ranging from operating systems to scientific applications. However, the rise of new multicore CPUs introduced new challenges in the process of design of these data structures: in addition to traditional requirements like correctness and progress, the scalability is of paramount importance. It is a common opinion that these two demands are partially in conflict each other, so that in these computational environments it is necessary to relax the requirements of correctness and linearizability to achieve high performances. In this paper we introduce a loosely coordinated approach for the management of heap based priority queues on multicore CPUs, with the aim to realize a tradeoff between efficiency and sequential correctness. The approach is based on a sharing of information among only a small number of cores, so that to improve performance without completely losing the features of the data structure. The results obtained on a scientific problem show significant benefits both in terms of parallel efficiency, as well as in term of numerical accuracy.
Similar content being viewed by others
References
Afek, Y., Hakimi, M., Morrison, A.: Quasi-linearizability: relaxed consistency for improved concurrency. In: Proceedings 14th International Conference Principles of Distributed Systems, OPODIS 2010, Lecture Notes in Computer Science Volume 6490, pp. 395–410 (2010)
Alistarh, D., Kopinsky, J., Li, J., Shavit, N.: The SprayList: A Scalable Relaxed Proprity Queue. Microsoft Tecnical Report MSR-TR-2014-16
Arora, N., Blumofe, R., Plaxton, G.: Thread scheduling for multiprogrammed multiprocessors. Theory Comput. Syst. 34, 115–144 (2001)
Ayani, R.: LR\_algorithm: concurrent operations on priority queues. In Proceedings on the 2nd IEEE Symposium on Parallel and Distributed Processing, pp. 22–25 (1991)
Berntsen, J.: Practical error estimation in adaptive multidimensional quadrature routines. J. Comput. Appl. Math. 25, 327–340 (1989)
Berntsen, J., Espelid, T., Genz, A.: Algorithm 698: DCUHRE—an adaptive multidimensional integration routine for a vector of integrals. ACM Trans. Math. Softw. 17, 452–456 (1991)
Cools, R., Rabinowitz, P.: Monomial cubature rules since “Stroud”: a compilation. J. Comput. Appl. Math. 48, 309–326 (1993)
D’Amore, L., Casaburi, D., Galletti, A., Marcellino, L., Murli, A.: Integration of emerging computer technologies for an efficient image sequences analysis. Integr. Comput. Aided Eng. 18, 365–378 (2011)
Dongarra, J., Gannon, D., Fox, G., Kennedy, K.: The impact of multicore on computational science software. CTWatch Q. 3, 1–10 (2007)
Dongarra, J., Foster, I., Fox, G., Gropp, W., Kennedy, K., Torczon, L., White, A.: Sourcebook of Parallel Computing. Morgan Kaufmann Publishers, Burlington (2003)
Flatt, H.P., Kennedy, K.: Performance of parallel processors. Parallel Comput. 12, 1–20 (1989)
Genz, A.: Testing multiple integration software. In: Ford, B., Rault, J.C., Thommaset, F. (eds.) Tools, Methods and Language for Scientic and Engineering Computation. North Holland, New York (1984)
Genz, A., Malik, A.: An embedded family of fully symmetric numerical integration rules. SIAM J. Numer. Anal. 20, 580–588 (1983)
Haas, A., Henzinger, T., Kirsch, C., Lippautz, M., Payer, H., Sezgin, A., Sokolova, A.: Distributed queues in shared memory. In: CF ’13 Proceedings of the ACM International Conference on Computing Frontiers, ACM New York. Article No. 17 (2013)
Hendler, D., Shavit, N., Yerushalmi, L.: A scalable lock-free algorithm. J. Parallel Distrib. Comput. 70, 1–12 (2010)
Henzinger, T., Kirsch, C., Payer, H., Sezgin, A., Sokolova, A.: Quantitative relaxation of concurrent data structure. In: POPL ’13 Proceedings of the 40th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp. 317–328. ACM, New York (2013)
Herlihy, M.P., Wing, J.M.: Linearizability: a correctness condition for concurrent objects. ACM Trans. Progr. Lang. Syst. 12, 463–492 (1990)
Herlihy, M.: Wait-free synchronization. ACM Trans. Progr. Lang. Syst. 11, 124–149 (1991)
Herlihy, M., Luchangco, V., Moir, M.: Obstruction-free synchronization: double ended queues as an example. In: Proceedings of 23th International Conference on Distributed Computing System (2003)
Herlihy, M.P., Shavit, N.: The Art of Multiprocessor Programming. Rev. 1st edn. Morgan Kaufmann (2012)
Kessels, J.: On the fly optimization of data structure. Commun. ACM 26, 895–901 (1983)
Kogan, A., Petrank, E.: Wait-free queues with multiple enqueuers and dequeuers. In: PPoPP ’11 Proceedings of the 16th ACM Symposium on Principles and Practice of Parallel Programming, pp. 223–234. ACM, New York (2011)
Krommer, A., Ueberhuber, C.: Computational Integration. SIAM, New Delhi (1998)
Laccetti, G., Lapegna, M.: PAMIHR. A parallel FORTRAN program for multidimensional quadrature on distributed memory architectures. Lect. Notes Comput. Sci. 1685, 1144–1148 (1999)
Laccetti, G., Lapegna, M., Mele, V., Romano, D., Murli, A.: A double adaptive algorithm for multidimensional integration on multicore based HPC systems. Int. J. Parallel Progr. 40, 397–409 (2012)
Lapegna, M.: A global adaptive quadrature for the approximate computation of multidimensional integrals on a distributed memory multiprocessor. Concurr Pract. Exp. 4, 413–426 (1992)
Michael, M., Scott, M.: Simple, fast, and practical non-blocking and blocking concurrent queue algorithms. In: Proceedings PODC ’96 Proceedings of the Fifteenth Annual ACM Symposium on Principles of Distributed Computing, pp. 267–275. ACM, New York (1996)
Moir, M., Nussbaum, D., Shalev, O., Shavit, N.: Using elimination to implement scalable and lock-free FIFO queues. In: Proceeding SPAA ’05 Proceedings of the Seventeenth Annual ACM Symposium on Parallelism in Algorithms and Architectures, pp. 253–262. ACM, NewYork (2005)
Moir, M., Shavit, N.: Concurrent Data Structures. In: Metha, D., Sahni, S. (eds.) Handbook of Data Structures and Applications, 1st edn, pp. 47-1–47-30. CRC Press, Boca Raton (2005)
Murli, A., D’Amore, L., Laccetti, G., Gregoretti, F., Oliva, G.: A multi-grained distributed implementation of the parallel Block Conjugate Gradient algorithm. Concurr. Comput. Pract. Exp. 22, 2053–2072 (2010)
Nurmi, O., Soisalon-Soininen, E., Wood, D.: Concurrent control in database structures with relaxed balance. In: Proceedings. of the Sixth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 170–176. ACM, New York (1987)
Rao, V., Kumar, V.: Concurrent access of priority queues. IEEE Trans. Comput. 37, 1657–1665 (1988)
Shavit, N.: Data structure in multicore age. Commun. ACM 54, 76–84 (2011)
Shavit, N., Touitou, D.: Elimination tree and the construction of pools and stacks. Theory Comput. Syst. 30, 645–670 (1997)
Sundell, H., Tsigas, P.: Fast and lock-free concurrent priority queues for multi-thread systems. J. Parallel Distrib. Comput. 65, 609–627 (2005)
Wimmer, M., Versaci, F., Traff, J.L., Cedermann, D., Tsigas, P.: Data structure for task-based priority scheduling. In: PPoPP ’14 Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 379–380. ACM, New York (2014)
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Laccetti, G., Lapegna, M. & Mele, V. A Loosely Coordinated Model for Heap-Based Priority Queues in Multicore Environments. Int J Parallel Prog 44, 901–921 (2016). https://doi.org/10.1007/s10766-015-0398-x
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10766-015-0398-x