Abstract
Parallelizing (compute-intensive) discrete event simulation (DES) applications is a classical approach for speeding up their execution and for making very large/complex simulation models tractable. This has been historically achieved via parallel DES (PDES) techniques, which are based on partitioning the simulation model into distinct simulation objects (somehow resembling objects in classical object-oriented programming), whose states are disjoint, which are executed concurrently and rely on explicit event-exchange (or event-scheduling) primitives as the means to support mutual dependencies and notification of their state updates. With this approach, the application developer is necessarily forced to reason about state separation across the objects, thus being not allowed to rely on shared information, such as global variables, within the application code. This implicitly leads to the shift of the user-exposed programming model to one where sequential-style global variable accesses within the application code are not allowed. In this article we remove this limitation by providing support for managing global variables in the context of DES code developed in ANSI-C, which gets automatically parallelized. Particularly, we focus on speculative (also termed optimistic) PDES systems that run on top of multi-core machines, where simulation objects can concurrently process their events with no guarantee of causal consistency and actual violations of causality rules are recovered through rollback/recovery schemes. In compliance with the nature of speculative processing, in our proposal global variables are transparently mapped to multi-versions, so as to avoid any form of safety predicate verification upon their updates. Consistency is ensured via the introduction of a new rollback/recovery scheme based on detecting global variables’ reads on non-correct versions. At the same time, efficiency in the execution is guaranteed by managing multi-version variables’ lists via non-blocking algorithms. Furthermore, the whole approach is fully transparent, being it based on automatized instrumentation of the application software (particularly ELF objects). Hence the programmer is exposed to the classical (and easy to code) sequential-style programming scheme while accessing any global variable. An experimental assessment of our proposal, based on a suite of case study applications, run on top of an off-the-shelf Linux machine equipped with 32 CPU-cores and 64 GB of RAM, is also presented.
Similar content being viewed by others
Notes
An anti-event is an exact copy of the corresponding event, or of its digest, except for a single-bit value.
The only limitation to unbounded speculation is represented by storage constraints for keeping the speculatively processed/produced data records [25].
Figures are drawn with data taken from http://www.top500.org as of March 2014.
Thread-local storage data rely on segment registers to identify data positioning in memory. However, thread-local storage is out of the scope of our proposal, due to its non-global nature.
To check if the space is up, a counter of available free nodes is kept as well in shared memory, which is managed via an atomic decrement operation.
generate_mark can of course return two equal values when the counter overflows, but this situation can happen after a significant wall-clock-time is elapsed, so we consider it to be statistically non-significant for the ABA problem.
All the samples reported in the experimental study result as the average over 5 observations collected in different runs and with different seeds for pseudo-random generation.
Clearly, these synchronization dynamics are mixed with—hence additional to—those already induced via the simulation of the passage of vehicles across different SOBJs.
References
ACI: Dati e statistiche. http://www.aci.it/?id=54
Adya, A., Liskov, B.: Lazy consistency using loosely synchronized clocks. In: PODC, pp. 73–82 (1997)
Antonacci, F., Pellegrini, A., Quaglia, F.: Consistent and efficient output-stream management in optimistic simulation platform. In: Proceedings of the 2013 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, pp. 315–326. ACM (2013)
AUTOMAP: Atlante stradale italia. http://www.automap.it/
Autostrade per L’Italia S.p.A.: Reportistica sul traffico. http://www.autostrade.it/studi/studi_traffico.html
Bauer, D.W., Yaun, G., Carothers, C.D., Yuksel, M., Kalyanaraman, S.: Seven-o’clock: a new distributed gvt algorithm using network atomic operations. In: Proceedings of the 19th Workshop on Parallel and Distributed Simulation, pp. 39–48. IEEE Comp. Soc. (2005)
Bernstein, P.A., Hadzilacos, V., Goodman, N.: Concurrency Control and Recovery in Database Systems. Addison-Wesley Longman Publishing Co. Inc., Boston (1986)
Brown, R.: Calendar queues: a fast O(1) priority queue implementation for the simulation event set problem. Commun. ACM 31, 1220–1227 (1988)
Bruce, D.: The treatment of state in optimistic systems. In: Proceedings of the 9th Workshop on Parallel and Distributed Simulation, pp. 40–49. IEEE Comp. Soc. (1995)
Burns, J., Lynch, N.A.: Mutual exclusion using invisible reads and writes. In: Proceedings of the 18th Annual Allerton Conference on Communication, Control, and Computing, pp. 833–842 (1980)
Cachopo, J., Rito-Silva, A.: Versioned boxes as the basis for memory transactions. Sci. Comput. Program. 63(2), 172–185 (2006)
Cai, W., Turner, S.J., Lee, B.S., Zhou, J.: An alternative time management mechanism for distributed simulations. ACM Trans. Model. Comput. Simul. 15(2), 109–137 (2005)
Carothers, C.D., Bauer, D.W., Pearce, S.: ROSS: a high performance modular Time Warp system. In: Proceedings of the 14th Workshop on Parallel and Distributed Simulation, pp. 53–60. IEEE Comp. Soc. (2000)
Carothers, C.D., Bauer, D.W., Pearce, S.: ROSS: a high-performance, low-memory, modular time warp system. J. Parallel Distrib. Comput. 62(11), 1648–1669 (2002)
Carothers, C.D., Fujimoto, R.M.: Efficient execution of time warp programs on heterogeneous, NOW platforms. IEEE Trans. Parallel Distrib. Syst. 11(3), 299–317 (2000)
Carothers, C.D., Perumalla, K.S.: On deciding between conservative and optimistic approaches on massively parallel platforms. In: Winter Simulation Conference, pp. 678–687 (2010)
Carothers, C.D., Perumalla, K.S., Fujimoto, R.M.: Efficient optimistic parallel simulations using reverse computation. ACM Trans. Model. Comput. Simul. 9(3), 224–253 (1999)
Case, R.P., Padegs, A.: Architecture of the IBM system/370. Commun. ACM 21, 73–96 (1978)
Chandy, K.M., Misra, J.: Distributed simulation: a case study in design and verification of distributed programs. IEEE Trans. Softw. Eng. SE-S 5(5), 440–452 (1979)
Chandy, K.M., Sherman, R.: Space–time and simulation. In: Proceedings of the SCS Multiconference on Distributed Simulation, pp. 53–57 (1989)
Chen, L.L., Lu, Y.S., Yao, Y.P., Peng, S.l., Wu, L.D.: A well-balanced time warp system on multi-core environments. In: Proceedings of the 2011 IEEE Workshop on Principles of Advanced and Distributed Simulation, PADS, pp. 1–9. IEEE Comp. Soc. (2011)
Cingolani, D., Pellegrini, A., Quaglia, F.: Transparently mixing undo logs and software reversibility for state recovery in optimistic PDES. In: Proceedings of the 3rd ACM Conference on SIGSIM-Principles of Advanced Discrete Simulation, London, United Kingdom, June 10–12, 2015, pp. 211–222 (2015)
Corporation, I.B.M.: IBM System/370 Extended Architecture, Principles of Operation. IBM Publication No. SA22-7085 (1983)
Cucuzzo, D., D’Alessio, S., Quaglia, F., Romano, P.: A lightweight heuristic-based mechanism for collecting committed consistent global states in optimistic simulation. In: IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications, vol. 0, pp. 227–234. IEEE Comp. Soc., Los Alamitos, CA (2007)
Das, S.R., Fujimoto, R.M.: Adaptive memory management and optimism control in Time Warp. ACM Trans. Model. Comput. Simul. 7(2), 239–271 (1997)
Easton, W.B.: Process synchronization without long-term interlock. SIGOPS Oper. Syst. Rev. 6(1/2), 95–100 (1972)
Fabbri, A., Donatiello, L.: Sqtw: a mechanism for state-dependent parallel simulation. description and experimental study. In: Proceedings. 11th Workshop on Parallel and Distributed Simulation, 1997, pp. 82–89 (1997). doi:10.1109/PADS.1997.594590
Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of distributed consensus with one faulty process. J. ACM 32(2), 374–382 (1985)
Fujimoto, R.M.: Parallel discrete event simulation. Commun. ACM 33(10), 30–53 (1990)
Fujimoto, R.M.: Performance of Time Warp under synthetic workloads. In: Proceedings of the Multiconference on Distributed Simulation, pp. 23–28. Society for Computer Simulation (1990)
Fujimoto, R.M.: Feature article: Parallel discrete event simulation: Will the field survive? INFORMS J. Comput. 5(3), 213–230 (1993)
Fujimoto, R.M.: Exploiting temporal uncertainty in parallel and distributed simulation. In: Proceedings of the 13th Workshop on Parallel and Distributed Simulation, pp. 46–53. IEEE Comp. Soc. (1999)
Fujimoto, R.M., Hybinette, M.: Computing global virtual time in shared-memory multiprocessors. ACM Trans. Model. Comput. Simul. 7(4), 425–446 (1997)
Gan, B.P., Low, M.Y.H., Wei, J., Wang, X., Turner, S.J., Cai, W.: Synchronization and management of shared state in HLA-based distributed simulation. In: Proceedings of the 2003 Winter Simulation Conference, vol. 1, pp. 847–854. IEEE Computer Society (2003)
Ghosh, K., Fujimoto, R.M.: Parallel discrete event simulation using space-time memory. In: Proceedings of the 1991 International Conference on Parallel Processing, pp. 201–208 (1991)
Glazer, D.W., Tropper, C.: On process migration and load balancing in time warp. IEEE Trans. Parallel Distrib. Syst. 4(3), 318–327 (1993)
Guerraoui, R., Kapalka, M.: On the correctness of transactional memory. In: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP, pp. 175–184. ACM (2008)
Harris, T.: A pragmatic implementation of non-blocking linked-lists. In: Welch, J. (ed.) Distributed Computing. Lecture Notes in Computer Science, vol. 2180, pp. 300–314. Springer, Berlin (2001)
Harris, T.: A pragmatic implementation of non-blocking linked-lists. In: Welch, J. (ed.) Distributed Computing, vol. 2180, pp. 300–314. Springer, Berlin (2001)
Herlihy, M.: Wait-free synchronization. ACM Trans. Program. Lang. Syst. 13(1), 124–149 (1991)
Herlihy, M., Shavit, N.: On the nature of progress. In: OPODIS, pp. 313–328 (2011)
Herlihy, M.P., Wing, J.M.: Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12(3), 463–492 (1990)
HPDCS Research Group: ROOT-Sim: The ROme OpTimistic Simulator—v 1.0. http://www.dis.uniroma1.it/~hpdcs/ROOT-Sim/ (2012). https://github.com/HPDCS/ROOT-Sim
IEEE Std 1516-2000: IEEE Standard for Modeling and Simulation (M&S) High Level Architecture (HLA)—Framework and Rules. Tech. rep., Institute of Electrical and Electronics Engineers, Inc., New York, NY (2000)
IEEE Std 1516.1-2000: IEEE Standard for Modeling and Simulation (M&S) High Level Architecture (HLA)—Federate Interface (FI) Specification. Tech. rep., Institute of Electrical and Electronics Engineers, Inc., New York, NY (2000)
Intel Corporation: IA-32 Intel(R) Architecture Software Developer’s Manual, Volume 2: Instruction Set Reference
Jefferson, D.R.: Virtual time. ACM Trans. Program. Lang. Syst. 7(3), 404–425 (1985)
Kandukuri, S., Boyd, S.: Optimal power control in interference-limited fading wireless channels with outage-probability specifications. IEEE Trans. Wireless Commun. 1(1), 46–55 (2002)
Lamport, L.: Concurrent reading and writing. Commun. ACM 20(11), 806–811 (1977)
Low, M.Y.H., Gan, B.P., Wei, J., Wang, X., Turner, S.J., Cai, W.: Shared state synchronization for hla-based distributed simulation. Simulation 82(8), 511–521 (2006)
Martin, D.E., McBrayer, T.J., Wilsey, P.A.: WARPED: a Time Warp simulation kernel for analysis and application development. In: HICSS ’96: Proceedings of the 29th Hawaii International Conference on System Sciences (HICSS’96) Volume 1: Software Technology and Architecture, p. 383. IEEE Comp. Soc. (1996)
Martin, D.E., McBrayer, T.J., Wilsey, P.A.: WARPED: a time warp simulation kernel for analysis and application development. In: Proceedings of the 29th Hawaii International Conference on System Sciences - Volume 1: Software Technology and Architecture, p. 383. IEEE Computer Society, Washington, DC, USA (1996)
Matz, M., Hubicka, J., Jaeger, A., Mitchell, M.: System V Application Binary Interface AMD64 Architecture Processor Supplement (2007). http://www.x86-64.org/documentation.html
Mehl, H., Hammes, S.: How to integrate shared variables in distributed simulation. SIGSIM Simul. Dig. 25(2), 14–41 (1995)
Michael, M.M.: Hazard pointers: Safe memory reclamation for lock-free objects. IEEE Trans. Parallel Distrib. Syst. 15(6), 491–504 (2004)
Nicol, D.M., Liu, X.: The dark side of risk (what your mother never told you about time warp). In: Proceedings of the 11th Workshop on Parallel and Distributed Simulation, PADS, pp. 188–195. IEEE Computer Society (1997)
Pellegrini, A.: Hijacker: Efficient static software instrumentation with applications in high performance computing (poster paper). In: Proceedings of the 2013 International Conference on High Performance Computing & Simulation, HPCS, pp. 650–655. IEEE Computer Society (2013)
Pellegrini, A., Quaglia, F.: Transparent multi-core speculative parallelization of DES models with event and cross-state dependencies. In: SIGSIM Principles of Advanced Discrete Simulation, SIGSIM-PADS ’14, Denver, CO, USA, May 18–21, 2014, pp. 105–116 (2014)
Pellegrini, A., Quaglia, F.: Wait-free global virtual time computation in shared memory time warp systems. In: Proceedings of the 29th IEEE International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), Paris, France (2014)
Pellegrini, A., Vitali, R., Quaglia, F.: Autonomic state management for optimistic simulation platforms. IEEE Trans. Parallel Distrib. Syst. 26(6), 1560–1569 (2015)
Peluso, S., Didona, D., Quaglia, F.: Supports for transparent object-migration in PDES systems. J. Simul. 6(4), 279–293 (2012)
Preiss, B.R., Loucks, W.M., MacIntyre, D.: Effects of the checkpoint interval on time and space in Time Warp. ACM Trans. Model. Comput. Simul. 4(3), 223–253 (1994)
Quaglia, F.: A cost model for selecting checkpoint positions in Time Warp parallel simulation. IEEE Trans. Parallel Distrib. Syst. 12(4), 346–362 (2001)
Quaglia, F., Baldoni, R.: Exploiting intra-object dependencies in parallel simulation. Inf. Process. Lett. 70(3), 119–125 (1999)
Quaglia, F., Santoro, A.: Non-blocking checkpointing for optimistic parallel simulation: description and an implementation. IEEE Trans. Parallel Distrib. Syst. 14(6), 593–610 (2003)
Romano, P., Palmieri, R., Quaglia, F., Carvalho, N., Rodrigues, L.E.T.: On speculative replication of transactional systems. J. Comput. Syst. Sci. 80(1), 257–276 (2014)
Shavit, N., Touitou, D.: Software transactional memory. In: Proceedings of the Fourteenth Annual ACM Symposium on Principles of Distributed Computing. PODC ’95, pp. 204–213. ACM, New York, NY (1995)
The SCO Group, Inc.: System V Application Binary Interface, Intel386 Architecture Processor Supplement, fourth edn. (1997). http://www.sco.com/developers/devspecs/
Vincent, R., Fox, D., Ko, J., Konolige, K., Limketkai, B., Morisset, B., Ortiz, C., Schulz, D., Stewart, B.: Distributed multirobot exploration, mapping, and task allocation. Ann. Math. Artif. Intell. 52(2–4), 229–255 (2008)
Vitali, R., Pellegrini, A., Quaglia, F.: Benchmarking memory management capabilities within root-sim. In: Proceedings of the 13th IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications. IEEE Comp. Soc. (2009)
Vitali, R., Pellegrini, A., Quaglia, F.: Load sharing for optimistic parallel simulations on multi core machines. SIGMETRICS Perform. Eval. Rev. 40(3), 2–11 (2012)
Wang, J., Jagtap, D., Abu-Ghazaleh, N.B., Ponomarev, D.: Parallel discrete event simulation for multi-core systems: analysis and optimization. IEEE Trans. Parallel Distrib. Syst. 25(6), 1574–1584 (2014)
Young, C.H., Radhakrishnan, R., Wilsey, P.A.: Optimism: not just for event execution anymore. In: Proceedings of the Thirteenth Workshop on Parallel and Distributed Simulation, PADS ’99, Atlanta, GA, USA, May 1–4, 1999, pp. 136–143 (1999)
Author information
Authors and Affiliations
Corresponding author
Appendix
Appendix
Rights and permissions
About this article
Cite this article
Pellegrini, A., Peluso, S., Quaglia, F. et al. Transparent Speculative Parallelization of Discrete Event Simulation Applications Using Global Variables. Int J Parallel Prog 44, 1200–1247 (2016). https://doi.org/10.1007/s10766-016-0429-2
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10766-016-0429-2