iBet uBet web content aggregator. Adding the entire web to your favor.
iBet uBet web content aggregator. Adding the entire web to your favor.



Link to original content: https://unpaywall.org/10.1007/S10766-016-0429-2
Transparent Speculative Parallelization of Discrete Event Simulation Applications Using Global Variables | International Journal of Parallel Programming Skip to main content
Log in

Transparent Speculative Parallelization of Discrete Event Simulation Applications Using Global Variables

  • Published:
International Journal of Parallel Programming Aims and scope Submit manuscript

Abstract

Parallelizing (compute-intensive) discrete event simulation (DES) applications is a classical approach for speeding up their execution and for making very large/complex simulation models tractable. This has been historically achieved via parallel DES (PDES) techniques, which are based on partitioning the simulation model into distinct simulation objects (somehow resembling objects in classical object-oriented programming), whose states are disjoint, which are executed concurrently and rely on explicit event-exchange (or event-scheduling) primitives as the means to support mutual dependencies and notification of their state updates. With this approach, the application developer is necessarily forced to reason about state separation across the objects, thus being not allowed to rely on shared information, such as global variables, within the application code. This implicitly leads to the shift of the user-exposed programming model to one where sequential-style global variable accesses within the application code are not allowed. In this article we remove this limitation by providing support for managing global variables in the context of DES code developed in ANSI-C, which gets automatically parallelized. Particularly, we focus on speculative (also termed optimistic) PDES systems that run on top of multi-core machines, where simulation objects can concurrently process their events with no guarantee of causal consistency and actual violations of causality rules are recovered through rollback/recovery schemes. In compliance with the nature of speculative processing, in our proposal global variables are transparently mapped to multi-versions, so as to avoid any form of safety predicate verification upon their updates. Consistency is ensured via the introduction of a new rollback/recovery scheme based on detecting global variables’ reads on non-correct versions. At the same time, efficiency in the execution is guaranteed by managing multi-version variables’ lists via non-blocking algorithms. Furthermore, the whole approach is fully transparent, being it based on automatized instrumentation of the application software (particularly ELF objects). Hence the programmer is exposed to the classical (and easy to code) sequential-style programming scheme while accessing any global variable. An experimental assessment of our proposal, based on a suite of case study applications, run on top of an off-the-shelf Linux machine equipped with 32 CPU-cores and 64 GB of RAM, is also presented.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17

Similar content being viewed by others

Notes

  1. An anti-event is an exact copy of the corresponding event, or of its digest, except for a single-bit value.

  2. The only limitation to unbounded speculation is represented by storage constraints for keeping the speculatively processed/produced data records [25].

  3. Figures are drawn with data taken from http://www.top500.org as of March 2014.

  4. Thread-local storage data rely on segment registers to identify data positioning in memory. However, thread-local storage is out of the scope of our proposal, due to its non-global nature.

  5. To check if the space is up, a counter of available free nodes is kept as well in shared memory, which is managed via an atomic decrement operation.

  6. generate_mark can of course return two equal values when the counter overflows, but this situation can happen after a significant wall-clock-time is elapsed, so we consider it to be statistically non-significant for the ABA problem.

  7. All the samples reported in the experimental study result as the average over 5 observations collected in different runs and with different seeds for pseudo-random generation.

  8. Clearly, these synchronization dynamics are mixed with—hence additional to—those already induced via the simulation of the passage of vehicles across different SOBJs.

References

  1. ACI: Dati e statistiche. http://www.aci.it/?id=54

  2. Adya, A., Liskov, B.: Lazy consistency using loosely synchronized clocks. In: PODC, pp. 73–82 (1997)

  3. Antonacci, F., Pellegrini, A., Quaglia, F.: Consistent and efficient output-stream management in optimistic simulation platform. In: Proceedings of the 2013 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, pp. 315–326. ACM (2013)

  4. AUTOMAP: Atlante stradale italia. http://www.automap.it/

  5. Autostrade per L’Italia S.p.A.: Reportistica sul traffico. http://www.autostrade.it/studi/studi_traffico.html

  6. Bauer, D.W., Yaun, G., Carothers, C.D., Yuksel, M., Kalyanaraman, S.: Seven-o’clock: a new distributed gvt algorithm using network atomic operations. In: Proceedings of the 19th Workshop on Parallel and Distributed Simulation, pp. 39–48. IEEE Comp. Soc. (2005)

  7. Bernstein, P.A., Hadzilacos, V., Goodman, N.: Concurrency Control and Recovery in Database Systems. Addison-Wesley Longman Publishing Co. Inc., Boston (1986)

    Google Scholar 

  8. Brown, R.: Calendar queues: a fast O(1) priority queue implementation for the simulation event set problem. Commun. ACM 31, 1220–1227 (1988)

    Article  Google Scholar 

  9. Bruce, D.: The treatment of state in optimistic systems. In: Proceedings of the 9th Workshop on Parallel and Distributed Simulation, pp. 40–49. IEEE Comp. Soc. (1995)

  10. Burns, J., Lynch, N.A.: Mutual exclusion using invisible reads and writes. In: Proceedings of the 18th Annual Allerton Conference on Communication, Control, and Computing, pp. 833–842 (1980)

  11. Cachopo, J., Rito-Silva, A.: Versioned boxes as the basis for memory transactions. Sci. Comput. Program. 63(2), 172–185 (2006)

    Article  MathSciNet  MATH  Google Scholar 

  12. Cai, W., Turner, S.J., Lee, B.S., Zhou, J.: An alternative time management mechanism for distributed simulations. ACM Trans. Model. Comput. Simul. 15(2), 109–137 (2005)

    Article  Google Scholar 

  13. Carothers, C.D., Bauer, D.W., Pearce, S.: ROSS: a high performance modular Time Warp system. In: Proceedings of the 14th Workshop on Parallel and Distributed Simulation, pp. 53–60. IEEE Comp. Soc. (2000)

  14. Carothers, C.D., Bauer, D.W., Pearce, S.: ROSS: a high-performance, low-memory, modular time warp system. J. Parallel Distrib. Comput. 62(11), 1648–1669 (2002)

    Article  MATH  Google Scholar 

  15. Carothers, C.D., Fujimoto, R.M.: Efficient execution of time warp programs on heterogeneous, NOW platforms. IEEE Trans. Parallel Distrib. Syst. 11(3), 299–317 (2000)

    Article  Google Scholar 

  16. Carothers, C.D., Perumalla, K.S.: On deciding between conservative and optimistic approaches on massively parallel platforms. In: Winter Simulation Conference, pp. 678–687 (2010)

  17. Carothers, C.D., Perumalla, K.S., Fujimoto, R.M.: Efficient optimistic parallel simulations using reverse computation. ACM Trans. Model. Comput. Simul. 9(3), 224–253 (1999)

    Article  Google Scholar 

  18. Case, R.P., Padegs, A.: Architecture of the IBM system/370. Commun. ACM 21, 73–96 (1978)

    Article  Google Scholar 

  19. Chandy, K.M., Misra, J.: Distributed simulation: a case study in design and verification of distributed programs. IEEE Trans. Softw. Eng. SE-S 5(5), 440–452 (1979)

    Article  MathSciNet  MATH  Google Scholar 

  20. Chandy, K.M., Sherman, R.: Space–time and simulation. In: Proceedings of the SCS Multiconference on Distributed Simulation, pp. 53–57 (1989)

  21. Chen, L.L., Lu, Y.S., Yao, Y.P., Peng, S.l., Wu, L.D.: A well-balanced time warp system on multi-core environments. In: Proceedings of the 2011 IEEE Workshop on Principles of Advanced and Distributed Simulation, PADS, pp. 1–9. IEEE Comp. Soc. (2011)

  22. Cingolani, D., Pellegrini, A., Quaglia, F.: Transparently mixing undo logs and software reversibility for state recovery in optimistic PDES. In: Proceedings of the 3rd ACM Conference on SIGSIM-Principles of Advanced Discrete Simulation, London, United Kingdom, June 10–12, 2015, pp. 211–222 (2015)

  23. Corporation, I.B.M.: IBM System/370 Extended Architecture, Principles of Operation. IBM Publication No. SA22-7085 (1983)

  24. Cucuzzo, D., D’Alessio, S., Quaglia, F., Romano, P.: A lightweight heuristic-based mechanism for collecting committed consistent global states in optimistic simulation. In: IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications, vol. 0, pp. 227–234. IEEE Comp. Soc., Los Alamitos, CA (2007)

  25. Das, S.R., Fujimoto, R.M.: Adaptive memory management and optimism control in Time Warp. ACM Trans. Model. Comput. Simul. 7(2), 239–271 (1997)

    Article  MATH  Google Scholar 

  26. Easton, W.B.: Process synchronization without long-term interlock. SIGOPS Oper. Syst. Rev. 6(1/2), 95–100 (1972)

    Article  Google Scholar 

  27. Fabbri, A., Donatiello, L.: Sqtw: a mechanism for state-dependent parallel simulation. description and experimental study. In: Proceedings. 11th Workshop on Parallel and Distributed Simulation, 1997, pp. 82–89 (1997). doi:10.1109/PADS.1997.594590

  28. Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of distributed consensus with one faulty process. J. ACM 32(2), 374–382 (1985)

    Article  MathSciNet  MATH  Google Scholar 

  29. Fujimoto, R.M.: Parallel discrete event simulation. Commun. ACM 33(10), 30–53 (1990)

    Article  Google Scholar 

  30. Fujimoto, R.M.: Performance of Time Warp under synthetic workloads. In: Proceedings of the Multiconference on Distributed Simulation, pp. 23–28. Society for Computer Simulation (1990)

  31. Fujimoto, R.M.: Feature article: Parallel discrete event simulation: Will the field survive? INFORMS J. Comput. 5(3), 213–230 (1993)

    Article  Google Scholar 

  32. Fujimoto, R.M.: Exploiting temporal uncertainty in parallel and distributed simulation. In: Proceedings of the 13th Workshop on Parallel and Distributed Simulation, pp. 46–53. IEEE Comp. Soc. (1999)

  33. Fujimoto, R.M., Hybinette, M.: Computing global virtual time in shared-memory multiprocessors. ACM Trans. Model. Comput. Simul. 7(4), 425–446 (1997)

    Article  MATH  Google Scholar 

  34. Gan, B.P., Low, M.Y.H., Wei, J., Wang, X., Turner, S.J., Cai, W.: Synchronization and management of shared state in HLA-based distributed simulation. In: Proceedings of the 2003 Winter Simulation Conference, vol. 1, pp. 847–854. IEEE Computer Society (2003)

  35. Ghosh, K., Fujimoto, R.M.: Parallel discrete event simulation using space-time memory. In: Proceedings of the 1991 International Conference on Parallel Processing, pp. 201–208 (1991)

  36. Glazer, D.W., Tropper, C.: On process migration and load balancing in time warp. IEEE Trans. Parallel Distrib. Syst. 4(3), 318–327 (1993)

    Article  Google Scholar 

  37. Guerraoui, R., Kapalka, M.: On the correctness of transactional memory. In: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP, pp. 175–184. ACM (2008)

  38. Harris, T.: A pragmatic implementation of non-blocking linked-lists. In: Welch, J. (ed.) Distributed Computing. Lecture Notes in Computer Science, vol. 2180, pp. 300–314. Springer, Berlin (2001)

    Google Scholar 

  39. Harris, T.: A pragmatic implementation of non-blocking linked-lists. In: Welch, J. (ed.) Distributed Computing, vol. 2180, pp. 300–314. Springer, Berlin (2001)

    Chapter  Google Scholar 

  40. Herlihy, M.: Wait-free synchronization. ACM Trans. Program. Lang. Syst. 13(1), 124–149 (1991)

    Article  Google Scholar 

  41. Herlihy, M., Shavit, N.: On the nature of progress. In: OPODIS, pp. 313–328 (2011)

  42. Herlihy, M.P., Wing, J.M.: Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12(3), 463–492 (1990)

    Article  Google Scholar 

  43. HPDCS Research Group: ROOT-Sim: The ROme OpTimistic Simulator—v 1.0. http://www.dis.uniroma1.it/~hpdcs/ROOT-Sim/ (2012). https://github.com/HPDCS/ROOT-Sim

  44. IEEE Std 1516-2000: IEEE Standard for Modeling and Simulation (M&S) High Level Architecture (HLA)—Framework and Rules. Tech. rep., Institute of Electrical and Electronics Engineers, Inc., New York, NY (2000)

  45. IEEE Std 1516.1-2000: IEEE Standard for Modeling and Simulation (M&S) High Level Architecture (HLA)—Federate Interface (FI) Specification. Tech. rep., Institute of Electrical and Electronics Engineers, Inc., New York, NY (2000)

  46. Intel Corporation: IA-32 Intel(R) Architecture Software Developer’s Manual, Volume 2: Instruction Set Reference

  47. Jefferson, D.R.: Virtual time. ACM Trans. Program. Lang. Syst. 7(3), 404–425 (1985)

    Article  MathSciNet  Google Scholar 

  48. Kandukuri, S., Boyd, S.: Optimal power control in interference-limited fading wireless channels with outage-probability specifications. IEEE Trans. Wireless Commun. 1(1), 46–55 (2002)

    Article  Google Scholar 

  49. Lamport, L.: Concurrent reading and writing. Commun. ACM 20(11), 806–811 (1977)

    Article  MathSciNet  MATH  Google Scholar 

  50. Low, M.Y.H., Gan, B.P., Wei, J., Wang, X., Turner, S.J., Cai, W.: Shared state synchronization for hla-based distributed simulation. Simulation 82(8), 511–521 (2006)

    Article  Google Scholar 

  51. Martin, D.E., McBrayer, T.J., Wilsey, P.A.: WARPED: a Time Warp simulation kernel for analysis and application development. In: HICSS ’96: Proceedings of the 29th Hawaii International Conference on System Sciences (HICSS’96) Volume 1: Software Technology and Architecture, p. 383. IEEE Comp. Soc. (1996)

  52. Martin, D.E., McBrayer, T.J., Wilsey, P.A.: WARPED: a time warp simulation kernel for analysis and application development. In: Proceedings of the 29th Hawaii International Conference on System Sciences - Volume 1: Software Technology and Architecture, p. 383. IEEE Computer Society, Washington, DC, USA (1996)

  53. Matz, M., Hubicka, J., Jaeger, A., Mitchell, M.: System V Application Binary Interface AMD64 Architecture Processor Supplement (2007). http://www.x86-64.org/documentation.html

  54. Mehl, H., Hammes, S.: How to integrate shared variables in distributed simulation. SIGSIM Simul. Dig. 25(2), 14–41 (1995)

    Article  Google Scholar 

  55. Michael, M.M.: Hazard pointers: Safe memory reclamation for lock-free objects. IEEE Trans. Parallel Distrib. Syst. 15(6), 491–504 (2004)

    Article  Google Scholar 

  56. Nicol, D.M., Liu, X.: The dark side of risk (what your mother never told you about time warp). In: Proceedings of the 11th Workshop on Parallel and Distributed Simulation, PADS, pp. 188–195. IEEE Computer Society (1997)

  57. Pellegrini, A.: Hijacker: Efficient static software instrumentation with applications in high performance computing (poster paper). In: Proceedings of the 2013 International Conference on High Performance Computing & Simulation, HPCS, pp. 650–655. IEEE Computer Society (2013)

  58. Pellegrini, A., Quaglia, F.: Transparent multi-core speculative parallelization of DES models with event and cross-state dependencies. In: SIGSIM Principles of Advanced Discrete Simulation, SIGSIM-PADS ’14, Denver, CO, USA, May 18–21, 2014, pp. 105–116 (2014)

  59. Pellegrini, A., Quaglia, F.: Wait-free global virtual time computation in shared memory time warp systems. In: Proceedings of the 29th IEEE International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), Paris, France (2014)

  60. Pellegrini, A., Vitali, R., Quaglia, F.: Autonomic state management for optimistic simulation platforms. IEEE Trans. Parallel Distrib. Syst. 26(6), 1560–1569 (2015)

    Article  Google Scholar 

  61. Peluso, S., Didona, D., Quaglia, F.: Supports for transparent object-migration in PDES systems. J. Simul. 6(4), 279–293 (2012)

    Article  Google Scholar 

  62. Preiss, B.R., Loucks, W.M., MacIntyre, D.: Effects of the checkpoint interval on time and space in Time Warp. ACM Trans. Model. Comput. Simul. 4(3), 223–253 (1994)

    Article  MATH  Google Scholar 

  63. Quaglia, F.: A cost model for selecting checkpoint positions in Time Warp parallel simulation. IEEE Trans. Parallel Distrib. Syst. 12(4), 346–362 (2001)

    Article  Google Scholar 

  64. Quaglia, F., Baldoni, R.: Exploiting intra-object dependencies in parallel simulation. Inf. Process. Lett. 70(3), 119–125 (1999)

    Article  MathSciNet  MATH  Google Scholar 

  65. Quaglia, F., Santoro, A.: Non-blocking checkpointing for optimistic parallel simulation: description and an implementation. IEEE Trans. Parallel Distrib. Syst. 14(6), 593–610 (2003)

    Article  Google Scholar 

  66. Romano, P., Palmieri, R., Quaglia, F., Carvalho, N., Rodrigues, L.E.T.: On speculative replication of transactional systems. J. Comput. Syst. Sci. 80(1), 257–276 (2014)

    Article  MathSciNet  MATH  Google Scholar 

  67. Shavit, N., Touitou, D.: Software transactional memory. In: Proceedings of the Fourteenth Annual ACM Symposium on Principles of Distributed Computing. PODC ’95, pp. 204–213. ACM, New York, NY (1995)

  68. The SCO Group, Inc.: System V Application Binary Interface, Intel386 Architecture Processor Supplement, fourth edn. (1997). http://www.sco.com/developers/devspecs/

  69. Vincent, R., Fox, D., Ko, J., Konolige, K., Limketkai, B., Morisset, B., Ortiz, C., Schulz, D., Stewart, B.: Distributed multirobot exploration, mapping, and task allocation. Ann. Math. Artif. Intell. 52(2–4), 229–255 (2008)

    Article  MathSciNet  MATH  Google Scholar 

  70. Vitali, R., Pellegrini, A., Quaglia, F.: Benchmarking memory management capabilities within root-sim. In: Proceedings of the 13th IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications. IEEE Comp. Soc. (2009)

  71. Vitali, R., Pellegrini, A., Quaglia, F.: Load sharing for optimistic parallel simulations on multi core machines. SIGMETRICS Perform. Eval. Rev. 40(3), 2–11 (2012)

    Article  Google Scholar 

  72. Wang, J., Jagtap, D., Abu-Ghazaleh, N.B., Ponomarev, D.: Parallel discrete event simulation for multi-core systems: analysis and optimization. IEEE Trans. Parallel Distrib. Syst. 25(6), 1574–1584 (2014)

    Article  Google Scholar 

  73. Young, C.H., Radhakrishnan, R., Wilsey, P.A.: Optimism: not just for event execution anymore. In: Proceedings of the Thirteenth Workshop on Parallel and Distributed Simulation, PADS ’99, Atlanta, GA, USA, May 1–4, 1999, pp. 136–143 (1999)

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Francesco Quaglia.

Appendix

Appendix

figure e
figure f
figure g

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Pellegrini, A., Peluso, S., Quaglia, F. et al. Transparent Speculative Parallelization of Discrete Event Simulation Applications Using Global Variables. Int J Parallel Prog 44, 1200–1247 (2016). https://doi.org/10.1007/s10766-016-0429-2

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10766-016-0429-2

Keywords

Navigation