Abstract
Time-to-solution is an important metric when parallelizing existing code. The REPARA approach provides a systematic way to instantiate stream and data parallel patterns by annotating the sequential source code with C++11 attributes. The annotations are automatically transformed into target parallel code that uses existing libraries for parallel programming (e.g., FastFlow). In this paper, we apply this approach to the parallelization of a data stream processing application. The description shows the effectiveness of the approach in easily and quickly prototyping several parallel variants of the sequential code while obtaining good overall performance in terms of both throughput and latency.
Notes
At the time of writing, this phase is performed by hand and is not fully automated.
REPARA imposes restrictions on the source code when targeting specific hardware.
The trades and quotes NASDAQ tracefile of 30 Oct 2014, downloadable at http://www.nyxdata.com.
Acknowledgments
This work was partially supported by the EU FP7 project REPARA (ICT-609666).
Cite this article
Danelutto, M., De Matteis, T., Mencagli, G. et al. Data stream processing via code annotations. J Supercomput 74, 5659–5673 (2018). https://doi.org/10.1007/s11227-016-1793-9
DOI: https://doi.org/10.1007/s11227-016-1793-9