Abstract
Efficient utilisation of chip multiprocessors (CMPs) requires virtualisation, which forces multiple applications to contend for the same network resources and memory bandwidth. In this paper we study the causes and effects of network congestion with respect to traffic local to the applications and traffic caused by memory accesses. The study reveals that applications close to the memory controller suffer from congestion caused by memory controller traffic from other applications. We present a simple mechanism to reduce head-of-line blocking in the switches, which efficiently reduces network congestion, increases network performance, and evens out the performance differences between the CMP applications.
This work has been supported by the project NaNoC (grant agreement no. 248972) which is funded by the European Commission within the Research Programme FP7.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rodrigo, S., Sem-Jacobsen, F.O., Tatenguem, H., Skeie, T., Bertozzi, D. (2012). Cost-Effective Contention Avoidance in a CMP with Shared Memory Controllers. In: Kaklamanis, C., Papatheodorou, T., Spirakis, P.G. (eds) Euro-Par 2012 Parallel Processing. Euro-Par 2012. Lecture Notes in Computer Science, vol 7484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32820-6_73
DOI: https://doi.org/10.1007/978-3-642-32820-6_73
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32819-0
Online ISBN: 978-3-642-32820-6
eBook Packages: Computer Science (R0)