iBet uBet web content aggregator. Adding the entire web to your favor.

Link to original content: https://api.crossref.org/works/10.1007/S10766-022-00730-9
{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,7]],"date-time":"2024-08-07T11:52:40Z","timestamp":1723031560165},"reference-count":34,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2022,3,29]],"date-time":"2022-03-29T00:00:00Z","timestamp":1648512000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,3,29]],"date-time":"2022-03-29T00:00:00Z","timestamp":1648512000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["146371743"],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100005713","name":"Technische Universit\u00e4t M\u00fcnchen","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100005713","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Parallel Prog"],"published-print":{"date-parts":[[2022,4]]},"abstract":"To minimize power consumption while maximizing performance, today\u2019s multicore processors rely on fine-grained run-time dynamic power information\u2014both in the time domain, e.g.\u00a0\u03bcs to ms, and space domain, e.g.\u00a0core-level. The state-of-the-art for deriving such power information is mainly based on predetermined power models which use linear modeling techniques to determine the core-performance\/core-power relationship. However, with multicore processors becoming ever more complex, linear modeling techniques cannot capture all possible core-performance related power states anymore. 
Although artificial neural networks (ANN) have been proposed for coarse-grained power modeling of servers with time resolutions in the range of seconds, few works have yet investigated fine-grained ANN-based power modeling. In this paper, we explore feed-forward neural networks (FFNNs) for core-level power modeling with estimation rates in the range of 10\u00a0kHz. To achieve a high estimation accuracy while minimizing run-time overhead, we propose a multi-objective-optimization of the neural architecture using NSGA-II with the FFNNs being trained on performance counter and power data from a complex-out-of-order processor architecture. We show that relative power estimation error for the highest accuracy FFNN decreases on average by 7.5% compared to a state-of-the-art linear power modeling approach and decreases by 5.5% compared to a multivariate polynomial regression model. For the FFNNs optimized for both accuracy and overhead, the average error decreases between 4.1% and 6.7% compared to linear modeling while offering significantly lower overhead compared to the highest accuracy FFNN. 
Furthermore, we propose a micro-controller-based and an accelerator-based implementation for run-time inference of the power modeling FFNN and show that the area overhead is negligible.","DOI":"10.1007\/s10766-022-00730-9","type":"journal-article","created":{"date-parts":[[2022,3,29]],"date-time":"2022-03-29T06:14:26Z","timestamp":1648534466000},"page":"243-266","update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Fine-Grained Power Modeling of Multicore Processors Using FFNNs"],"prefix":"10.1007","volume":"50","author":[{"ORCID":"http:\/\/orcid.org\/0000-0001-7380-3793","authenticated-orcid":false,"given":"Mark","family":"Sagi","sequence":"first","affiliation":[]},{"given":"Nguyen Anh","family":"Vu Doan","sequence":"additional","affiliation":[]},{"given":"Nael","family":"Fasfous","sequence":"additional","affiliation":[]},{"given":"Thomas","family":"Wild","sequence":"additional","affiliation":[]},{"given":"Andreas","family":"Herkersdorf","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,3,29]]},"reference":[{"key":"730_CR1","unstructured":"ARM Limited: Cortex-M0 technical reference manual. Technical report (2009)"},{"key":"730_CR2","doi-asserted-by":"crossref","unstructured":"Bertran, R., Gonzelez, M., Martorell, X., Navarro, N., Ayguade, E.: A systematic methodology to generate decomposable and responsive power models for CMPs. IEEE Trans. Comput. (2013)","DOI":"10.1109\/TC.2012.97"},{"key":"730_CR3","unstructured":"Bienia, C.: Benchmarking modern multiprocessors (2011)"},{"key":"730_CR4","doi-asserted-by":"crossref","unstructured":"Bircher, W.L., John, L.K.: Complete system power estimation using processor performance events. IEEE Trans. Comput. 
(2012)","DOI":"10.1109\/TC.2011.47"},{"key":"730_CR5","doi-asserted-by":"crossref","unstructured":"Carlson, T.E., Heirman, W., Eyerman, S., Hur, I., Eeckhout, L.: An evaluation of high-level mechanistic core models. ACM TACO (2014)","DOI":"10.1145\/2629677"},{"key":"730_CR6","doi-asserted-by":"crossref","unstructured":"Chadha, M., Ilsche, T., Bielert, M., Nagel, W.E.: A statistical approach to power estimation for x86 processors. In: Proceedings of the 2017 IEEE 31st international parallel and distributed processing symposium workshops, IPDPSW 2017 (2017)","DOI":"10.1109\/IPDPSW.2017.98"},{"key":"730_CR7","doi-asserted-by":"crossref","unstructured":"Chen, X., Arbor, A., Dick, R.P., Mao, Z.M.: Performance and power modeling in a multi-programmed multi-core environment pp. 813\u2013818 (2010)","DOI":"10.1145\/1837274.1837479"},{"key":"730_CR8","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1007\/978-3-030-66823-5_6","volume-title":"Computer Vision: ECCV 2020 Workshops","author":"X Chu","year":"2020","unstructured":"Chu, X., Zhang, B., Xu, R.: Multi-objective reinforced evolution in mobile neural architecture search. In: Bartoli, A., Fusiello, A. (eds.) Computer Vision: ECCV 2020 Workshops, pp. 99\u2013113. Springer, Cham (2020)"},{"key":"730_CR9","doi-asserted-by":"crossref","unstructured":"Cupertino, L.F., Da Costa, G., Pierson, J.M.: Towards a generic power estimator. Comput. Sci. Res. Develop. (2014)","DOI":"10.1007\/s00450-014-0264-x"},{"key":"730_CR10","unstructured":"Deb, K., Agrawal, R.B.: Simulated binary crossover for continuous search space. Technical report (1994)"},{"key":"730_CR11","doi-asserted-by":"crossref","unstructured":"Deb, K., Agrawal, S.: A niched-penalty approach for constraint handling in genetic algorithms. In: Artificial Neural Nets and Genetic Algorithms, pp. 235\u2013243. 
Springer, Vienna (1999)","DOI":"10.1007\/978-3-7091-6384-9_40"},{"issue":"2","key":"730_CR12","doi-asserted-by":"publisher","first-page":"182","DOI":"10.1109\/4235.996017","volume":"6","author":"K Deb","year":"2002","unstructured":"Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: Nsga-II. IEEE Trans. Evolut. Comput. 6(2), 182\u2013197 (2002). https:\/\/doi.org\/10.1109\/4235.996017","journal-title":"IEEE Trans. Evolut. Comput."},{"key":"730_CR13","doi-asserted-by":"crossref","unstructured":"Huang, G.B., Chen, L., Siew, C.K.: Universal approximation using incremental constructive feedforward networks with random hidden nodes. IEEE Trans. Neural Netw. (2006)","DOI":"10.1109\/TNN.2006.875977"},{"key":"730_CR14","doi-asserted-by":"crossref","unstructured":"Huang, W., Lefurgy, C., Kuk, W., Buyuktosunoglu, A., Floyd, M., Rajamani, K., Allen-Ware, M., Brock, B.: Accurate fine-grained processor power proxies. In: IEEE\/ACM MICRO (2012)","DOI":"10.1109\/MICRO.2012.29"},{"key":"730_CR15","doi-asserted-by":"crossref","unstructured":"Kim, Y., Mercati, P., More, A., Shriver, E., Rosing, T.: P4: Phase-based power\/performance prediction of heterogeneous systems via neural networks. IEEE\/ACM ICCAD (2017)","DOI":"10.1109\/ICCAD.2017.8203843"},{"key":"730_CR16","doi-asserted-by":"crossref","unstructured":"Li, S., Ahn, J.H., Strong, R.D., Brockman, J.B., Tullsen, D.M., Jouppi, N.P.: McPAT: an integrated power, area, and timing modeling framework for multicore and manycore architectures. In: IEEE MICRO (2009)","DOI":"10.1145\/1669112.1669172"},{"key":"730_CR17","doi-asserted-by":"crossref","unstructured":"Lin, W., Wu, G., Wang, X., Li, K.: An artificial neural network approach to power consumption model construction for servers in cloud data centers. IEEE Trans. Sustain. Comput. 
(2019)","DOI":"10.1109\/TSUSC.2019.2910129"},{"key":"730_CR18","doi-asserted-by":"publisher","unstructured":"Lu, Z., Whalen, I., Boddeti, V., Dhebar, Y., Deb, K., Goodman, E., Banzhaf, W.: Nsga-net: neural architecture search using multi-objective genetic algorithm. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO \u201919, pp. 419\u2013427. Association for Computing Machinery, New York, NY, USA (2019). https:\/\/doi.org\/10.1145\/3321707.3321729","DOI":"10.1145\/3321707.3321729"},{"key":"730_CR19","unstructured":"McCullough, J.C., Agarwal, Y., Chandrashekar, J., Kuppuswamy, S., Snoeren, A.C., Gupta, R.K., Diego, U.C.S., Labs, I.: Evaluating the effectiveness of model-based power characterization. Usenix Atc (2011)"},{"key":"730_CR20","doi-asserted-by":"crossref","unstructured":"M\u00f6bius, C., Dargie, W., Schill, A.: Power consumption estimation models for processors, virtual machines, and servers. IEEE TPDS (2014)","DOI":"10.1109\/TPDS.2013.183"},{"key":"730_CR21","doi-asserted-by":"crossref","unstructured":"Pathania, A., Henkel, J.: HotSniper: Sniper-based toolchain for many-core thermal simulations in open systems. IEEE Embedd. Syst. Lett. (2019)","DOI":"10.1109\/LES.2018.2866594"},{"key":"730_CR22","doi-asserted-by":"crossref","unstructured":"Rapp, M., Pathania, A., Mitra, T., Henkel, J.: Prediction-based task migration on S-NUCA many-cores. In: DATE (2019)","DOI":"10.23919\/DATE.2019.8714974"},{"key":"730_CR23","doi-asserted-by":"crossref","unstructured":"Rapp, M., Sagi, M., Pathania, A., Herkersdorf, A., Henkel, J.: Power- and cache-aware task mapping with dynamic power budgeting for many-cores. IEEE Trans. Comput. 
(2019)","DOI":"10.1109\/TC.2019.2935446"},{"issue":"1","key":"730_CR24","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/TC.2019.2935446","volume":"69","author":"M Rapp","year":"2020","unstructured":"Rapp, M., Sagi, M., Pathania, A., Herkersdorf, A., Henkel, J.: Power- and cache-aware task mapping with dynamic power budgeting for many-cores. IEEE Trans. Comput. 69(1), 1\u201313 (2020). https:\/\/doi.org\/10.1109\/TC.2019.2935446","journal-title":"IEEE Trans. Comput."},{"key":"730_CR25","doi-asserted-by":"crossref","unstructured":"Rethinagiri, S.K., Palomar, O., Ben Atitallah, R., Niar, S., Unsal, O., Kestelman, A.C.: System-level power estimation tool for embedded processor based platforms. In: ACM RAPIDO (2014)","DOI":"10.1145\/2555486.2555491"},{"key":"730_CR26","doi-asserted-by":"publisher","first-page":"186","DOI":"10.1007\/978-3-030-60939-9_13","volume-title":"Embedded Computer Systems: Architectures, Modeling, and Simulation","author":"M Sagi","year":"2020","unstructured":"Sagi, M., Vu Doan, N.A., Fasfous, N., Wild, T., Herkersdorf, A.: Fine-grained power modeling of multicore processors using ffnns. In: Orailoglu, A., Jung, M., Reichenbach, M. (eds.) Embedded Computer Systems: Architectures, Modeling, and Simulation, pp. 186\u2013199. Springer, Cham (2020)"},{"key":"730_CR27","doi-asserted-by":"crossref","unstructured":"Samei, Y., D\u00f6mer, R.: Automated estimation of power consumption for rapid system level design. In: IEEE IPCCC (2014)","DOI":"10.1109\/PCCC.2014.7017085"},{"key":"730_CR28","doi-asserted-by":"crossref","unstructured":"Shahid, A., Fahad, M., Manumachu, R.R., Lastovetsky, A.: Improving the accuracy of energy predictive models for multicore cpus using additivity of performance monitoring counters. 
In: Parallel Computing Technologies (2019)","DOI":"10.1007\/978-3-030-25636-4_5"},{"key":"730_CR29","doi-asserted-by":"crossref","unstructured":"Su, B., Gu, J., Shen, L., Huang, W., Greathouse, J.L., Wang, Z.: Ppep: online performance, power, and energy prediction framework and dvfs space exploration. In: IEEE\/ACM MICRO (2014)","DOI":"10.1109\/MICRO.2014.17"},{"key":"730_CR30","doi-asserted-by":"crossref","unstructured":"Umuroglu, Y., Fraser, N.J., Gambardella, G., Blott, M., Leong, P., Jahre, M., Vissers, K.: Finn: A framework for fast, scalable binarized neural network inference. In: Proceedings of the 2017 ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA \u201917, pp. 65\u201374. ACM (2017)","DOI":"10.1145\/3020078.3021744"},{"key":"730_CR31","doi-asserted-by":"publisher","first-page":"270","DOI":"10.1007\/978-3-030-63836-8_23","volume-title":"Neural Information Processing","author":"P Vidnerov\u00e1","year":"2020","unstructured":"Vidnerov\u00e1, P., Neruda, R.: Multi-objective evolution for deep neural network architecture search. In: Yang, H., Pasupa, K., Leung, A.C.S., Kwok, J.T., Chan, J.H., King, I. (eds.) Neural Information Processing, pp. 270\u2013281. Springer, Cham (2020)"},{"key":"730_CR32","doi-asserted-by":"crossref","unstructured":"Walker, M.J., Diestelhorst, S., Hansson, A., Das, A.K., Yang, S., Al-Hashimi, B.M., Merrett, G.V.: Accurate and stable run-time power modeling for mobile and embedded CPUs. IEEE TCAD (2017)","DOI":"10.1109\/TCAD.2016.2562920"},{"key":"730_CR33","doi-asserted-by":"crossref","unstructured":"Woof, S.C., Ohara, M., Torriet, E.: The Splash-2 programs: characterization and methodological considerations. In: ACM ISCA (1995)","DOI":"10.1145\/223982.223990"},{"key":"730_CR34","unstructured":"Wu, W., Lin, W., He, L., Wu, G., Hsu, C.H.: A power consumption model for cloud servers based on elman neural network. IEEE Trans. Cloud Comput. 
(2019)"}],"container-title":["International Journal of Parallel Programming"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10766-022-00730-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10766-022-00730-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10766-022-00730-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,4,29]],"date-time":"2022-04-29T22:04:47Z","timestamp":1651269887000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10766-022-00730-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,29]]},"references-count":34,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,4]]}},"alternative-id":["730"],"URL":"https:\/\/doi.org\/10.1007\/s10766-022-00730-9","relation":{},"ISSN":["0885-7458","1573-7640"],"issn-type":[{"value":"0885-7458","type":"print"},{"value":"1573-7640","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,3,29]]},"assertion":[{"value":"15 April 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 March 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"29 March 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}