


Link to original content: https://api.crossref.org/works/10.1145/3186332
{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,11,19]],"date-time":"2024-11-19T17:34:37Z","timestamp":1732037677047},"reference-count":106,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2018,6,12]],"date-time":"2018-06-12T00:00:00Z","timestamp":1528761600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"EPSRC Centre for Doctoral Training in High Performance Embedded and Distributed Systems HiPEDS","award":["EP\/L016796\/1"]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Comput. Surv."],"published-print":{"date-parts":[[2019,5,31]]},"abstract":"In the past decade, Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance in various Artificial Intelligence tasks. To accelerate the experimentation and development of CNNs, several software frameworks have been released, primarily targeting power-hungry CPUs and GPUs. In this context, reconfigurable hardware in the form of FPGAs constitutes a potential alternative platform that can be integrated in the existing deep-learning ecosystem to provide a tunable balance between performance, power consumption, and programmability. In this article, a survey of the existing CNN-to-FPGA toolflows is presented, comprising a comparative study of their key characteristics, which include the supported applications, architectural choices, design space exploration methods, and achieved performance. Moreover, major challenges and objectives introduced by the latest trends in CNN algorithmic research are identified and presented. 
Finally, a uniform evaluation methodology is proposed, aiming at the comprehensive, complete, and in-depth evaluation of CNN-to-FPGA toolflows.","DOI":"10.1145\/3186332","type":"journal-article","created":{"date-parts":[[2018,6,12]],"date-time":"2018-06-12T18:12:29Z","timestamp":1528827149000},"page":"1-39","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":122,"title":["Toolflows for Mapping Convolutional Neural Networks on FPGAs"],"prefix":"10.1145","volume":"51","author":[{"ORCID":"http:\/\/orcid.org\/0000-0001-5181-6251","authenticated-orcid":false,"given":"Stylianos I.","family":"Venieris","sequence":"first","affiliation":[{"name":"Imperial College London, London, UK"}]},{"given":"Alexandros","family":"Kouris","sequence":"additional","affiliation":[{"name":"Imperial College London, London, UK"}]},{"ORCID":"http:\/\/orcid.org\/0000-0001-5181-6251","authenticated-orcid":false,"given":"Christos-Savvas","family":"Bouganis","sequence":"additional","affiliation":[{"name":"Imperial College London, London, UK"}]}],"member":"320","published-online":{"date-parts":[[2018,6,12]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/2967413.2967430"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/LES.2017.2743247"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123939.3123982"},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN\u201917)","author":"Alemdar H.","unstructured":"H. Alemdar , V. Leroy , A. Prost-Boucle , and F. P\u00e9trot . 2017. Ternary neural networks for resource-efficient AI applications . In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN\u201917) . 2547--2554. H. Alemdar, V. Leroy, A. Prost-Boucle, and F. P\u00e9trot. 2017. Ternary neural networks for resource-efficient AI applications. 
In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN\u201917). 2547--2554."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.5555\/3195638.3195664"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021738"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2016.2644615"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2514740"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.5555\/3195638.3195647"},{"key":"e_1_2_1_10_1","unstructured":"Andre Xian Ming Chang Aliasger Zaidy Vinayak Gokhale and Eugenio Culurciello. 2017. Compiling deep learning models for custom hardware accelerators. arXiv:1708.00117. Andre Xian Ming Chang Aliasger Zaidy Vinayak Gokhale and Eugenio Culurciello. 2017. Compiling deep learning models for custom hardware accelerators. arXiv:1708.00117."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.312"},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN\u201917)","author":"Chen X.","unstructured":"X. Chen , X. Hu , H. Zhou , and N. Xu . 2017. FxpNet: Training a deep convolutional neural network in fixed-point representation . In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN\u201917) . 2494--2501. X. Chen, X. Hu, H. Zhou, and N. Xu. 2017. FxpNet: Training a deep convolutional neural network in fixed-point representation. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN\u201917). 2494--2501."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080248"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/2968826.2968968"},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of the 2016 International Conference on Field-Programmable Technology (FPT\u201916)","author":"DiCecco R.","unstructured":"R. DiCecco , G. Lacey , J. 
Vasiljevic , P. Chow , G. Taylor , and S. Areibi . 2016. Caffeinated FPGAs: FPGA framework for convolutional neural networks . In Proceedings of the 2016 International Conference on Field-Programmable Technology (FPT\u201916) . 265--268. R. DiCecco, G. Lacey, J. Vasiljevic, P. Chow, G. Taylor, and S. Areibi. 2016. Caffeinated FPGAs: FPGA framework for convolutional neural networks. In Proceedings of the 2016 International Conference on Field-Programmable Technology (FPT\u201916). 265--268."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2016.2545298"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1038\/nature21056"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCAS.2010.5537908"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3029580.3029586"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2017.8206247"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCAS.2017.8050809"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/FCCM.2017.25"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISVLSI.2016.129"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2017.2705069"},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the 32nd International Conference on Machine Learning (ICML\u201915)","author":"Gupta Suyog","year":"2015","unstructured":"Suyog Gupta , Ankur Agrawal , Kailash Gopalakrishnan , and Pritish Narayanan . 2015 . Deep learning with limited numerical precision . In Proceedings of the 32nd International Conference on Machine Learning (ICML\u201915) . 1737--1746. Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, and Pritish Narayanan. 2015. Deep learning with limited numerical precision. In Proceedings of the 32nd International Conference on Machine Learning (ICML\u201915). 
1737--1746."},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the Workshop Contribution at International Conference on Learning Representations (ICLR\u201916)","author":"Gysel Philipp","year":"2016","unstructured":"Philipp Gysel , Mohammad Motamedi , and Soheil Ghiasi . 2016 . Hardware-oriented approximation of convolutional neural networks . In Proceedings of the Workshop Contribution at International Conference on Learning Representations (ICLR\u201916) . Philipp Gysel, Mohammad Motamedi, and Soheil Ghiasi. 2016. Hardware-oriented approximation of convolutional neural networks. In Proceedings of the Workshop Contribution at International Conference on Learning Representations (ICLR\u201916)."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1815961.1815968"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021745"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.30"},{"key":"e_1_2_1_30_1","volume-title":"Proceedings of the International Conference on Learning Representations (ICLR\u201916)","author":"Han Song","unstructured":"Song Han , Huizi Mao , and William J. Dally . 2016. Deep compression: Compressing deep neural network with pruning, trained quantization and Huffman coding . In Proceedings of the International Conference on Learning Representations (ICLR\u201916) . Song Han, Huizi Mao, and William J. Dally. 2016. Deep compression: Compressing deep neural network with pruning, trained quantization and Huffman coding. In Proceedings of the International Conference on Learning Representations (ICLR\u201916)."},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS\u201915)","author":"Han Song","year":"2015","unstructured":"Song Han , Jeff Pool , John Tran , and William Dally . 2015 . Learning both weights and connections for efficient neural network . 
In Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS\u201915) . 1135--1143. Song Han, Jeff Pool, John Tran, and William Dally. 2015. Learning both weights and connections for efficient neural network. In Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS\u201915). 1135--1143."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.5555\/3130379.3130725"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.210171"},{"key":"e_1_2_1_36_1","volume-title":"Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917)","author":"Huang Gao","unstructured":"Gao Huang , Zhuang Liu , Laurens van der Maaten, and Kilian Q. Weinberger. 2017. Densely connected convolutional networks . In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917) . 2261--2269. Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger. 2017. Densely connected convolutional networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917). 2261--2269."},{"key":"e_1_2_1_37_1","unstructured":"Itay Hubara Matthieu Courbariaux Daniel Soudry Ran El-Yaniv and Yoshua Bengio. 2016. Binarized neural networks. In Advances in Neural Information Processing Systems 29. 4107--4115. Itay Hubara Matthieu Courbariaux Daniel Soudry Ran El-Yaniv and Yoshua Bengio. 2016. Binarized neural networks. In Advances in Neural Information Processing Systems 29. 4107--4115."},{"key":"e_1_2_1_38_1","unstructured":"Itay Hubara Matthieu Courbariaux Daniel Soudry Ran El-Yaniv and Yoshua Bengio. 2016. Quantized neural networks: Training neural networks with low precision weights and activations. arXiv:1609.07061. 
Itay Hubara Matthieu Courbariaux Daniel Soudry Ran El-Yaniv and Yoshua Bengio. 2016. Quantized neural networks: Training neural networks with low precision weights and activations. arXiv:1609.07061."},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the 2014 International Conference on Field-Programmable Technology (FPT\u201914)","author":"Inggs G.","unstructured":"G. Inggs , S. Fleming , D. Thomas , and W. Luk . 2014. Is high level synthesis ready for business? A computational finance case study . In Proceedings of the 2014 International Conference on Field-Programmable Technology (FPT\u201914) . 12--19. G. Inggs, S. Fleming, D. Thomas, and W. Luk. 2014. Is high level synthesis ready for business? A computational finance case study. In Proceedings of the 2014 International Conference on Field-Programmable Technology (FPT\u201914). 12--19."},{"key":"e_1_2_1_40_1","volume-title":"Proceedings of the 32nd International Conference on Machine Learning (ICML\u201915)","author":"Ioffe Sergey","year":"2015","unstructured":"Sergey Ioffe and Christian Szegedy . 2015 . Batch normalization: Accelerating deep network training by reducing internal covariate shift . In Proceedings of the 32nd International Conference on Machine Learning (ICML\u201915) . 448--456. Sergey Ioffe and Christian Szegedy. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning (ICML\u201915). 448--456."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.5244\/C.28.88"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080246"},{"key":"e_1_2_1_43_1","volume-title":"Proceedings of the 2016 49th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201916)","author":"Judd P.","unstructured":"P. Judd , J. Albericio , T. Hetherington , T. M. Aamodt , and A. Moshovos . 2016. Stripes: Bit-serial deep neural network computing . 
In Proceedings of the 2016 49th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201916) . 1--12. P. Judd, J. Albericio, T. Hetherington, T. M. Aamodt, and A. Moshovos. 2016. Stripes: Bit-serial deep neural network computing. In Proceedings of the 2016 49th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201916). 1--12."},{"key":"e_1_2_1_44_1","volume-title":"Proceedings of the 2017 30th IEEE International System-on-Chip Conference (SOCC\u201917)","author":"Kim J. H.","unstructured":"J. H. Kim , B. Grady , R. Lian , J. Brothers , and J. H. Anderson . 2017. FPGA-based CNN inference accelerator synthesized from multi-threaded C software . In Proceedings of the 2017 30th IEEE International System-on-Chip Conference (SOCC\u201917) . 268--273. J. H. Kim, B. Grady, R. Lian, J. Brothers, and J. H. Anderson. 2017. FPGA-based CNN inference accelerator synthesized from multi-threaded C software. In Proceedings of the 2017 30th IEEE International System-on-Chip Conference (SOCC\u201917). 268--273."},{"key":"e_1_2_1_45_1","volume-title":"Flexpoint: An adaptive numerical format for efficient training of deep neural networks. In Advances in Neural Information Processing Systems 30. 1740--1750.","author":"K\u00f6ster Urs","year":"2017","unstructured":"Urs K\u00f6ster , Tristan Webb , Xin Wang , Marcel Nassar , Arjun K. Bansal , William Constable , Oguz Elibol , Stewart Hall , Luke Hornof , Amir Khosrowshahi , Carey Kloss , Ruby J. Pai , and Naveen Rao . 2017 . Flexpoint: An adaptive numerical format for efficient training of deep neural networks. In Advances in Neural Information Processing Systems 30. 1740--1750. Urs K\u00f6ster, Tristan Webb, Xin Wang, Marcel Nassar, Arjun K. Bansal, William Constable, Oguz Elibol, Stewart Hall, Luke Hornof, Amir Khosrowshahi, Carey Kloss, Ruby J. Pai, and Naveen Rao. 2017. Flexpoint: An adaptive numerical format for efficient training of deep neural networks. 
In Advances in Neural Information Processing Systems 30. 1740--1750."},{"key":"e_1_2_1_46_1","volume-title":"Hinton","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky , Ilya Sutskever , and Geoffrey E . Hinton . 2012 . ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25. 1097--1105. Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25. 1097--1105."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/PROC.1987.13876"},{"key":"e_1_2_1_49_1","volume-title":"Proceedings of the 2016 26th International Conference on Field Programmable Logic and Applications (FPL\u201916)","author":"Li Huimin","year":"2016","unstructured":"Huimin Li , Xitian Fan , Li Jiao , Wei Cao , Xuegong Zhou , and Lingli Wang . 2016 . A high performance FPGA-based accelerator for large-scale convolutional neural networks . In Proceedings of the 2016 26th International Conference on Field Programmable Logic and Applications (FPL\u201916) . 1--9. Huimin Li, Xitian Fan, Li Jiao, Wei Cao, Xuegong Zhou, and Lingli Wang. 2016. A high performance FPGA-based accelerator for large-scale convolutional neural networks. In Proceedings of the 2016 26th International Conference on Field Programmable Logic and Applications (FPL\u201916). 1--9."},{"key":"e_1_2_1_50_1","volume-title":"Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH\u201913)","author":"Li Jinyu","year":"2013","unstructured":"Jinyu Li , Jian Xue , and Yifan Gong . 2013 . Restructuring of deep neural network acoustic models with singular value decomposition . In Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH\u201913) . 
Jinyu Li, Jian Xue, and Yifan Gong. 2013. Restructuring of deep neural network acoustic models with singular value decomposition. In Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH\u201913)."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2017.09.046"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7299170"},{"key":"e_1_2_1_53_1","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV\u201916)","author":"Liu Wei","unstructured":"Wei Liu , Dragomir Anguelov , Dumitru Erhan , Christian Szegedy , Scott Reed , Cheng-Yang Fu , and Alexander C. Berg . 2016. SSD: Single shot multibox detector . In Proceedings of the European Conference on Computer Vision (ECCV\u201916) . 21--37. Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg. 2016. SSD: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision (ECCV\u201916). 21--37."},{"key":"e_1_2_1_54_1","volume-title":"Proceedings of the 2016 International Conference on Field-Programmable Technology (FPT\u201916)","author":"Liu Zhiqiang","year":"2016","unstructured":"Zhiqiang Liu , Yong Dou , Jingfei Jiang , and Jinwei Xu . 2016 . Automatic code generation of convolutional neural networks in FPGA implementation . In Proceedings of the 2016 International Conference on Field-Programmable Technology (FPT\u201916) . 61--68. Zhiqiang Liu, Yong Dou, Jingfei Jiang, and Jinwei Xu. 2016. Automatic code generation of convolutional neural networks in FPGA implementation. In Proceedings of the 2016 International Conference on Field-Programmable Technology (FPT\u201916). 
61--68."},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.23919\/FPL.2017.8056824"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021736"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCAS.2017.8050344"},{"key":"e_1_2_1_58_1","volume-title":"Proceedings of the 2016 26th International Conference on Field Programmable Logic and Applications (FPL\u201916)","author":"Ma Yufei","year":"2016","unstructured":"Yufei Ma , Naveen Suda , Yu Cao , Jae Sun Seo , and Sarma Vrudhula . 2016 . Scalable and modularized RTL compilation of convolutional neural networks onto FPGA . In Proceedings of the 2016 26th International Conference on Field Programmable Logic and Applications (FPL\u201916) . 1--8. Yufei Ma, Naveen Suda, Yu Cao, Jae Sun Seo, and Sarma Vrudhula. 2016. Scalable and modularized RTL compilation of convolutional neural networks onto FPGA. In Proceedings of the 2016 26th International Conference on Field Programmable Logic and Applications (FPL\u201916). 1--8."},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.vlsi.2017.12.009"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2010-343"},{"key":"e_1_2_1_61_1","volume-title":"Proceedings of the 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC\u201916)","author":"Motamedi M.","unstructured":"M. Motamedi , P. Gysel , V. Akella , and S. Ghiasi . 2016. Design space exploration of FPGA-based deep convolutional neural networks . In Proceedings of the 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC\u201916) . 575--580. M. Motamedi, P. Gysel, V. Akella, and S. Ghiasi. 2016. Design space exploration of FPGA-based deep convolutional neural networks. In Proceedings of the 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC\u201916). 
575--580."},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/3131289"},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1109\/FPL.2016.7577314"},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021740"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080254"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123939.3123979"},{"key":"e_1_2_1_67_1","volume-title":"Proceedings of the 2017 27th International Conference on Field Programmable Logic and Applications (FPL\u201917)","author":"Prost-Boucle A.","unstructured":"A. Prost-Boucle , A. Bourge , F. P\u00e9trot , H. Alemdar , N. Caldwell , and V. Leroy . 2017. Scalable high-performance architecture for convolutional ternary neural networks on FPGA . In Proceedings of the 2017 27th International Conference on Field Programmable Logic and Applications (FPL\u201917) . 1--7. A. Prost-Boucle, A. Bourge, F. P\u00e9trot, H. Alemdar, N. Caldwell, and V. Leroy. 2017. Scalable high-performance architecture for convolutional ternary neural networks on FPGA. In Proceedings of the 2017 27th International Conference on Field Programmable Logic and Applications (FPL\u201917). 1--7."},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1145\/2847263.2847265"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46493-0_32"},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2015.7139361"},{"key":"e_1_2_1_71_1","unstructured":"Colin R. Reeves (Ed.). 1993. Modern Heuristic Techniques for Combinatorial Problems. John Wiley & Sons New York NY. Colin R. Reeves (Ed.). 1993. Modern Heuristic Techniques for Combinatorial Problems. 
John Wiley & Sons New York NY."},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2016.2577031"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-78890-6_1"},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2016.2593647"},{"key":"e_1_2_1_75_1","volume-title":"Proceedings of the Workshop on Cognitive Architectures.","author":"Sharma Hardik","year":"2016","unstructured":"Hardik Sharma , Jongse Park , Emmanuel Amaro , Bradley Thwaites , Praneetha Kotha , Anmol Gupta , Joon Kyung Kim , Asit Mishra , and Hadi Esmaeilzadeh . 2016 . DnnWeaver: From high-level deep network models to FPGA acceleration . In Proceedings of the Workshop on Cognitive Architectures. Hardik Sharma, Jongse Park, Emmanuel Amaro, Bradley Thwaites, Praneetha Kotha, Anmol Gupta, Joon Kyung Kim, Asit Mishra, and Hadi Esmaeilzadeh. 2016. DnnWeaver: From high-level deep network models to FPGA acceleration. In Proceedings of the Workshop on Cognitive Architectures."},{"key":"e_1_2_1_76_1","doi-asserted-by":"publisher","DOI":"10.5555\/3195638.3195659"},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1109\/FCCM.2017.47"},{"key":"e_1_2_1_78_1","volume-title":"Proceedings of the International Conference on Learning Representations (ICLR\u201915)","author":"Simonyan K.","unstructured":"K. Simonyan and A. Zisserman . 2015. Very deep convolutional networks for large-scale image recognition . In Proceedings of the International Conference on Learning Representations (ICLR\u201915) . K. Simonyan and A. Zisserman. 2015. Very deep convolutional networks for large-scale image recognition. 
In Proceedings of the International Conference on Learning Representations (ICLR\u201915)."},{"key":"e_1_2_1_79_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2017.8206285"},{"key":"e_1_2_1_80_1","doi-asserted-by":"publisher","DOI":"10.5555\/2627435.2670313"},{"key":"e_1_2_1_81_1","doi-asserted-by":"publisher","DOI":"10.1145\/2847263.2847276"},{"key":"e_1_2_1_82_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v31i1.11231"},{"key":"e_1_2_1_83_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"e_1_2_1_84_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021744"},{"key":"e_1_2_1_85_1","volume-title":"Proceedings of the Workshop on Machine Learning on the Phone and Other Consumer Devices (MLPCD\u201917)","author":"Stylianos","unstructured":"Stylianos I. Venieris and Christos-Savvas Bouganis. 2017. fpgaConvNet: A toolflow for mapping diverse convolutional neural networks on embedded FPGAs . In Proceedings of the Workshop on Machine Learning on the Phone and Other Consumer Devices (MLPCD\u201917) . Stylianos I. Venieris and Christos-Savvas Bouganis. 2017. fpgaConvNet: A toolflow for mapping diverse convolutional neural networks on embedded FPGAs. In Proceedings of the Workshop on Machine Learning on the Phone and Other Consumer Devices (MLPCD\u201917)."},{"key":"e_1_2_1_86_1","volume-title":"Proceedings of the 2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM\u201916)","author":"Stylianos","unstructured":"Stylianos I. Venieris and Christos-Savvas Bouganis. 2016. fpgaConvNet: A framework for mapping convolutional neural networks on FPGAs . In Proceedings of the 2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM\u201916) . 40--47. Stylianos I. Venieris and Christos-Savvas Bouganis. 2016. fpgaConvNet: A framework for mapping convolutional neural networks on FPGAs. 
In Proceedings of the 2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM\u201916). 40--47."},{"key":"e_1_2_1_87_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021791"},{"key":"e_1_2_1_88_1","volume-title":"Proceedings of the 2017 27th International Conference on Field Programmable Logic and Applications (FPL\u201917)","author":"Stylianos","unstructured":"Stylianos I. Venieris and Christos-Savvas Bouganis. 2017. Latency-driven design for FPGA-based convolutional neural networks . In Proceedings of the 2017 27th International Conference on Field Programmable Logic and Applications (FPL\u201917) . 1--8. Stylianos I. Venieris and Christos-Savvas Bouganis. 2017. Latency-driven design for FPGA-based convolutional neural networks. In Proceedings of the 2017 27th International Conference on Field Programmable Logic and Applications (FPL\u201917). 1--8."},{"key":"e_1_2_1_89_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2016.2587640"},{"key":"e_1_2_1_90_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897937.2898003"},{"key":"e_1_2_1_91_1","doi-asserted-by":"publisher","DOI":"10.1145\/3061639.3062207"},{"key":"e_1_2_1_92_1","unstructured":"Wei Wen Cong Xu Feng Yan Chunpeng Wu Yandan Wang Yiran Chen and Hai Li. 2017. TernGrad: Ternary gradients to reduce communication in distributed deep learning. In Advances in Neural Information Processing Systems 30. 1508--1518. Wei Wen Cong Xu Feng Yan Chunpeng Wu Yandan Wang Yiran Chen and Hai Li. 2017. TernGrad: Ternary gradients to reduce communication in distributed deep learning. In Advances in Neural Information Processing Systems 30. 
1508--1518."},{"key":"e_1_2_1_93_1","doi-asserted-by":"publisher","DOI":"10.1145\/1498765.1498785"},{"key":"e_1_2_1_94_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10590-1_53"},{"key":"e_1_2_1_95_1","doi-asserted-by":"publisher","DOI":"10.1145\/3174243.3174265"},{"key":"e_1_2_1_96_1","doi-asserted-by":"publisher","DOI":"10.1109\/RECONFIG.2017.8279792"},{"key":"e_1_2_1_98_1","doi-asserted-by":"publisher","DOI":"10.1145\/2966986.2967011"},{"key":"e_1_2_1_99_1","doi-asserted-by":"publisher","DOI":"10.1145\/2684746.2689060"},{"key":"e_1_2_1_100_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021727"},{"key":"e_1_2_1_101_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021698"},{"key":"e_1_2_1_102_1","volume-title":"Proceedings of the 2016 49th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201916)","author":"Zhang S.","unstructured":"S. Zhang , Z. Du , L. Zhang , H. Lan , S. Liu , L. Li , Q. Guo , T. Chen , and Y. Chen . 2016. Cambricon-X: An accelerator for sparse neural networks . In Proceedings of the 2016 49th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201916) . 1--12. S. Zhang, Z. Du, L. Zhang, H. Lan, S. Liu, L. Li, Q. Guo, T. Chen, and Y. Chen. 2016. Cambricon-X: An accelerator for sparse neural networks. In Proceedings of the 2016 49th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201916). 1--12."},{"key":"e_1_2_1_103_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021741"},{"key":"e_1_2_1_104_1","volume-title":"Proceedings of the 2016 IEEE 27th International Conference on Application-Specific Systems, Architectures, and Processors (ASAP\u201916)","author":"Zhao Wenlai","year":"2016","unstructured":"Wenlai Zhao , Haohuan Fu , Wayne Luk , Teng Yu , Shaojun Wang , Bo Feng , Yuchun Ma , and Guangwen Yang . 2016 . F-CNN: An FPGA-based framework for training convolutional neural networks . 
In Proceedings of the 2016 IEEE 27th International Conference on Application-Specific Systems, Architectures, and Processors (ASAP\u201916) . 107--114. Wenlai Zhao, Haohuan Fu, Wayne Luk, Teng Yu, Shaojun Wang, Bo Feng, Yuchun Ma, and Guangwen Yang. 2016. F-CNN: An FPGA-based framework for training convolutional neural networks. In Proceedings of the 2016 IEEE 27th International Conference on Application-Specific Systems, Architectures, and Processors (ASAP\u201916). 107--114."},{"key":"e_1_2_1_105_1","volume-title":"Proceedings of the International Conference on Learning Representations (ICLR\u201917)","author":"Zhou Aojun","year":"2017","unstructured":"Aojun Zhou , Anbang Yao , Yiwen Guo , Lin Xu , and Yurong Chen . 2017 . Incremental network quantization: Towards lossless CNNs with low-precision weights . In Proceedings of the International Conference on Learning Representations (ICLR\u201917) . Aojun Zhou, Anbang Yao, Yiwen Guo, Lin Xu, and Yurong Chen. 2017. Incremental network quantization: Towards lossless CNNs with low-precision weights. In Proceedings of the International Conference on Learning Representations (ICLR\u201917)."},{"key":"e_1_2_1_106_1","unstructured":"Shuchang Zhou Zekun Ni Xinyu Zhou He Wen Yuxin Wu and Yuheng Zou. 2016. DoReFa-net: Training low bitwidth convolutional neural networks with low bitwidth gradients. arXiv:1601.06160. Shuchang Zhou Zekun Ni Xinyu Zhou He Wen Yuxin Wu and Yuheng Zou. 2016. DoReFa-net: Training low bitwidth convolutional neural networks with low bitwidth gradients. arXiv:1601.06160."},{"key":"e_1_2_1_107_1","volume-title":"Proceedings of the International Conference on Learning Representations (ICLR\u201917)","author":"Zhu Chenzhuo","unstructured":"Chenzhuo Zhu , Song Han , Huizi Mao , and William J. Dally . 2017. Trained ternary quantization . In Proceedings of the International Conference on Learning Representations (ICLR\u201917) . Chenzhuo Zhu, Song Han, Huizi Mao, and William J. Dally. 2017. 
Trained ternary quantization. In Proceedings of the International Conference on Learning Representations (ICLR\u201917)."}],"container-title":["ACM Computing Surveys"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3186332","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,31]],"date-time":"2022-12-31T19:18:34Z","timestamp":1672514314000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3186332"}},"subtitle":["A Survey and Future Directions"],"short-title":[],"issued":{"date-parts":[[2018,6,12]]},"references-count":106,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2019,5,31]]}},"alternative-id":["10.1145\/3186332"],"URL":"https:\/\/doi.org\/10.1145\/3186332","relation":{},"ISSN":["0360-0300","1557-7341"],"issn-type":[{"value":"0360-0300","type":"print"},{"value":"1557-7341","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,6,12]]},"assertion":[{"value":"2017-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-02-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-06-12","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}