Abstract
Recent research has found that the activation function (AF) plays a significant role in introducing non-linearity and enhancing the performance of deep learning networks. Researchers have recently begun developing activation functions whose parameters are trained throughout the learning process, known as trainable or adaptive activation functions (AAFs). Research on AAFs that improve learning outcomes is still in its early stages. In this paper, a novel activation function, 'ErfReLU', is developed based on the error function (erf) and the Rectified Linear Unit (ReLU), leveraging the advantages of both. A brief overview of activation functions such as Sigmoid, ReLU, and Tanh, along with their properties, is provided. Adaptive activation functions such as Tanhsoft1, Tanhsoft2, Tanhsoft3, TanhLU, SAAF, ErfAct, Pserf, Smish, and Serf are also presented. Finally, a comparative performance analysis of these nine trainable activation functions against the proposed one is carried out. The activation functions are used in MobileNet, VGG16, and ResNet models, and their performance is evaluated on benchmark datasets such as CIFAR-10, MNIST, and FMNIST.
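To give a concrete picture of how an adaptive activation of this kind can be dropped into a network, the sketch below shows one plausible ReLU-plus-erf blend with a learnable coefficient, implemented as a PyTorch module. The exact ErfReLU formulation is defined in the paper itself; the specific form relu(x) + alpha * erf(x) and the name of the trainable coefficient alpha used here are illustrative assumptions only.

```python
import torch
import torch.nn as nn

class ErfReLU(nn.Module):
    """Illustrative trainable activation combining ReLU and erf.

    NOTE: this is a hypothetical sketch, not the paper's exact
    formulation; it only demonstrates how a ReLU/erf blend with a
    learnable coefficient could be implemented as an adaptive
    activation function in PyTorch.
    """
    def __init__(self, alpha: float = 0.1):
        super().__init__()
        # Trainable coefficient, updated by backpropagation like any other weight.
        self.alpha = nn.Parameter(torch.tensor(alpha))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The ReLU term keeps the piecewise-linear positive response;
        # the erf term adds a smooth, bounded non-linearity whose
        # contribution is scaled by the learnable alpha.
        return torch.relu(x) + self.alpha * torch.erf(x)

# Usage: substitute the module wherever nn.ReLU() would normally appear.
layer = nn.Sequential(nn.Linear(128, 64), ErfReLU())
out = layer(torch.randn(8, 128))
```

Because the coefficient is an nn.Parameter, it is trained jointly with the network weights, which is what distinguishes an adaptive activation function from a fixed one such as plain ReLU.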
Data availability
The datasets used in this article are openly available.
References
Alcaide E (2018) E-swish: adjusting activations to different network depths, pp 1–13. http://arxiv.org/abs/1801.07145
Alkhouly AA, Mohammed A, Hefny HA (2021) Improving the performance of deep neural networks using two proposed activation functions. IEEE Access 9:82249–82271. https://doi.org/10.1109/ACCESS.2021.3085855
Apicella A, Donnarumma F, Isgrò F, Prevete R (2021) A survey on modern trainable activation functions. Neural Netw 138:14–32. https://doi.org/10.1016/j.neunet.2021.01.026
Bingham G, Miikkulainen R (2022) Discovering parametric activation functions. Neural Netw 148:48–65. https://doi.org/10.1016/j.neunet.2022.01.001
Biswas K, Kumar S, Banerjee S, Pandey AK (2021) TanhSoft - dynamic trainable activation functions for faster learning and better performance. IEEE Access 9:120613–120623. https://doi.org/10.1109/ACCESS.2021.3105355
Biswas K, Kumar S, Banerjee S, Pandey AK (2022) ErfAct and Pserf: non-monotonic smooth trainable activation functions. Proce AAAI Conf Artif Intell 36(6):6097–6105. https://doi.org/10.1609/aaai.v36i6.20557
Clevert D-A, Unterthiner T, and Hochreiter S (2015) Fast and accurate deep network learning by exponential linear units (ELUs). In: 4th International Conference on Learning Representations, ICLR 2016 - Conference Track Proceedings, pp 1–14. https://arxiv.org/abs/1511.07289
Dasgupta R, Chowdhury YS, Nanda S (2021) Performance comparison of benchmark activation function ReLU, Swish and Mish for Facial Mask Detection Using Convolutional Neural Network, pp 355–367. https://doi.org/10.1007/978-981-16-2248-9_34
Dubey SR, Singh SK, Chaudhuri BB (2022) Activation functions in deep learning: a comprehensive survey and benchmark. Neurocomputing 503:92–108. https://doi.org/10.1016/j.neucom.2022.06.111
Elfwing S, Uchibe E, Doya K (2018) Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Netw 107:3–11. https://doi.org/10.1016/j.neunet.2017.12.012
Gustineli M (2022) A survey on recently proposed activation functions for Deep Learning. http://arxiv.org/abs/2204.02921
Hao W, Yizhou W, Yaqin L and Zhili S (2020) The role of activation function in CNN. In: Proceedings - 2020 2nd International Conference on Information Technology and Computer Application, ITCA 2020, pp 429–432. https://doi.org/10.1109/ITCA52113.2020.00096
Kamalov F, Nazir A, Safaraliev M, Cherukuri AK, Zgheib R (2021) Comparative analysis of activation functions in neural networks. In: 2021 28th IEEE International Conference on Electronics, Circuits, and Systems, ICECS 2021 - Proceedings. https://doi.org/10.1109/ICECS53924.2021.9665646
Kiliçarslan S, Celik M (2021) RSigELU: a nonlinear activation function for deep neural networks. Expert Syst Appl. https://doi.org/10.1016/j.eswa.2021.114805
Kiseľák J, Lu Y, Švihra J, Szépe P, Stehlík M (2021) “SPOCU”: scaled polynomial constant unit activation function. Neural Comput Appl 33(8):3385–3401. https://doi.org/10.1007/s00521-020-05182-1
Lau MM, Lim KH (2019) Review of adaptive activation function in deep neural network. In: 2018 IEEE EMBS Conference on Biomedical Engineering and Sciences, IECBES 2018 - Proceedings, pp 686–690. https://doi.org/10.1109/IECBES.2018.08626714
Maniatopoulos A, Mitianoudis N (2021) Learnable Leaky ReLU (LeLeLU): an alternative accuracy-optimized activation function. Information (Switzerland). https://doi.org/10.3390/info12120513
Misra D (2019) Mish: a self regularized non-monotonic activation function. http://arxiv.org/abs/1908.08681
Nag S, and Bhattacharyya M (2021) SERF: towards better training of deep neural networks using log-Softplus ERror activation Function. http://arxiv.org/abs/2108.09598
Paul A, Bandyopadhyay R, Yoon JH, Geem ZW, Sarkar R (2022) SinLU: sinu-sigmoidal linear unit. Mathematics. https://doi.org/10.3390/math10030337
Ramachandran P, Zoph B, and Le QV (2017) Searching for activation functions. In: 6th International Conference on Learning Representations, ICLR 2018 - Workshop Track Proceedings, pp 1–13. http://arxiv.org/abs/1710.05941
Roy SK, Manna S, Dubey SR, and Chaudhuri BB (2018) LiSHT: non-parametric linearly scaled hyperbolic tangent activation function for neural networks, pp 1–11. http://arxiv.org/abs/1901.05894
Shen SL, Zhang N, Zhou A, Yin ZY (2022) Enhancement of neural networks with an alternative activation function tanhLU. Expert Syst Appl. https://doi.org/10.1016/j.eswa.2022.117181
Sivri TT, Akman NP, and Berkol A (2022) Multiclass classification using arctangent activation function and its variations, pp 1–6. https://doi.org/10.1109/ecai54874.2022.9847486
Wang X, Ren H, Wang A (2022) Smish: a novel activation function for deep learning methods. Electronics (Switzerland). https://doi.org/10.3390/electronics11040540
Wu L, Wang S, Fang L, Du H (2021) MMReLU: a simple and smooth activation function with high convergence speed. In: 2021 7th International Conference on Computer and Communications, ICCC 2021, pp 1444–1448. https://doi.org/10.1109/ICCC54389.2021.9674529
Zheng B, and Wang Z (2020) PATS: a new neural network activation function with parameter. In: 2020 5th International Conference on Computer and Communication Systems, ICCCS 2020, pp 125–129. https://doi.org/10.1109/ICCCS49078.2020.9118471
Zhou Y, Li D, Huo S, Kung SY (2021) Shape autotuning activation function. Expert Syst Appl. https://doi.org/10.1016/j.eswa.2020.114534
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Rajanand, A., Singh, P. ErfReLU: adaptive activation function for deep neural network. Pattern Anal Applic 27, 68 (2024). https://doi.org/10.1007/s10044-024-01277-w
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s10044-024-01277-w