Abstract
Estimating the capacity of a room or venue is essential to avoid overcrowding that could compromise people’s safety. Having enough free space to guarantee a minimum safety distance between people is also essential for health reasons, as in the current COVID-19 pandemic. Existing systems for automatic crowd counting are mostly based on image or video data, and some of them use deep learning architectures. In this paper, we study the viability of existing deep learning crowd counting systems and propose new alternatives, based on new network architectures containing convolutional layers, that rely exclusively on environmental audio signals. The proposed architecture is able to infer the actual capacity with higher accuracy than previous proposals. Finally, conclusions are drawn from the accuracy obtained with our approach, and the possible scope of deep learning based crowd counting systems is discussed.
Acknowledgements
This work was supported by projects PGC2018-098813-B-C32 (Spanish “Ministerio de Ciencia, Innovación y Universidades”), UMA20-FEDERJA-086 (Consejería de Economía y Conocimiento, Junta de Andalucía) and by European Regional Development Funds (ERDF), as well as the BioSiP (TIC-251) research group. Work by F.J.M.M. was supported by the MICINN “Juan de la Cierva - Incorporación” IJC2019-038835-I Fellowship.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Reyes-Daneri, C., Martínez-Murcia, F.J., Ortiz, A. (2022). Capacity Estimation from Environmental Audio Signals Using Deep Learning. In: Ferrández Vicente, J.M., Álvarez-Sánchez, J.R., de la Paz López, F., Adeli, H. (eds) Artificial Intelligence in Neuroscience: Affective Analysis and Health Applications. IWINAC 2022. Lecture Notes in Computer Science, vol 13258. Springer, Cham. https://doi.org/10.1007/978-3-031-06242-1_12
DOI: https://doi.org/10.1007/978-3-031-06242-1_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-06241-4
Online ISBN: 978-3-031-06242-1
eBook Packages: Computer Science, Computer Science (R0)