Detection and localization of audio event for home surveillance using CRNN
Abstract
Safety and security are a high priority in people’s lives, and a home surveillance system helps keep people and their property secure. This paper proposes an audio surveillance system that both detects and localizes audio (sound) events. The combined task of detecting and localizing audio events is known as Sound Event Localization and Detection (SELD). In this work, SELD is performed with a Convolutional Recurrent Neural Network (CRNN) architecture. The CRNN stacks convolutional neural network (CNN) layers, recurrent neural network (RNN) layers, and fully connected neural network (FNN) layers. It takes multichannel audio as input, extracts features, and performs detection and localization of the input audio events in parallel. The SELD results obtained by the CRNN with gated recurrent units (GRU) and with long short-term memory (LSTM) units are compared and discussed. The CRNN with LSTM units gives a 75% F1 score and 82.8% frame recall for one overlapping sound. Therefore, the proposed audio surveillance system with LSTM units produces better detection and overall performance for one overlapping sound.
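The abstract reports an F1 score and a frame recall but does not define them. The sketch below is one possible reading, assuming frame-level definitions often used in SELD work: F1 computed over per-frame, per-class activity, and frame recall as the fraction of frames in which the predicted number of active events equals the reference number. The function names `frame_f1` and `frame_recall` and the toy data are hypothetical, and the paper's exact evaluation (for example, segment-based rather than frame-based F1) may differ.

```python
def frame_f1(ref, pred):
    """F1 over (frame, class) activity cells; ref and pred are lists of 0/1 rows."""
    tp = fp = fn = 0
    for ref_frame, pred_frame in zip(ref, pred):
        for r, p in zip(ref_frame, pred_frame):
            if r and p:
                tp += 1      # event active in both reference and prediction
            elif p:
                fp += 1      # predicted but not in reference
            elif r:
                fn += 1      # in reference but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def frame_recall(ref, pred):
    """Fraction of frames whose predicted active-event count matches the reference."""
    if not ref:
        return 0.0
    hits = sum(1 for r, p in zip(ref, pred) if sum(r) == sum(p))
    return hits / len(ref)


# Toy example (assumed data): 4 frames, 2 event classes.
ref = [[1, 0], [1, 0], [0, 0], [0, 1]]
pred = [[1, 0], [0, 0], [0, 0], [0, 1]]
print(frame_f1(ref, pred))      # 2 TP, 0 FP, 1 FN
print(frame_recall(ref, pred))  # 3 of 4 frames have matching counts
```

In this toy case the missed event in the second frame lowers both metrics: F1 penalizes the false negative, and frame recall penalizes the mismatch in the number of active events.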
License
Copyright (c) 2021 International Journal of Electronics and Telecommunications

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.