Analysis of machine learning algorithms for violence detection in audio

doi:10.1007/978-3-031-18697-4_17

Please use this identifier to cite or link to this record: https://hdl.handle.net/1822/86310

Full record

DC Field	Value	Language
dc.contributor.author	Veloso, Bruno	por
dc.contributor.author	Durães, Dalila	por
dc.contributor.author	Novais, Paulo	por
dc.date.accessioned	2023-09-06T11:00:37Z	-
dc.date.issued	2022	-
dc.identifier.citation	Veloso, B., Durães, D., Novais, P. (2022). Analysis of Machine Learning Algorithms for Violence Detection in Audio. In: González-Briones, A., et al. Highlights in Practical Applications of Agents, Multi-Agent Systems, and Complex Systems Simulation. The PAAMS Collection. PAAMS 2022. Communications in Computer and Information Science, vol 1678. Springer, Cham. https://doi.org/10.1007/978-3-031-18697-4_17	por
dc.identifier.isbn	978-3-031-18696-7	-
dc.identifier.issn	1865-0929	-
dc.identifier.uri	https://hdl.handle.net/1822/86310	-
dc.description.abstract	Violence has always been part of humanity. There are different types of violence, with physical violence being the most recurrent in our daily lives. This type of violence increasingly affects many people's lives, so it is essential to try to combat it. In recent years, human action recognition has been extensively studied, but mainly in video, an important computer vision area. Audio appears as a factor capable of circumventing the limitations of video. Audio sensors can be omnidirectional and require less processing power and less hardware and software performance than video. Audio can convey emotions, is not affected by lighting or temperature problems, and does not need to be captured at a favourable angle to obtain the intended information. Audio is therefore seen as the best way to recognize violence when combined with Machine Learning, Deep Learning, and Transfer Learning techniques. In this paper we test a Convolutional Neural Network (CNN), ResNet50, VGG16, and VGG19 for classifying audio clips. The CNN obtains the best results, with 92.44% accuracy on the test set. ResNet50 was the worst-performing model, with 86.34% accuracy. Both VGG models show good potential but did not outperform the CNN.	por
dc.description.sponsorship	This work is supported by: FCT Fundação para a Ciência e Tecnologia within the RD Units Project Scope: UIDB/00319/2020.	por
dc.language.iso	eng	por
dc.publisher	Springer, Cham	por
dc.relation	info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDB%2F00319%2F2020/PT	por
dc.rights	restrictedAccess	por
dc.subject	Audio action recognition	por
dc.subject	Audio violence detection	por
dc.subject	Deep learning	por
dc.subject	Transfer learning	por
dc.title	Analysis of machine learning algorithms for violence detection in audio	por
dc.type	conferencePaper	por
dc.peerreviewed	yes	por
dc.relation.publisherversion	https://link.springer.com/chapter/10.1007/978-3-031-18697-4_17	por
oaire.citationStartPage	210	por
oaire.citationEndPage	221	por
oaire.citationVolume	1678 CCIS	por
dc.date.updated	2023-08-01T00:05:47Z	-
dc.identifier.doi	10.1007/978-3-031-18697-4_17	por
dc.date.embargo	10000-01-01	-
dc.identifier.eisbn	978-3-031-18697-4	-
sdum.export.identifier	12674	-
sdum.journal	Communications in Computer and Information Science	por
oaire.version	AM	por
Appears in collections:	CAlg - Artigos em livros de atas/Papers in proceedings

Files in this record:

File	Description	Size	Format
PAAMS22_paper_8476.pdf (restricted access)		2.79 MB	Adobe PDF
