
Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12008/46271
Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Zinemanas, Pablo | - |
| dc.contributor.author | Cancela, Pablo | - |
| dc.contributor.author | Rocamora, Martín | - |
| dc.date.accessioned | 2024-10-11T15:13:04Z | - |
| dc.date.available | 2024-10-11T15:13:04Z | - |
| dc.date.issued | 2019 | - |
| dc.identifier.citation | Zinemanas, P., Cancela, P. and Rocamora, M. End–to–end convolutional neural networks for sound event detection in urban environments [online]. In: Proceedings of the 24th Conference of Open Innovations Association FRUCT, 2nd IEEE FRUCT International Workshop on Semantic Audio and the Internet of Things, Moscow, Russia, 8-12 Apr. 2019, pp. 533-539. | es |
| dc.identifier.issn | 2305-7254 | - |
| dc.identifier.uri | https://www.fruct.org/publications/volume-24/fruct24/ | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.12008/46271 | - |
| dc.description | FRUCT Proceedings, vol. 24, no. 1. | es |
| dc.description.abstract | We present a novel approach to tackle the problem of sound event detection (SED) in urban environments using end-to-end convolutional neural networks (CNN). It consists of a 1D CNN for extracting the energy on mel-frequency bands from the audio signal based on a simple filter bank, followed by a 2D CNN for the classification task. The main goal of this two-stage architecture is to bring more interpretability to the first layers of the network and to permit their reutilization in other problems of the same domain. We present a novel model to calculate the mel-spectrogram using a neural network that outperforms an existing work, both in its simplicity and its matching performance. Also, we implement a recently proposed approach to normalize the energy of the mel-spectrogram (per channel energy normalization, PCEN) as a layer of the neural network. We show how the parameters of this normalization can be learned by the network and why this is useful for SED in urban environments. We study how the training modifies the filter bank as well as the PCEN normalization parameters. The obtained system achieves classification results that are comparable to the state-of-the-art, while decreasing the number of parameters involved. | es |
| dc.format.extent | 7 p. | es |
| dc.format.mimetype | application/pdf | es |
| dc.language.iso | en | es |
| dc.publisher | Open Innovations Association FRUCT | es |
| dc.relation.ispartof | Proceedings of the 24th Conference of Open Innovations Association FRUCT, 2nd IEEE FRUCT International Workshop on Semantic Audio and the Internet of Things, Moscow, Russia, 8-12 Apr. 2019, pp. 533-539. | es |
| dc.rights | Works deposited in the Repository are governed by the Intellectual Property Rights Ordinance of the Universidad de la República (Res. No. 91 of the C.D.C., 8/III/1994; D.O. 7/IV/1994) and by the Open Repository Ordinance of the Universidad de la República (Res. No. 16 of the C.D.C., 07/10/2014). | es |
| dc.subject | Sound Event Detection (SED) | es |
| dc.subject | Convolutional Neural Networks (CNN) | es |
| dc.subject | Per Channel Energy Normalization (PCEN) | es |
| dc.title | End–to–end convolutional neural networks for sound event detection in urban environments. | es |
| dc.type | Conference paper | es |
| dc.contributor.filiacion | Zinemanas Pablo, Universidad de la República (Uruguay). Facultad de Ingeniería. | - |
| dc.contributor.filiacion | Cancela Pablo, Universidad de la República (Uruguay). Facultad de Ingeniería. | - |
| dc.contributor.filiacion | Rocamora Martín, Universidad de la República (Uruguay). Facultad de Ingeniería. | - |
| dc.rights.licence | Creative Commons Attribution - NonCommercial - NoDerivatives License (CC BY-NC-ND 4.0) | es |
| udelar.academic.department | Signal Processing | es |
| udelar.investigation.group | Audio Processing (GPA) | es |
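
The abstract in the record above mentions per channel energy normalization (PCEN) applied to mel-band energies. As a rough illustration only, the sketch below applies the PCEN formula as it is usually stated in the general PCEN literature: a per-band smoothed energy M is computed with a first-order recursive filter, and the output is (E / (eps + M)^alpha + delta)^r - delta^r. It is not the authors' implementation: the paper describes PCEN as a trainable network layer, whereas this sketch uses fixed parameters in plain NumPy. The function name `pcen`, the default parameter values, and the random input are all assumptions made for the example.

```python
import numpy as np


def pcen(energy, alpha=0.98, delta=2.0, r=0.5, s=0.025, eps=1e-6):
    """Apply per channel energy normalization to a (frames, bands) array.

    The default values are commonly cited ones and are assumptions here,
    not values reported in this record.
    """
    energy = np.asarray(energy, dtype=np.float64)
    smoothed = np.empty_like(energy)
    # First-order recursive low-pass filter along time, one per band:
    # M[t] = (1 - s) * M[t - 1] + s * E[t], initialised with M[0] = E[0].
    smoothed[0] = energy[0]
    for t in range(1, energy.shape[0]):
        smoothed[t] = (1.0 - s) * smoothed[t - 1] + s * energy[t]
    # Adaptive gain control, then root compression with offset delta.
    gain_controlled = energy / (eps + smoothed) ** alpha
    return (gain_controlled + delta) ** r - delta ** r


# Hypothetical input: 100 frames x 40 mel bands of non-negative energies.
rng = np.random.default_rng(0)
mel_energy = rng.random((100, 40))
normalized = pcen(mel_energy)
print(normalized.shape)  # (100, 40)
```

In a trainable variant, `alpha`, `delta`, `r` and `s` would be network parameters (possibly one value per band) updated by backpropagation, which is the setting the abstract refers to.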
Appears in collections: Academic and scientific publications - IMERL (Instituto de Matemática y Estadística Rafael Laguardia)
Academic and scientific publications - Instituto de Ingeniería Eléctrica

Files in this item:
| File | Description | Size | Format |
|---|---|---|---|
| ZCR19.pdf | Published version | 528.39 kB | Adobe PDF |


This item is subject to a Creative Commons license.