Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12008/29961
Full metadata record
DC Field | Value | Language
dc.contributor.author | Zinemanas, Pablo | -
dc.contributor.author | Rocamora, Martín | -
dc.contributor.author | Fonseca, Eduardo | -
dc.contributor.author | Font, Frederic | -
dc.contributor.author | Serra, Xavier | -
dc.date.accessioned | 2021-10-25T17:05:00Z | -
dc.date.available | 2021-10-25T17:05:00Z | -
dc.date.issued | 2021 | -
dc.identifier.citation | Zinemanas, P., Rocamora, M., Fonseca, E. et al. Toward interpretable polyphonic sound event detection with attention maps based on local prototypes [online]. In: 6th Workshop on Detection and Classification of Acoustic Scenes and Events, DCASE 2021, Barcelona, Spain, 15-19 Nov. 2021, pp. 50-54. | en
dc.identifier.uri | http://dcase.community/workshop2021/proceedings | -
dc.identifier.uri | http://dcase.community/workshop2021/ | -
dc.identifier.uri | http://dcase.community/documents/workshop2021/proceedings/DCASE2021Workshop_Zinemanas_22.pdf | -
dc.identifier.uri | https://hdl.handle.net/20.500.12008/29961 | -
dc.description.abstract | Understanding the reasons behind the predictions of deep neural networks is a pressing concern, as it can be critical in several application scenarios. In this work, we present a novel interpretable model for polyphonic sound event detection. It tackles one of the limitations of our previous work, namely the difficulty of dealing properly with a multi-label setting. The proposed architecture incorporates a prototype layer and an attention mechanism. The network learns a set of local prototypes in the latent space, each representing a patch in the input representation. It also learns attention maps for positioning the local prototypes and reconstructing the latent space. The predictions are then based solely on the attention maps, so the explanations provided are the attention maps and the corresponding local prototypes. Moreover, the prototypes can be reconstructed to the audio domain for inspection. The results obtained in urban sound event detection are comparable to those of two opaque baselines, with fewer parameters and with the added benefit of interpretability. | en
dc.format.extent | 5 p. | es
dc.format.mimetype | application/pdf | es
dc.language.iso | en | es
dc.publisher | Universitat Pompeu Fabra | en
dc.relation.ispartof | 6th Workshop on Detection and Classification of Acoustic Scenes and Events, DCASE 2021, Barcelona, Spain, 15-19 Nov. 2021, pp. 50-54. | es
dc.rights | Works deposited in the Repository are governed by the Ordinance on Intellectual Property Rights of the Universidad de la República (Res. No. 91 of the C.D.C. of 8/III/1994; D.O. 7/IV/1994) and by the Ordinance of the Open Repository of the Universidad de la República (Res. No. 16 of the C.D.C. of 07/10/2014). | es
dc.subject | Interpretability | en
dc.subject | Sound event detection | en
dc.subject | Prototypes | en
dc.title | Toward interpretable polyphonic sound event detection with attention maps based on local prototypes | en
dc.type | Conference paper | es
dc.contributor.filiacion | Zinemanas Pablo, Universitat Pompeu Fabra, Barcelona, Spain | -
dc.contributor.filiacion | Rocamora Martín, Universidad de la República (Uruguay). Facultad de Ingeniería. | -
dc.contributor.filiacion | Fonseca Eduardo, Universitat Pompeu Fabra, Barcelona, Spain | -
dc.contributor.filiacion | Font Frederic, Universitat Pompeu Fabra, Barcelona, Spain | -
dc.contributor.filiacion | Serra Xavier, Universitat Pompeu Fabra, Barcelona, Spain | -
dc.rights.licence | Creative Commons Attribution - NonCommercial - NoDerivatives License (CC BY-NC-ND 4.0) | es
udelar.academic.department | Procesamiento de Señales | -
udelar.investigation.group | Procesamiento de Audio | -
Appears in Collections: Publicaciones académicas y científicas - Instituto de Ingeniería Eléctrica

Files in This Item:
File | Description | Size | Format
ZRFFS21.pdf | Published version | 723.73 kB | Adobe PDF


This item is licensed under a Creative Commons License.