
Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12008/34000
Full metadata record
DC field / value pairs:
dc.contributor.advisor: Moncecchi, Guillermo
dc.contributor.advisor: Wonsever, Dina
dc.contributor.author: Garreta, Raúl
dc.date.accessioned: 2022-09-28T14:48:48Z
dc.date.available: 2022-09-28T14:48:48Z
dc.date.issued: 2020
dc.identifier.citation: Garreta, R. Data efficient deep learning models for text classification [online]. Master's thesis. Montevideo: Udelar. FI. INCO: PEDECIBA. Área Informática, 2020.
dc.identifier.issn: 1688-2792
dc.identifier.uri: https://hdl.handle.net/20.500.12008/34000
dc.description.abstract: Text classification is one of the most important techniques within natural language processing. Applications range from topic detection and intent identification to sentiment analysis. Text classification is usually formulated as a supervised learning problem, where a labeled training set is fed into a machine learning algorithm. In practice, training supervised machine learning algorithms such as those used in deep learning requires large training sets, which involve a considerable amount of human labor to manually tag the data. This constitutes a bottleneck in applied supervised learning, so supervised learning models that require smaller amounts of tagged data are desirable. In this work, we research and compare supervised learning models for text classification that are data efficient, that is, that require small amounts of tagged data to achieve state-of-the-art performance levels. In particular, we study transfer learning techniques that reuse previous knowledge to train supervised learning models. For the purpose of comparison, we focus on opinion polarity classification, a subproblem within sentiment analysis that assigns a polarity (positive or negative) to an opinion depending on the mood of the opinion holder. Multiple deep learning models that learn text representations, including BERT, InferSent, the Universal Sentence Encoder, and the Sentiment Neuron, are compared on six datasets from different domains. Results show that transfer learning dramatically improves data efficiency, obtaining double-digit improvements in F1 score with just under 100 supervised training examples.
dc.format.extent: 108 p.
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.publisher: Udelar. FI.
dc.rights: Works deposited in the Repository are governed by the Ordinance on Intellectual Property Rights of the Universidad de la República (Res. No. 91 of the C.D.C., 8/III/1994; D.O. 7/IV/1994) and by the Ordinance of the Open Repository of the Universidad de la República (Res. No. 16 of the C.D.C., 07/10/2014)
dc.subject: Text classification
dc.subject: Natural language processing
dc.subject: Sentiment analysis
dc.subject: Deep learning
dc.subject: Transfer learning
dc.title: Data efficient deep learning models for text classification
dc.type: Master's thesis
dc.contributor.filiacion: Garreta Raúl, Universidad de la República (Uruguay). Facultad de Ingeniería.
thesis.degree.grantor: Universidad de la República (Uruguay). Facultad de Ingeniería
thesis.degree.name: Magíster en Informática
dc.rights.licence: Creative Commons Attribution - NonCommercial - NoDerivatives license (CC BY-NC-ND 4.0)
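The abstract frames text classification as a supervised learning problem: a labeled training set is fed to a learning algorithm. As a minimal sketch of that formulation only (a TF-IDF plus logistic-regression baseline with invented toy data, not the thesis's transfer-learning models such as BERT or InferSent):

```python
# Minimal illustration of the supervised text-classification setup the
# abstract describes. The training sentences and labels below are
# hypothetical; any real experiment would use far more labeled data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny hand-labeled opinion-polarity training set (positive / negative).
texts = [
    "great movie, loved every minute",
    "wonderful acting and a fun story",
    "terrible plot, utterly boring",
    "awful film, hated the ending",
]
labels = ["pos", "pos", "neg", "neg"]

# Vectorize the texts and fit a linear classifier in one pipeline.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)

# Predict the polarity of an unseen opinion.
print(clf.predict(["a wonderful, fun movie"]))
```

The thesis's data-efficient approach replaces the TF-IDF features with representations from pretrained models, so far fewer labeled examples are needed to reach comparable accuracy.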
Appears in collections: Postgraduate theses - Instituto de Computación

Files in this item:
File        Description       Size     Format
Gar20.pdf   Master's thesis   5.27 MB  Adobe PDF


This item is licensed under a Creative Commons license.