
Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12008/51189
Title: Teacher Student Curriculum Learning applied to Optical Character Recognition: An analysis based on a case study
Author: Laguna Queirolo, Rodrigo Jorgeluis
Advisor: Moncecchi, Guillermo
Type: Master's thesis
Keywords: Curriculum Learning, Teacher Student Curriculum Learning, Optical Character Recognition, Comparative Analysis
Publication date: 2025
Abstract: This thesis explores the application of Teacher Student Curriculum Learning (TSCL), a Reinforcement Learning (RL) based Curriculum Learning (CL) method, to the task of Optical Character Recognition (OCR) on a dataset from the LUISA project. The LUISA project aims to develop tools for extracting information from digital images of historical documents stored in the Archivo Berruti, a collection of documents generated by the Uruguayan Armed Forces during the last dictatorship (1968-1985). The proposed approach uses a seq2seq model as the Student in the TSCL framework. The model was trained on the same data as a previously developed model (Chavat Pérez, 2022), with changes only to the training method, which allows a fair evaluation of the benefits of TSCL. The thesis also presents a thorough theoretical review of CL, with a special focus on RL-based methods, including TSCL, and compares TSCL's results with those of traditional methods. Although TSCL yields only minimal improvements in OCR performance, the work contributes to the understanding of how TSCL functions and provides a stepping stone for future implementations in supervised tasks beyond OCR. To compare TSCL against strong benchmarks, the study also enhances Chavat's work by proposing improvements to model training with image augmentation techniques and beam search, surpassing the previously reported Character Error Rate (CER) by over 16%. The code developed for this work is publicly available. Based on available information, this appears to be the first attempt to apply CL techniques, specifically TSCL, to an OCR task.
Publisher: Udelar. FI.
Citation: Laguna Queirolo, R. Teacher Student Curriculum Learning applied to Optical Character Recognition: An analysis based on a case study [online]. Master's thesis. Montevideo: Udelar. FI. INCO: PEDECIBA. Área Informática, 2025.
ISSN: 1688-2792
Degree obtained: Magíster en Informática
Faculty or service granting the degree: Universidad de la República (Uruguay). Facultad de Ingeniería
License: Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND 4.0)
Appears in collections: Tesis de posgrado - Instituto de Computación

Files in this item:
File         Description        Size       Format
Lag25.pdf    Master's thesis    7.17 MB    Adobe PDF


This item is subject to a Creative Commons license.