Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.12008/51189
Title: | Teacher Student Curriculum Learning applied to Optical Character Recognition : An analysis based on a case study. |
Author: | Laguna Queirolo, Rodrigo Jorgeluis |
Advisor: | Moncecchi, Guillermo |
Type: | Master's thesis |
Keywords: | Curriculum Learning, Teacher Student Curriculum Learning, Optical Character Recognition, Comparative Analysis |
Publication date: | 2025 |
Abstract: | This thesis explores the application of Teacher Student Curriculum Learning
(TSCL), a Reinforcement Learning (RL) based Curriculum Learning (CL)
method, to the task of Optical Character Recognition (OCR) in a dataset
from the LUISA project. The aim of the LUISA project is to develop tools
for extracting information from digital images of historical documents stored
in the Archivo Berruti, a collection of documents generated by the Uruguayan
Armed Forces during the last dictatorship in the period 1968-1985.
The proposed approach uses a seq2seq model as the Student in the TSCL
framework, which was trained on the same data as a previously developed
model (Chavat Pérez, 2022), but with modifications to the training method.
This allows for a fair evaluation of the benefits of TSCL. This work contributes
to a better understanding of TSCL and its potential application to OCR.
Moreover, it presents a thorough theoretical review of CL, with a special
focus on RL-based methods, including TSCL, and compares its results with
traditional methods.
While TSCL results show only minimal improvements in OCR performance,
this work contributes to the understanding of TSCL’s functioning and
provides a stepping stone for future implementations in supervised tasks beyond
OCR. In an effort to compare TSCL against strong baselines, the
study also extends Chavat’s work by adding image augmentation techniques
to model training and using beam search, improving on the previously
reported Character Error Rate (CER) by over 16%.
The code developed for this work is publicly available. Based on available
information, this appears to be the first attempt to apply CL techniques,
specifically TSCL, to an OCR task. |
Publisher: | Udelar. FI. |
Citation: | Laguna Queirolo, R. Teacher Student Curriculum Learning applied to Optical Character Recognition : An analysis based on a case study [online] Master's thesis. Montevideo : Udelar. FI. INCO : PEDECIBA. Área Informática, 2025. |
ISSN: | 1688-2792 |
Degree obtained: | Magíster en Informática |
Faculty or service granting the degree: | Universidad de la República (Uruguay). Facultad de Ingeniería |
License: | Creative Commons Attribution - NonCommercial - NoDerivatives license (CC BY-NC-ND 4.0) |
Appears in collections: | Tesis de posgrado - Instituto de Computación |
Files in this item:
File | Description | Size | Format
---|---|---|---
Lag25.pdf | Master's thesis | 7.17 MB | Adobe PDF
This item is licensed under a Creative Commons license.