english Icono del idioma   español Icono del idioma  

Por favor, use este identificador para citar o enlazar este ítem: https://hdl.handle.net/20.500.12008/29755 Cómo citar
Título: LETEO: Scalable anonymization of big data and its application to learning analytics
Autor: Giménez, Eduardo
Etcheverry, Lorena
Olmedo, Federico
Buil Aranda, Carlos
Toro, Matías
Pastorini, Marcos
Tipo: Reporte técnico
Palabras clave: Anonymization, Big data, Learning analytics
Cobertura geográfica: Uruguay.
Fecha de publicación: 2021
Resumen: Created in 2007, Plan Ceibal is an inclusion and equal opportunities plan with the aim of supporting Uruguayan educational policies with technology. Throughout these years, and within the framework of its tasks, Ceibal has an important amount of data related to the use of technology in education, necessary to manage the plan and fulfill the assigned legal tasks. However, the data does not they can be studied without accounting for the problem of de identifying the users of the Plan. To exploit this data, Ceibal has deployed an instance of the Hortonworks Data Platform (HDP), a open source platform for the storage and parallel processing of massive data (big data). HDP offers a wide range of functional components ranging from large file storage (HDFS) to distributed programming of machine learning algorithms (Apache Spark / MLlib). However, as of today there are no solutions for the de-identification of personal code data open and integrated into the Hortonworks ecosystem. On the one hand, the deidentification tools existing data have not been designed so that they can easily scale to large volumes of data, and they also do not offer easy integration mechanisms with HDFS. This forces you to export the data outside of the platform that stores them to be able to anonymize them, with the consequent risk of exposure of confidential information. On the other hand, the few integrated solutions in the Hortonworks ecosystem are owners and the cost of their licenses is very significant. The objective of this project is to promote the use of the enormous amount of educational and technological data that Ceibal possesses, lifting one of the greatest obstacles that exist for that, namely, the preservation of privacy and the protection of the personal data of the beneficiaries of the Plan. To this end, this project seeks to generate anonymization tools that extend the HDP platform. On In particular, it seeks to develop open source modules to integrate into said platform, which implement a set of programmed anonymization techniques and algorithms in a distributed manner using Apache Spark and that can be applied to data sets stored in HDFS files.
Descripción: ANII Fondo sectorial de investigación con datos - 2018
Editorial: Udelar. FI.
Citación: Giménez, E., Etcheverry, L., Olmedo, F. y otros. LETEO: Scalable anonymization of big data and its application to learning analytics [en línea]. Montevideo : Udelar. FI.,2021.
Licencia: Licencia Creative Commons Atribución - No Comercial - Sin Derivadas (CC - By-NC-ND 4.0)
Aparece en las colecciones: Reportes Técnicos - Instituto de Computación

Ficheros en este ítem:
Fichero Descripción Tamaño Formato   
GEOBTP21.pdf785,02 kBAdobe PDFVisualizar/Abrir


Este ítem está sujeto a una licencia Creative Commons Licencia Creative Commons Creative Commons