
Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12008/54653
Full metadata record
DC Field | Value | Language
dc.contributor.author | Sastre, Ignacio | -
dc.contributor.author | Rosá, Aiala | -
dc.date.accessioned | 2026-04-28T17:42:07Z | -
dc.date.available | 2026-04-28T17:42:07Z | -
dc.date.issued | 2026 | -
dc.identifier.citation | Sastre, I. and Rosá, A. Concept Tokens: Learning behavioral embeddings through concept definitions [Preprint]. Published in: Computer Science (Computation and Language), arXiv:2601.04465, Jan 2026. DOI: https://doi.org/10.48550/arXiv.2601.04465 | es
dc.identifier.uri | https://hdl.handle.net/20.500.12008/54653 | -
dc.description | Accepted for publication in: 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), San Diego, California, July 2-7, 2026. | es
dc.description.abstract | We propose Concept Tokens, a lightweight method that adds a new special token to a pretrained LLM and learns only its embedding from multiple natural language definitions of a target concept, where occurrences of the concept are replaced by the new token. The LLM is kept frozen and the embedding is optimized with the standard language-modeling objective. We evaluate Concept Tokens in three settings. First, we study hallucinations in closed-book question answering on HotpotQA and find a directional effect: negating the hallucination token reduces hallucinated answers mainly by increasing abstentions, whereas asserting it increases hallucinations and lowers precision. Second, we induce recasting, a pedagogical feedback strategy for second language teaching, and observe the same directional effect. Moreover, compared to providing the full definitional corpus in-context, concept tokens better preserve compliance with other instructions (e.g., asking follow-up questions). Finally, we include a qualitative study with the Eiffel Tower and a fictional "Austral Tower" to illustrate what information the learned embeddings capture and where their limitations emerge. Overall, Concept Tokens provide a compact control signal learned from definitions that can steer behavior in frozen LLMs. | es
dc.description.sponsorship | ANII Master's scholarship POS_FMV_2023_1_1012622. | es
dc.format.extent | 18 p. | es
dc.format.mimetype | application/pdf | es
dc.language.iso | en | es
dc.relation.ispartof | Computer Science (Computation and Language), arXiv:2601.04465, Jan 2026. | es
dc.rights | Works deposited in the Repository are governed by the Ordinance on Intellectual Property Rights of the Universidad de la República (Res. No. 91 of C.D.C. of 8/III/1994 – D.O. 7/IV/1994) and by the Ordinance of the Open Repository of the Universidad de la República (Res. No. 16 of C.D.C. of 07/10/2014) | es
dc.title | Concept Tokens: Learning behavioral embeddings through concept definitions | es
dc.type | Preprint | es
dc.contributor.filiacion | Sastre Ignacio, Universidad de la República (Uruguay). Facultad de Ingeniería. | -
dc.contributor.filiacion | Rosá Aiala, Universidad de la República (Uruguay). Facultad de Ingeniería. | -
dc.rights.licence | Creative Commons Attribution - NonCommercial - NoDerivatives license (CC BY-NC-ND 4.0) | es
Appears in collections: Publicaciones académicas y científicas - Instituto de Computación
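The core idea in the abstract (keep the model frozen, add one new token, and train only that token's embedding row with the standard language-modeling loss on definitions in which the concept word is replaced by the token) can be sketched on a toy model. This is a minimal numpy illustration, not the authors' implementation: the tiny bigram model, vocabulary size, learning rate, and "definition" sequences are all invented for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: tokens 0..V-1 are "pretrained"; token V is the new concept token.
V, d = 8, 16
E = rng.normal(0, 0.1, (V + 1, d))   # embedding table; only row V gets trained
W = rng.normal(0, 0.1, (d, V + 1))   # frozen output projection

# "Definitions" as token sequences where the concept is replaced by token V.
defs = [[1, V, 2, 3], [4, V, 2, 5], [V, 2, 3, 6]]

def loss_and_grad(e_new):
    """Mean next-token cross-entropy; gradient only w.r.t. the new embedding."""
    E[V] = e_new
    total, grad, n = 0.0, np.zeros(d), 0
    for seq in defs:
        for cur, nxt in zip(seq, seq[1:]):
            logits = E[cur] @ W
            p = np.exp(logits - logits.max())
            p /= p.sum()
            total += -np.log(p[nxt])
            n += 1
            if cur == V:  # only positions holding the new token update its row
                err = p.copy()
                err[nxt] -= 1.0
                grad += W @ err
    return total / n, grad / n

# Plain gradient descent on the single embedding row; everything else frozen.
e_new = rng.normal(0, 0.1, d)
loss0, _ = loss_and_grad(e_new)
for _ in range(200):
    _, g = loss_and_grad(e_new)
    e_new -= 0.5 * g
loss1, _ = loss_and_grad(e_new)
print(loss0, loss1)
```

Because the loss is convex in the single embedding row for a fixed output projection, the loss on the definition corpus decreases monotonically; in the paper's setting the same one-row optimization runs through a full frozen transformer instead of this bigram stand-in.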

Files in this item:
File | Description | Size | Format
SR26.pdf | Preprint | 373.81 kB | Adobe PDF


This item is licensed under a Creative Commons license.