Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.12008/54653
How to cite
Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Sastre, Ignacio | - |
| dc.contributor.author | Rosá, Aiala | - |
| dc.date.accessioned | 2026-04-28T17:42:07Z | - |
| dc.date.available | 2026-04-28T17:42:07Z | - |
| dc.date.issued | 2026 | - |
| dc.identifier.citation | Sastre, I. and Rosá, A. Concept Tokens: Learning behavioral embeddings through concept definitions [Preprint]. Published in: Computer Science (Computation and Language), arXiv:2601.04465, Jan 2026. DOI: https://doi.org/10.48550/arXiv.2601.04465. | es |
| dc.identifier.uri | https://hdl.handle.net/20.500.12008/54653 | - |
| dc.description | Accepted for publication in: 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), San Diego, California, July 2-7, 2026. | es |
| dc.description.abstract | We propose Concept Tokens, a lightweight method that adds a new special token to a pretrained LLM and learns only its embedding from multiple natural language definitions of a target concept, where occurrences of the concept are replaced by the new token. The LLM is kept frozen and the embedding is optimized with the standard language-modeling objective. We evaluate Concept Tokens in three settings. First, we study hallucinations in closed-book question answering on HotpotQA and find a directional effect: negating the hallucination token reduces hallucinated answers mainly by increasing abstentions, whereas asserting it increases hallucinations and lowers precision. Second, we induce recasting, a pedagogical feedback strategy for second language teaching, and observe the same directional effect. Moreover, compared to providing the full definitional corpus in-context, concept tokens better preserve compliance with other instructions (e.g., asking follow-up questions). Finally, we include a qualitative study with the Eiffel Tower and a fictional "Austral Tower" to illustrate what information the learned embeddings capture and where their limitations emerge. Overall, Concept Tokens provide a compact control signal learned from definitions that can steer behavior in frozen LLMs. | es |
| dc.description.sponsorship | ANII Master's scholarship POS_FMV_2023_1_1012622. | es |
| dc.format.extent | 18 p. | es |
| dc.format.mimetype | application/pdf | es |
| dc.language.iso | en | es |
| dc.relation.ispartof | Computer Science (Computation and Language), arXiv:2601.04465, Jan 2026. | es |
| dc.rights | Works deposited in the Repository are governed by the Intellectual Property Rights Ordinance of the Universidad de la República (Res. No. 91 of the C.D.C., 8/III/1994 – D.O. 7/IV/1994) and by the Open Repository Ordinance of the Universidad de la República (Res. No. 16 of the C.D.C., 07/10/2014). | es |
| dc.title | Concept Tokens: Learning behavioral embeddings through concept definitions | es |
| dc.type | Preprint | es |
| dc.contributor.filiacion | Sastre Ignacio, Universidad de la República (Uruguay). Facultad de Ingeniería. | - |
| dc.contributor.filiacion | Rosá Aiala, Universidad de la República (Uruguay). Facultad de Ingeniería. | - |
| dc.rights.licence | Creative Commons Attribution-NonCommercial-NoDerivatives license (CC BY-NC-ND 4.0) | es |
| Appears in collections: | Publicaciones académicas y científicas - Instituto de Computación | |
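The abstract describes learning only the new token's embedding with the standard language-modeling loss while the pretrained model stays frozen. A minimal PyTorch sketch of that setup is below; the toy recurrent model, vocabulary sizes, and training data are illustrative assumptions standing in for a real LLM and the definitional corpus, not the paper's code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

VOCAB = 10          # original vocabulary size (toy)
CONCEPT_ID = VOCAB  # id of the newly appended concept token
DIM = 16

class ToyLM(nn.Module):
    """Stand-in for a pretrained LM: embedding -> GRU -> next-token logits."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB + 1, DIM)  # +1 row for the concept token
        self.rnn = nn.GRU(DIM, DIM, batch_first=True)
        self.head = nn.Linear(DIM, VOCAB + 1)

    def forward(self, ids):
        h, _ = self.rnn(self.emb(ids))
        return self.head(h)

model = ToyLM()

# Freeze every parameter of the "pretrained" model...
for p in model.parameters():
    p.requires_grad = False
# ...then re-enable the embedding matrix and mask its gradient so that
# only the concept token's row is ever updated.
model.emb.weight.requires_grad = True
mask = torch.zeros_like(model.emb.weight)
mask[CONCEPT_ID] = 1.0
model.emb.weight.register_hook(lambda g: g * mask)

# Toy "definitions": token sequences in which each occurrence of the
# target concept has been replaced by CONCEPT_ID.
defs = torch.tensor([[1, CONCEPT_ID, 2, 3, 4],
                     [5, CONCEPT_ID, 6, 7, 8]])

before = model.emb.weight.detach().clone()
opt = torch.optim.SGD([model.emb.weight], lr=0.1)
loss_fn = nn.CrossEntropyLoss()
for _ in range(20):
    opt.zero_grad()
    logits = model(defs[:, :-1])  # standard next-token prediction
    loss = loss_fn(logits.reshape(-1, VOCAB + 1), defs[:, 1:].reshape(-1))
    loss.backward()
    opt.step()

changed = (model.emb.weight.detach() - before).abs().sum(dim=1)
assert changed[CONCEPT_ID] > 0      # the concept row moved
assert changed[:VOCAB].max() == 0   # every original row is untouched
```

The gradient mask means that although the whole embedding matrix formally requires gradients, only the new token's row ever changes, matching the abstract's claim that the LLM is kept frozen and a single embedding is optimized.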
Files in this item:
| File | Description | Size | Format |
|---|---|---|---|
| SR26.pdf | Preprint | 373.81 kB | Adobe PDF |
This item is licensed under a Creative Commons license.