Something old, something new, something borrowed : Evaluation of different neural network architectures for genomic prediction

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12008/36884

Title:	Something old, something new, something borrowed : Evaluation of different neural network architectures for genomic prediction
Authors:	Fariello, Maria Inés; Arboleya, Lucía; Belzarena, Diego; De Los Santos, Leonardo; Elenter, Juan; Etchebarne, Guillermo; Hounie, Ignacio; Ciappesoni, Gabriel; Navajas, Elly; Lecumberry, Federico
Type:	Poster
Keywords:	Genomic prediction, Deep learning, Signal processing
Publication date:	2023
Abstract:	Genome-enabled prediction of complex traits aims to predict a measurable characteristic of an organism from its genetic information. In the present work we address diverse traits and organisms, including yeast growth, wheat yield, Jersey bull fertility, and various milk-related traits in Holstein cattle. We benchmark several popular machine learning models: Bayesian and penalized linear regressions, kernel methods, and decision tree ensembles. Through exhaustive hyperparameter tuning we outperform state-of-the-art results on most datasets. We also evaluate two encoding techniques for the input data and perform ablation studies to assess robustness to the elimination of genetic markers (i.e., input features). We also explore different deep learning architectures for this task. We propose and evaluate Convolutional Neural Network (CNN) architectures, showing that residual connections improve performance but that in some cases Fully Connected Networks outperform CNNs. We link this to the fact that absolute positions are relevant in genomes, and thus the translational equivariance of CNNs may not be an adequate inductive bias for this problem. We evaluate Graph Neural Network (GNN) architectures by formulating trait prediction as a node regression problem on a population graph, where each node represents an individual and edges represent associations between their genetic information. We evaluate the transferability of these graph-based models and find that the extent to which they exploit neighborhood information is limited. By combining CNN and GNN architectures, we could outperform all other models for predicting milk yield in Holstein cattle. Neural-network-based methods can be computationally demanding when used on high-density chips or sequence data, even more so when fully connected layers are used.
To overcome this problem, we propose to obtain a new representation of the input vector by using the intermediate representation (code) of an Autoencoder (AE). We are currently evaluating performance benchmarks. Another common issue with these databases is missing data, or the combination of chips with different numbers of SNPs. Again, we propose using an AE to impute the missing values. One of the main focuses of this work was to explore the feasibility of employing modern deep learning architectures in genomic prediction. In this regard, it was possible to train highly over-parameterized architectures and still obtain good generalization. For some datasets and traits, these models outperform all others. However, this did not hold for all the models, traits and datasets studied. Moreover, whether the gains in performance outweigh the increase in model size (and thus in training and inference computational cost) and the lack of interpretability calls for further discussion.
Publisher:	Plant and Animal Genome Conference (PAG)
In:	Plant & Animal Genome Conference : PAG 30, San Diego, California, USA, 13-18 Jan. 2023.
Funders:	This work was partially funded by Universidad de la República and the ANII project FDA 1_2018_1_154364
Citation:	Fariello, M., Arboleya, L., Belzarena, D. et al. Something old, something new, something borrowed : Evaluation of different neural network architectures for genomic prediction. [online]. Poster, 2023.
License:	Creative Commons Attribution - NonCommercial - NoDerivatives license (CC BY-NC-ND 4.0)
Appears in collections:	Publicaciones académicas y científicas - Instituto de Ingeniería Eléctrica
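
Illustrative note: the abstract describes a population graph in which each node is an individual and edges reflect association between genetic information, but it does not say how edges are built. The sketch below is one possible construction under stated assumptions, not the authors' method: genotypes are assumed to be coded 0/1/2 per marker, similarity is assumed to be a scaled inner product of centered genotypes, and each individual is linked to its `k` most similar individuals. The function name `population_graph` and the parameter `k` are hypothetical.

```python
import numpy as np


def population_graph(genotypes: np.ndarray, k: int = 3) -> np.ndarray:
    """Build a symmetric 0/1 adjacency matrix over individuals.

    Assumption (not from the poster): each individual is linked to its
    k most similar individuals under a centered inner-product similarity.
    """
    n, m = genotypes.shape
    # Center each marker column so similarity reflects deviations from the mean.
    z = genotypes - genotypes.mean(axis=0, keepdims=True)
    # Pairwise similarity between individuals, scaled by the number of markers.
    sim = z @ z.T / m
    # Exclude self-similarity so an individual is never its own neighbor.
    np.fill_diagonal(sim, -np.inf)
    # Column indices of the k largest similarities in each row.
    neighbors = np.argsort(-sim, axis=1)[:, :k]
    adj = np.zeros((n, n), dtype=int)
    rows = np.repeat(np.arange(n), k)
    adj[rows, neighbors.ravel()] = 1
    # Symmetrize: keep an edge if either endpoint selected the other.
    return np.maximum(adj, adj.T)


# Toy data: 6 individuals, 20 markers coded 0/1/2 (synthetic, for illustration).
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(6, 20)).astype(float)
A = population_graph(X, k=2)
```

An adjacency matrix like `A` could then serve as the graph on which a GNN performs node regression, with each row of `X` as the node features and the trait value as the node target.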

Files in this item:

File	Description	Size	Format
FABDEEHCNL23.pdf	Final version	2.85 MB	Adobe PDF


This item is subject to a Creative Commons license.