Abstract
Machine learning methods were considered efficient in identifying single nucleotide polymorphisms (SNP) underlying a trait of interest. This study aimed to construct predictive models using machine learning algorithms to identify loci that best explain the variance in milk traits of dairy cattle. Further objectives involved validating the results by comparison with reported relevant regions and retrieving the pathways overrepresented by the genes flanking relevant SNPs.
dc.contributor.author | Raschia, Maria Agustina | |
dc.contributor.author | Ríos, Pablo Javier | |
dc.contributor.author | Maizon, Daniel Omar | |
dc.contributor.author | Demitrio, Daniel Arturo | |
dc.contributor.author | Poli, Mario Andres | |
dc.date.accessioned | 2022-05-26T17:34:45Z | |
dc.date.available | 2022-05-26T17:34:45Z | |
dc.date.issued | 2022 | |
dc.identifier.issn | 2215-0161 | |
dc.identifier.other | https://doi.org/10.1016/j.mex.2022.101733 | |
dc.identifier.uri | http://hdl.handle.net/20.500.12123/11954 | |
dc.identifier.uri | https://www.sciencedirect.com/science/article/pii/S2215016122001145 | |
dc.description.abstract | Machine learning methods were considered efficient in identifying single nucleotide polymorphisms (SNP) underlying a trait of interest. This study aimed to construct predictive models using machine learning algorithms to identify loci that best explain the variance in milk traits of dairy cattle. Further objectives involved validating the results by comparison with reported relevant regions and retrieving the pathways overrepresented by the genes flanking relevant SNPs. Regression models using XGBoost (XGB), LightGBM (LGB), and Random Forest (RF) algorithms were trained using estimated breeding values for milk production (EBVM), milk fat content (EBVF) and milk protein content (EBVP) as phenotypes and genotypes on 40,417 SNPs as predictor variables. To evaluate their efficiency, metrics for actual vs. predicted values were determined in validation folds (XGB and LGB) and out-of-bag data (RF). Fewer than 4,500 relevant SNPs were retrieved for each trait. Among the genes flanking them, signaling and transmembrane transporter activities were overrepresented. The trained models: • predicted breeding values for animals not included in the dataset; • were efficient in identifying a subset of SNPs explaining phenotypic variation. The results obtained using XGB and LGB algorithms agreed with previous results. Therefore, the proposed method could be applied in future association studies on milk traits. | eng |
dc.format | application/pdf | es_AR |
dc.language.iso | eng | es_AR |
dc.publisher | Elsevier | es_AR |
dc.relation | info:eu-repograntAgreement/INTA/2019-PE-E6-I145-001/2019-PE-E6-I145-001/AR./Mejora genética objetiva para aumentar la eficiencia de los sistemas de producción animal. | es_AR |
dc.relation | info:eu-repograntAgreement/INTA/2019-PT-E9-I180-001/2019-PT-E9-I180-001/AR./TICs y gestión de Big Data | es_AR |
dc.relation | info:eu-repograntAgreement/INTA/2019-PT-E6-I513-001/2019-PT-E6-I513-001/AR./Plataforma de mejoramiento animal | es_AR |
dc.rights | info:eu-repo/semantics/openAccess | es_AR |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | |
dc.source | MethodsX 9 : 101733 (2022) | es_AR |
dc.subject | Single Nucleotide Polymorphism | eng |
dc.subject | Polimorfismo de un Solo Nucleótidos | es_AR |
dc.subject | Dairy Cattle | eng |
dc.subject | Ganado de Leche | es_AR |
dc.subject | Milk Production | eng |
dc.subject | Producción Lechera | es_AR |
dc.subject | Milk Protein | eng |
dc.subject | Proteínas de la Leche | es_AR |
dc.subject | Bioinformatics | eng |
dc.subject | Bioinformática | es_AR |
dc.subject | Loci | eng |
dc.subject.other | Milk Fat Content | eng |
dc.subject.other | Contenido de Grasa Láctea | es_AR |
dc.subject.other | Machine Learning Algorithms | eng |
dc.subject.other | Algoritmos de Aprendizaje Automático | es_AR |
dc.title | Methodology for the identification of relevant loci for milk traits in dairy cattle, using machine learning algorithms | es_AR |
dc.type | info:ar-repo/semantics/artículo | es_AR |
dc.type | info:eu-repo/semantics/article | es_AR |
dc.type | info:eu-repo/semantics/publishedVersion | es_AR |
dc.rights.license | Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) | |
dc.description.origen | Instituto de Genética | es_AR |
dc.description.fil | Fil: Raschia, Maria Agustina. Instituto Nacional de Tecnología Agropecuaria (INTA). Instituto de Genética; Argentina | es_AR |
dc.description.fil | Fil: Ríos, Pablo J. Universidad de Buenos Aires; Argentina | es_AR |
dc.description.fil | Fil: Ríos, Pablo J. Universidad Nacional de La Plata. Facultad de Ciencias Exactas; Argentina | es_AR |
dc.description.fil | Fil: Maizon, Daniel Omar. Instituto Nacional de Tecnología Agropecuaria (INTA). Estación Experimental Agropecuaria Anguil; Argentina | es_AR |
dc.description.fil | Fil: Maizon, Daniel Omar. Universidad Nacional de La Pampa. Facultad de Agronomía; Argentina | es_AR |
dc.description.fil | Fil: Demitrio, Daniel Arturo. Instituto Nacional de Tecnología Agropecuaria (INTA). Dirección General de Sistemas de Información, Comunicación y Procesos. Gerencia de Informática y Gestión de la Información; Argentina | es_AR |
dc.description.fil | Fil: Demitrio, Daniel Arturo. Universidad Nacional de La Plata. Facultad de Ciencias Exactas; Argentina | es_AR |
dc.description.fil | Fil: Poli, Mario Andres. Instituto Nacional de Tecnología Agropecuaria (INTA). Instituto de Genética; Argentina | es_AR |
dc.description.fil | Fil: Poli, Mario Andres. Universidad del Salvador. Facultad de Ciencias Agrarias y Veterinaria; Argentina | es_AR |
dc.subtype | cientifico |
This item appears in the following Collection(s)
Artículos científicos [164]