
dc.contributor.author: Milone, Diego Humberto
dc.contributor.author: Stegmayer, Georgina
dc.contributor.author: Lopez, Mariana Gabriela
dc.contributor.author: Kamenetzky, Laura
dc.contributor.author: Carrari, Fernando
dc.date.accessioned: 2019-01-18T12:45:32Z
dc.date.available: 2019-01-18T12:45:32Z
dc.date.issued: 2014-04
dc.identifier.issn: 1471-2105
dc.identifier.other: https://doi.org/10.1186/1471-2105-15-101
dc.identifier.uri: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-15-101
dc.identifier.uri: http://hdl.handle.net/20.500.12123/4292
dc.description.abstract: Background: It is common practice in bioinformatics to validate each group returned by a clustering algorithm through manual analysis, according to a-priori biological knowledge. This procedure helps to find functionally related patterns and to propose hypotheses for their behavior and the biological processes involved. This knowledge is therefore used only as a second step, after the data have been clustered solely according to their expression patterns. Thus, it could be very useful to improve the clustering of biological data by incorporating prior knowledge into the cluster formation itself, in order to enhance the biological value of the clusters. Results: A novel training algorithm for clustering is presented, which evaluates the biological internal connections of the data points while the clusters are being formed. Within this training algorithm, the calculation of distances between data points and neuron centroids includes a new term based on information from well-known metabolic pathways. Standard self-organizing map (SOM) training and biologically-inspired SOM (bSOM) training were compared on two real data sets of transcripts and metabolites from the Solanum lycopersicum and Arabidopsis thaliana species. Classical data mining validation measures were used to evaluate the clustering solutions obtained by both algorithms. Moreover, a new measure that takes into account the biological connectivity of the clusters was applied. The bSOM results show important improvements in convergence and performance over standard SOM training, particularly from the application point of view. Conclusions: Analyses of the clusters obtained with bSOM indicate that including biological information during training can certainly increase the biological value of the clusters found with the proposed method. It is worth highlighting that this has effectively improved the results, which can simplify their further analysis. [eng]
dc.format: application/pdf [es_AR]
dc.language.iso: eng [es_AR]
dc.publisher: BMC [es_AR]
dc.rights: info:eu-repo/semantics/openAccess [es_AR]
dc.source: BMC Bioinformatics 15 : 101 (2014) [es_AR]
dc.subject: Bioinformática [es_AR]
dc.subject: Bioinformatics [eng]
dc.subject: Datos [es_AR]
dc.subject: Data [eng]
dc.subject.other: Agrupamiento [es_AR]
dc.subject.other: Clustering [es_AR]
dc.title: Improving clustering with metabolic pathway data [es_AR]
dc.type: info:ar-repo/semantics/artículo [es_AR]
dc.type: info:eu-repo/semantics/article [es_AR]
dc.type: info:eu-repo/semantics/publishedVersion [es_AR]
dc.description.origen: Instituto de Biotecnología [es_AR]
dc.description.fil: Fil: Milone, Diego Humberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina [es_AR]
dc.description.fil: Fil: Stegmayer, Georgina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina [es_AR]
dc.description.fil: Fil: Lopez, Mariana Gabriela. Instituto Nacional de Tecnología Agropecuaria (INTA). Instituto de Biotecnología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. [es_AR]
dc.description.fil: Fil: Kamenetzky, Laura. Instituto Nacional de Tecnología Agropecuaria (INTA). Instituto de Biotecnología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. [es_AR]
dc.description.fil: Fil: Carrari, Fernando Oscar. Instituto Nacional de Tecnología Agropecuaria (INTA). Instituto de Biotecnología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. [es_AR]
dc.subtype: cientifico

