Publication:
Leukemia multiclass assessment and classification from Microarray and RNA-seq technologies integration at gene expression level.

dc.contributor.author: Castillo, Daniel
dc.contributor.author: Galvez, Juan Manuel
dc.contributor.author: Herrera, Luis J
dc.contributor.author: Rojas, Fernando
dc.contributor.author: Valenzuela, Olga
dc.contributor.author: Caba, Octavio
dc.contributor.author: Prados, Jose
dc.contributor.author: Rojas, Ignacio
dc.date.accessioned: 2023-01-25T10:30:48Z
dc.date.available: 2023-01-25T10:30:48Z
dc.date.issued: 2019-02-12
dc.description.abstract: In recent years, a significant increase in the number of available biological experiments has taken place due to the widespread use of massive sequencing data. Furthermore, continuous developments in machine learning and high-performance computing are allowing a faster and more efficient analysis and processing of this type of data. However, biological information about a certain disease is normally dispersed, owing to the use of different sequencing technologies and different manufacturers in different experiments carried out over the years around the world. Thus, it is now of paramount importance to attain a correct integration of biologically related data in order to achieve genuine benefits from them. For this purpose, this work presents an integration of multiple Microarray and RNA-seq platforms, which has led to the design of a multiclass study by collecting samples from the four main types of leukemia, quantified at the gene expression level. Subsequently, in order to find a set of differentially expressed genes with the highest discernment capability among different types of leukemia, an innovative parameter referred to as coverage is presented here. This parameter allows assessing the number of different pathologies that a certain gene is able to discern. It has been evaluated together with other widely known parameters under an ANOVA statistical test, which corroborated its filtering power when the identified genes are subjected to a machine learning process at multiclass level. The optimal tuning of the evaluated gene extraction parameters by means of this statistical test led to the selection of 42 highly relevant expressed genes. Using the minimum-Redundancy Maximum-Relevance (mRMR) feature selection algorithm, these genes were reordered and assessed under the operation of four different classification techniques. Outstanding results were achieved by taking exclusively the first ten genes of the ranking into consideration. Finally, specific literature was consulted on this last subset of genes, revealing the association of practically all of them with biological processes related to leukemia. In light of these results, this study underlines the relevance of considering a new parameter which facilitates the identification of highly valid expressed genes for simultaneously discerning multiple types of leukemia.
dc.identifier.doi: 10.1371/journal.pone.0212127
dc.identifier.essn: 1932-6203
dc.identifier.pmc: PMC6372182
dc.identifier.pmid: 30753220
dc.identifier.pubmedURL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6372182/pdf
dc.identifier.unpaywallURL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0212127&type=printable
dc.identifier.uri: http://hdl.handle.net/10668/13552
dc.issue.number: 2
dc.journal.title: PloS one
dc.journal.titleabbreviation: PLoS One
dc.language.iso: en
dc.organization: IBS
dc.page.number: e0212127
dc.pubmedtype: Journal Article
dc.pubmedtype: Research Support, Non-U.S. Gov't
dc.rights: Attribution 4.0 International
dc.rights.accessRights: open access
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject.mesh: Biomarkers, Tumor
dc.subject.mesh: Computational Biology
dc.subject.mesh: Gene Expression Profiling
dc.subject.mesh: Humans
dc.subject.mesh: Leukemia
dc.subject.mesh: Machine Learning
dc.subject.mesh: Oligonucleotide Array Sequence Analysis
dc.subject.mesh: Sequence Analysis, RNA
dc.title: Leukemia multiclass assessment and classification from Microarray and RNA-seq technologies integration at gene expression level.
dc.type: research article
dc.type.hasVersion: VoR
dc.volume.number: 14
dspace.entity.type: Publication

Files

Original bundle

Name: PMC6372182.pdf
Size: 2.87 MB
Format: Adobe Portable Document Format