Publication:
BLASSO: integration of biological knowledge into a regularized linear model.

dc.contributor.authorUrda, Daniel
dc.contributor.authorAragon, Francisco
dc.contributor.authorBautista, Rocio
dc.contributor.authorFranco, Leonardo
dc.contributor.authorVeredas, Francisco J
dc.contributor.authorClaros, Manuel Gonzalo
dc.contributor.authorJerez, Jose Manuel
dc.contributor.funderMINECO-SPAIN
dc.contributor.funderFEDER
dc.contributor.funderICE Andalucía TECH (Spain)
dc.date.accessioned2023-01-25T10:24:47Z
dc.date.available2023-01-25T10:24:47Z
dc.date.issued2018-11-20
dc.description.abstractIn RNA-Seq gene expression analysis, a genetic signature or biomarker is defined as a subset of genes that is probably involved in a given complex human trait and usually provides predictive capabilities for that trait. The discovery of new genetic signatures is challenging, as it entails the analysis of complex information encoded at the gene level. Moreover, biomarker selection becomes unstable because the thousands of genes included in each sample are usually highly correlated, which leads to very low overlap between the genetic signatures proposed by different authors. This paper proposes BLASSO, a simple and highly interpretable linear model with l1-regularization that incorporates prior biological knowledge into the prediction of breast cancer outcomes. Two different approaches to integrating biological knowledge in BLASSO, Gene-specific and Gene-disease, are proposed and tested for predictive performance and biomarker stability on a public RNA-Seq gene expression dataset for breast cancer. The relevance of the genetic signature for the model is inspected by a functional analysis. BLASSO was compared with a baseline LASSO model. Using 10-fold cross-validation with 100 repetitions for model assessment, average AUC values of 0.7 and 0.69 were obtained for the Gene-specific and the Gene-disease approaches, respectively. These values exceed the average AUC of 0.65 obtained with LASSO. With respect to the stability of the genetic signatures found, BLASSO outperformed the baseline model in terms of the robustness index (RI). The Gene-specific approach gave an RI of 0.15±0.03, compared to an RI of 0.09±0.03 given by LASSO, making it 66% more robust.
The functional analysis performed on the genetic signature obtained with the Gene-disease approach showed a significant presence of genes related to cancer, as well as one gene (IFNK) and one pseudogene (PCNAP1) which a priori had not been described as related to cancer. BLASSO has been shown to be a good choice in terms of both predictive efficacy and biomarker stability when compared to other similar approaches. Further functional analyses of the genetic signatures obtained with BLASSO have revealed not only genes with important roles in cancer, but also genes that may play an unknown or collateral role in the studied disease.
dc.description.sponsorshipThe authors acknowledge support through grants TIN2014-58516-C2-1-R and TIN2017-88728-C2 from MINECO-SPAIN which include FEDER funds. DU was supported by ICE Andalucía TECH (Spain) through a postdoctoral fellowship.
dc.description.versionYes
dc.identifier.citationUrda D, Aragón F, Bautista R, Franco L, Veredas FJ, Claros MG, et al. BLASSO: integration of biological knowledge into a regularized linear model. BMC Syst Biol. 2018 Nov 20;12(Suppl 5):94
dc.identifier.doi10.1186/s12918-018-0612-8
dc.identifier.essn1752-0509
dc.identifier.pmcPMC6245593
dc.identifier.pmid30458775
dc.identifier.pubmedURLhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6245593/pdf
dc.identifier.unpaywallURLhttps://doi.org/10.1186/s12918-018-0612-8
dc.identifier.urihttp://hdl.handle.net/10668/13214
dc.issue.numberSuppl 5
dc.journal.titleBMC systems biology
dc.journal.titleabbreviationBMC Syst Biol
dc.language.isoen
dc.organizationInstituto de Investigación Biomédica de Málaga-IBIMA
dc.page.number14
dc.provenanceContent curation completed 22/08/2024
dc.publisherSpringer Nature
dc.pubmedtypeJournal Article
dc.pubmedtypeResearch Support, Non-U.S. Gov't
dc.pubmedtypeValidation Study
dc.relation.projectIDTIN2017-88728-C2
dc.relation.projectIDTIN2014-58516-C2-1-R
dc.relation.publisherversionhttps://bmcsystbiol.biomedcentral.com/articles/10.1186/s12918-018-0612-8
dc.rights.accessRightsopen access
dc.subjectBiological knowledge
dc.subjectBiomarkers selection
dc.subjectMachine learning
dc.subjectPrecision medicine
dc.subjectRNA-Seq
dc.subject.decsAnálisis de secuencia de ARN
dc.subject.decsBiomarcadores de tumor
dc.subject.decsMedicina de precisión
dc.subject.decsNeoplasias de la mama
dc.subject.decsPerfilación de la expresión génica
dc.subject.meshBiomarkers, tumor
dc.subject.meshBreast neoplasms
dc.subject.meshFemale
dc.subject.meshGene expression profiling
dc.subject.meshHumans
dc.subject.meshLinear models
dc.subject.meshMachine learning
dc.subject.meshPrecision medicine
dc.subject.meshSequence analysis, RNA
dc.titleBLASSO: integration of biological knowledge into a regularized linear model.
dc.typeresearch article
dc.type.hasVersionVoR
dc.volume.number12
dspace.entity.typePublication
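
The abstract describes BLASSO as an l1-regularized linear model that incorporates prior biological knowledge, but this record does not give the formulation. As a rough illustration only, one common way to fold prior knowledge into an l1 penalty is to give each gene its own penalty weight, so that genes with stronger prior evidence are shrunk less. The sketch below fits such a weighted-l1 logistic regression by proximal gradient descent on synthetic data. The function names, the penalty weights, and the data are all hypothetical, and nothing here should be read as the authors' implementation or as a reproduction of their results.

```python
import numpy as np


def soft_threshold(z, t):
    """Element-wise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def fit_weighted_l1_logistic(X, y, lam, penalty_weights, n_iter=500):
    """Fit logistic regression with penalty lam * sum_j penalty_weights[j] * |beta_j|.

    Hypothetical helper: plain proximal gradient descent, intercept unpenalized.
    """
    n, p = X.shape
    beta = np.zeros(p)
    intercept = 0.0
    # Step size 1/L, where L is an upper bound on the Lipschitz constant of
    # the gradient of the mean logistic loss (including the intercept term).
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2 + n)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta + intercept)))
        residual = prob - y
        # Gradient step on the smooth loss, then per-feature soft-thresholding.
        beta = soft_threshold(beta - step * (X.T @ residual) / n,
                              step * lam * penalty_weights)
        intercept -= step * residual.mean()
    return beta, intercept


# Synthetic stand-in for an expression matrix: 200 samples, 6 "genes".
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
y = (X[:, 0] - X[:, 1] + 0.5 * rng.standard_normal(200) > 0).astype(float)

# Assumed prior knowledge: genes 0 and 1 get a lighter penalty (0.5),
# genes 2 to 4 the default (1.0), and gene 5 a prohibitive one (1e6).
penalty_weights = np.array([0.5, 0.5, 1.0, 1.0, 1.0, 1e6])
beta, intercept = fit_weighted_l1_logistic(
    X, y, lam=0.05, penalty_weights=penalty_weights
)
```

Setting every penalty weight to 1.0 reduces this sketch to an ordinary l1-penalized (LASSO-style) logistic regression, which is the kind of baseline the abstract compares against.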

Files

Original bundle

Name: PMC6245593.pdf
Size: 2.43 MB
Format: Adobe Portable Document Format