Publication:
Reference genome assessment from a population scale perspective: an accurate profile of variability and noise.

dc.contributor.author: Carbonell-Caballero, José
dc.contributor.author: Amadoz, Alicia
dc.contributor.author: Alonso, Roberto
dc.contributor.author: Hidalgo, Marta R
dc.contributor.author: Çubuk, Cankut
dc.contributor.author: Conesa, David
dc.contributor.author: López-Quílez, Antonio
dc.contributor.author: Dopazo, Joaquín
dc.date.accessioned: 2023-01-25T10:00:41Z
dc.date.available: 2023-01-25T10:00:41Z
dc.date.issued: 2017
dc.description.abstract: Current plant and animal genomic studies are often based on newly assembled genomes that have not been properly consolidated. In this scenario, misassembled regions can easily lead to false-positive findings. Although quality control scores are included within genotyping protocols, they are usually employed to evaluate individual sample quality rather than reference sequence reliability. We propose a statistical model that combines quality control scores across samples in order to detect incongruent patterns at every genomic region. Our model is inherently robust, since common artifact signals are expected to be shared between independent samples over misassembled regions of the genome. The reliability of our protocol has been extensively tested through different experiments and organisms with accurate results, improving on state-of-the-art methods. Our analysis demonstrates synergistic relations between quality control scores and allelic variability estimators that improve the detection of misassembled regions, and it is able to find strong artifact signals even within the human reference assembly. Furthermore, we demonstrate how our model can be trained to properly rank the confidence of a set of candidate variants obtained from new independent samples. Availability: this tool is freely available at http://gitlab.com/carbonell/ces. Contact: jcarbonell.cipf@gmail.com or joaquin.dopazo@juntadeandalucia.es. Supplementary data are available at Bioinformatics online.
dc.identifier.doi: 10.1093/bioinformatics/btx482
dc.identifier.essn: 1367-4811
dc.identifier.pmc: PMC5870781
dc.identifier.pmid: 28961772
dc.identifier.pubmedURL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870781/pdf
dc.identifier.unpaywallURL: https://academic.oup.com/bioinformatics/article-pdf/33/22/3511/25167564/btx482.pdf
dc.identifier.uri: http://hdl.handle.net/10668/11626
dc.issue.number: 22
dc.journal.title: Bioinformatics (Oxford, England)
dc.journal.titleabbreviation: Bioinformatics
dc.language.iso: en
dc.organization: Fundación Pública Andaluz Progreso y Salud-FPS
dc.organization: Hospital Universitario Virgen del Rocío
dc.page.number: 3511-3517
dc.pubmedtype: Journal Article
dc.rights: Attribution-NonCommercial 4.0 International
dc.rights.accessRights: open access
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/
dc.subject.mesh: Animals
dc.subject.mesh: Genetic Variation
dc.subject.mesh: Genome
dc.subject.mesh: Genomics
dc.subject.mesh: Genotype
dc.subject.mesh: Humans
dc.subject.mesh: Models, Statistical
dc.subject.mesh: Quality Control
dc.subject.mesh: Reproducibility of Results
dc.subject.mesh: Software
dc.title: Reference genome assessment from a population scale perspective: an accurate profile of variability and noise.
dc.type: research article
dc.type.hasVersion: VoR
dc.volume.number: 33
dspace.entity.type: Publication

Files

Original bundle

Name: PMC5870781.pdf
Size: 270.56 KB
Format: Adobe Portable Document Format