Publication:
Ultra-fast genome comparison for large-scale genomic experiments.

dc.contributor.author: Pérez-Wohlfeil, Esteban
dc.contributor.author: Diaz-Del-Pino, Sergio
dc.contributor.author: Trelles, Oswaldo
dc.date.accessioned: 2023-01-25T13:36:44Z
dc.date.available: 2023-01-25T13:36:44Z
dc.date.issued: 2019-07-16
dc.description.abstract: In the last decade, a technological shift in the bioinformatics field has occurred: larger genomes can now be sequenced quickly and cost-effectively, resulting in the computational need to efficiently compare large and abundant sequences. Furthermore, detecting conserved similarities across large collections of genomes remains a problem. The size of chromosomes, along with the substantial amount of noise and number of repeats found in DNA sequences (particularly in mammals and plants), leads to a scenario where executing and waiting for complete outputs is both time- and resource-consuming. Filtering steps, manual examination and annotation, very long execution times and a high demand for computational resources represent a few of the many difficulties faced in large genome comparisons. In this work, we provide a method designed for comparisons of considerable amounts of very long sequences that employs a heuristic algorithm capable of separating noise and repeats from conserved fragments in pairwise genomic comparisons. We provide a software implementation that computes in linear time using one core as a minimum and a small, constant memory footprint. The method produces both a previsualization of the comparison and a collection of indices to drastically reduce computational complexity when performing exhaustive comparisons. Last, the method scores the comparison to automate classification of sequences and produces a list of detected synteny blocks to enable new evolutionary studies.
dc.identifier.doi: 10.1038/s41598-019-46773-w
dc.identifier.essn: 2045-2322
dc.identifier.pmc: PMC6635410
dc.identifier.pmid: 31312019
dc.identifier.pubmedURL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6635410/pdf
dc.identifier.unpaywallURL: https://www.nature.com/articles/s41598-019-46773-w.pdf
dc.identifier.uri: http://hdl.handle.net/10668/14245
dc.issue.number: 1
dc.journal.title: Scientific reports
dc.journal.titleabbreviation: Sci Rep
dc.language.iso: en
dc.organization: Instituto de Investigación Biomédica de Málaga-IBIMA
dc.page.number: 10274
dc.pubmedtype: Journal Article
dc.pubmedtype: Research Support, Non-U.S. Gov't
dc.rights: Attribution 4.0 International
dc.rights.accessRights: open access
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject.mesh: Algorithms
dc.subject.mesh: Animals
dc.subject.mesh: Biological Evolution
dc.subject.mesh: Data Visualization
dc.subject.mesh: Genome
dc.subject.mesh: Genomics
dc.subject.mesh: Humans
dc.subject.mesh: Mammals
dc.subject.mesh: Mice
dc.subject.mesh: Poaceae
dc.subject.mesh: Software
dc.subject.mesh: Synteny
dc.subject.mesh: Time Factors
dc.subject.mesh: Triticum
dc.title: Ultra-fast genome comparison for large-scale genomic experiments.
dc.type: research article
dc.type.hasVersion: VoR
dc.volume.number: 9
dspace.entity.type: Publication

Files

Original bundle

Name: PMC6635410.pdf
Size: 2.96 MB
Format: Adobe Portable Document Format