Show simple item record

dc.rights.license: By consulting and using this resource, you accept the terms of use established by the authors. [es_CO]
dc.contributor.advisor: Reyes Muñoz, Alejandro
dc.contributor.author: Luque y Guzmán Sáenz, Guillermo Gustavo
dc.date.accessioned: 2018-09-28T10:49:39Z
dc.date.available: 2018-09-28T10:49:39Z
dc.date.issued: 2016
dc.identifier.uri: http://hdl.handle.net/1992/13692
dc.description: color illustrations [es_CO]
dc.description: Includes bibliographic references [es_CO]
dc.description: text [es_CO]
dc.description: computer [es_CO]
dc.description: online resource [es_CO]
dc.description.abstract: We introduce TAXOFOR, a novel machine learning classifier that uses Random Forests to assign taxonomy to paired-end sequencing amplicons down to the genus level, trained with annotated sequences from the Greengenes database. It performs this task with an accuracy close to 98%, and it is faster than several of the de facto standard tools used for the same purpose in microbial ecology. To handle the DNA sequences, they are first represented numerically as projections into a 3D space defined by the vertices of a tetrahedron. The Discrete Fourier Transform then yields their power spectra, which are used as input both to train the classifier and to predict taxonomy. Parseval's identity ensures that the similarity between the numerical representations of two DNA sequences can be obtained from their power spectra. This property is tested by comparing two dendrograms: one produced by hierarchical clustering on the pairwise distances between the spectra of the DNA sequences, and another built from the distance matrix obtained after a multiple sequence alignment (MSA). The speed and accuracy of TAXOFOR were assessed against UCLUST, RDP and MOTHUR by assigning taxonomy to the same set of 16S rRNA sequences. The initial results are promising and leave room for improvements in parallel processing and memory handling. [es_CO]
dc.format: application/pdf [es_CO]
dc.format.extent: 44 leaves [es_CO]
dc.language.iso: eng [es_CO]
dc.source: instname:Universidad de los Andes [es_CO]
dc.source: reponame:Repositorio Institucional Séneca [es_CO]
dc.title: Taxonomic assignment of 16S rRNA sequences based on Fourier analysis [es_CO]
dc.type: masterThesis [es_CO]
dc.publisher.program: Master's in Computational Biology [es_CO]
dc.rights.accessRights: openAccess
dc.subject.keyword: Metagenomics - Research [es_CO]
dc.subject.keyword: Ribosomal DNA - Research [es_CO]
dc.subject.keyword: Computational biology - Research [es_CO]
dc.creator.degree: Thesis (Master in Computational Biology) -- Universidad de los Andes [es_CO]
dc.publisher.faculty: Faculty of Engineering [es_CO]
dc.publisher.department: Department of Systems and Computing Engineering [es_CO]
dc.type.version: publishedVersion
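
Illustrative note on the abstract: the encoding and transform steps it describes (tetrahedral representation, Discrete Fourier Transform, power spectrum) can be sketched as follows. This is a minimal sketch and not the thesis implementation. The vertex coordinates, the function names, and the use of a Euclidean distance between spectra are all assumptions made for illustration, since the record gives none of these details. A naive DFT from the standard library is used so that the block runs without third-party packages.

```python
import cmath
import math

# Assumed unit-length tetrahedron vertices, one per nucleotide.
# The record does not state which coordinates TAXOFOR uses; this is a
# hypothetical but common choice whose four vertices sum to zero.
S2, S6 = math.sqrt(2.0), math.sqrt(6.0)
VERTICES = {
    "A": (0.0, 0.0, 1.0),
    "C": (-S2 / 3, S6 / 3, -1 / 3),
    "G": (-S2 / 3, -S6 / 3, -1 / 3),
    "T": (2 * S2 / 3, 0.0, -1 / 3),
}


def encode(seq):
    """Map a DNA string to a list of 3D points, one vertex per base."""
    # Raises KeyError on symbols other than A, C, G, T (e.g. N).
    return [VERTICES[base] for base in seq.upper()]


def dft(values):
    """Naive O(N^2) discrete Fourier transform of a list of real numbers."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(values))
        for k in range(n)
    ]


def power_spectrum(seq):
    """Per-frequency sum of squared DFT magnitudes over the x, y, z components."""
    points = encode(seq)
    spectra = [dft([p[axis] for p in points]) for axis in range(3)]
    return [sum(abs(s[k]) ** 2 for s in spectra) for k in range(len(points))]


def spectral_distance(seq_a, seq_b):
    """Euclidean distance between power spectra (assumed metric, equal lengths only)."""
    pa, pb = power_spectrum(seq_a), power_spectrum(seq_b)
    if len(pa) != len(pb):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pa, pb)))
```

Because every base maps to a unit vector in this sketch, Parseval's identity implies that the power spectrum of a sequence of length N sums to N squared, which gives a quick sanity check on the transform. Sequences of different lengths would need padding or resampling before their spectra can be compared; the record does not say how the thesis handles that.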

