An Analysis of k-Mer Frequency Features with Machine Learning Models for Viral Subtyping of Polyomavirus and HIV-1 Genomes

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Viral subtyping is the process of classifying a virus genome into a subtype within its family, and it plays a major role in the appropriate diagnosis and treatment of illness. Researchers typically perform viral subtyping with alignment-based methods. However, alignment-based methods are slow and require exposing the queried sample genome, which compromises its privacy. For that reason, alternative methods have emerged that use machine learning models to take a viral sample genome and predict its subtype. The performance of these models depends on the feature vector computed, and the most notable methods use k-mer frequencies as features. In this study, we compared the two most relevant k-mer frequency methods, Kameris and Castor-KRFE, on a dataset of Polyomavirus and HIV-1 genomes. Both achieve the same results when their dimensionality reduction and feature elimination steps are disabled; when those steps are enabled, Kameris slightly outperforms Castor-KRFE. Moreover, Castor-KRFE can produce a small feature vector for k > 5.
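As an illustration of what a k-mer frequency feature vector looks like, the following is a minimal Python sketch. It is not the Kameris or Castor-KRFE implementation. The function name `kmer_frequencies`, the choice to skip windows containing ambiguous bases, and the normalization to relative frequencies are all assumptions made for this example. For a DNA alphabet the vector has 4^k entries, which is why the feature vector grows quickly as k increases.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k):
    """Return relative frequencies of all 4**k DNA k-mers in lexicographic order.

    Hypothetical helper for illustration only; not taken from Kameris or Castor-KRFE.
    """
    alphabet = "ACGT"
    valid = set(alphabet)
    sequence = sequence.upper()

    # Slide a window of length k over the sequence and count each k-mer.
    # Assumption: windows containing characters outside A, C, G, T are skipped.
    counts = Counter(
        sequence[i:i + k]
        for i in range(len(sequence) - k + 1)
        if set(sequence[i:i + k]) <= valid
    )
    total = sum(counts.values())

    # Fixed ordering of all possible k-mers so every genome maps to the same layout.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


features = kmer_frequencies("ACGTACGT", 2)
```

A vector built this way could be passed to any standard classifier. The dimensionality reduction and feature elimination steps mentioned in the abstract would operate on such vectors and are not shown here.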

Original language: English
Title of host publication: Proceedings of the Future Technologies Conference, FTC 2020, Volume 1
Editors: Kohei Arai, Supriya Kapoor, Rahul Bhatia
Publisher: Springer Science and Business Media Deutschland GmbH
Number of pages: 12
ISBN (Print): 9783030631277
Publication status: Published - 2021
Externally published
Event: Future Technologies Conference, FTC 2020 - San Francisco, United States
Duration: 5 Nov 2020 to 6 Nov 2020

Publication series

Name: Advances in Intelligent Systems and Computing
ISSN (Print): 2194-5357
ISSN (Electronic): 2194-5365


Conference: Future Technologies Conference, FTC 2020
Country/Territory: United States
City: San Francisco

Bibliographical note

Publisher Copyright:
© 2021, Springer Nature Switzerland AG.

