Optimization of BLAST seed indexing in the alignment of DNA sequences with GPU using CUDA

Franklin Luis Antonio Cruz Gamero, Juan Carlos Gutierrez Caceres

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Several algorithms are used to align biological sequences such as DNA, RNA and proteins, chief among them the Basic Local Alignment Search Tool (BLAST). BLAST has two phases: a heuristic seed indexing phase, and an extension phase that compares sequences using the Smith-Waterman (SW) algorithm. Together they align a short 'query' sequence against a long reference sequence 'db' much faster than other alignment algorithms. This work proposes using a two-dimensional matrix instead of a sparse matrix as the hash table that stores the seed index, and using the GPU of a graphics card to optimize seeding. This reduces the processing time of the BLAST seed indexing phase by 11.24 %, and the GPU implementation with CUDA achieves better processing times than both the sequential implementation and a multi-core CPU implementation using OpenMP threads. Our algorithm retrieves the seeds identical to the pattern key in O(1) time. The performance gain grows as the length of the hash key increases. The evaluation was run on a laptop with a Core i7 processor, 16 GB of RAM and a graphics card with 384 cores, using the C++ programming language and CUDA. Alignment tests were performed on real DNA sequences in FASTA format obtained from the National Center for Biotechnology Information (NCBI) and ENSEMBL, with reference sequences of up to 1.3 Gb, such as the complete genome of the chicken (Gallus gallus), which has 1 230 258 557 base pairs (bp). With a query sequence of 140 bp, this genome was indexed with a 5 bp key in 1074 milliseconds using the GPU.
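
The abstract does not give the data layout, so the following is only a minimal CPU-side sketch of how a dense, directly addressed seed table could provide O(1) retrieval, under stated assumptions: each base is packed into 2 bits, the k-mer's integer value is used as the row number, and each row holds the positions in 'db' where that k-mer occurs. The names `SeedIndex`, `encode_kmer`, `base_code` and `lookup` are hypothetical and not taken from the paper, and the GPU parallelization with CUDA is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Map one base to a 2-bit code; -1 marks anything else (e.g. 'N').
inline int base_code(char c) {
    switch (c) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default:            return -1;
    }
}

// Encode the k bases starting at s[pos] as an integer in [0, 4^k).
// Returns -1 if the window contains a non-ACGT character.
inline std::int64_t encode_kmer(const std::string& s, std::size_t pos, int k) {
    std::int64_t key = 0;
    for (int i = 0; i < k; ++i) {
        const int c = base_code(s[pos + i]);
        if (c < 0) return -1;
        key = key * 4 + c;
    }
    return key;
}

// Hypothetical dense seed table: row = encoded k-mer, columns = positions in db.
class SeedIndex {
public:
    SeedIndex(const std::string& db, int k) : k_(k), rows_(table_size(k)) {
        // Slide a window of length k over db and record each start position.
        for (std::size_t pos = 0; pos + k_ <= db.size(); ++pos) {
            const std::int64_t key = encode_kmer(db, pos, k_);
            if (key >= 0) {
                rows_[static_cast<std::size_t>(key)]
                    .push_back(static_cast<std::uint32_t>(pos));
            }
        }
    }

    // Finding the row is one array access, independent of the size of db.
    const std::vector<std::uint32_t>& lookup(const std::string& kmer) const {
        static const std::vector<std::uint32_t> none;
        if (kmer.size() != static_cast<std::size_t>(k_)) return none;
        const std::int64_t key = encode_kmer(kmer, 0, k_);
        return key < 0 ? none : rows_[static_cast<std::size_t>(key)];
    }

    std::size_t row_count() const { return rows_.size(); }

private:
    // 4^k rows; k is capped (an assumption of this sketch) to keep the table small.
    static std::size_t table_size(int k) {
        assert(k > 0 && k <= 12);
        return std::size_t(1) << (2 * k);
    }

    int k_;
    std::vector<std::vector<std::uint32_t>> rows_;
};
```

With a 5 bp key, as in the experiment reported above, such a table would have 4^5 = 1024 rows. A call like `SeedIndex idx(db, 5); idx.lookup("ACGTA");` would then return every position in `db` where that key starts, without scanning or probing.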

Original language: English
Title of host publication: Proceedings - 2018 44th Latin American Computing Conference, CLEI 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 527-532
Number of pages: 6
ISBN (Electronic): 9781728104379
State: Published - Oct 2018
Event: 44th Latin American Computing Conference, CLEI 2018 - Sao Paulo, Brazil
Duration: 1 Oct 2018 - 5 Oct 2018

Publication series

Name: Proceedings - 2018 44th Latin American Computing Conference, CLEI 2018

Conference

Conference: 44th Latin American Computing Conference, CLEI 2018
Country/Territory: Brazil
City: Sao Paulo
Period: 1/10/18 - 5/10/18

Bibliographical note

Publisher Copyright:
© 2018 IEEE.

Keywords

  • Alignment
  • BLAST
  • CUDA
  • DNA
  • GPU
  • Optimization
