The high throughput of modern NGS sequencers coupled with the huge sizes of genomes currently analysed, poses always higher algorithmic challenges to align short reads quickly and accurately against a reference sequence. A crucial, additional, requirement is that the data structures used should be light. The available modern solutions usually are a compromise between the mentioned constraints: in particular, indexes based on the Burrows-Wheeler transform offer reduced memory requirements at the price of lower sensitivity, while hash-based text indexes guarantee high sensitivity at the price of significant memory consumption.

Fast randomized approximate string matching with succinct hash data structures

POLICRITI, Alberto;PREZZA, Nicola
2015-01-01

Abstract

The high throughput of modern NGS sequencers coupled with the huge sizes of genomes currently analysed, poses always higher algorithmic challenges to align short reads quickly and accurately against a reference sequence. A crucial, additional, requirement is that the data structures used should be light. The available modern solutions usually are a compromise between the mentioned constraints: in particular, indexes based on the Burrows-Wheeler transform offer reduced memory requirements at the price of lower sensitivity, while hash-based text indexes guarantee high sensitivity at the price of significant memory consumption.
File in questo prodotto:
File Dimensione Formato  
official.pdf

accesso aperto

Tipologia: Versione Editoriale (PDF)
Licenza: Creative commons
Dimensione 884.76 kB
Formato Adobe PDF
884.76 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11390/1068702
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 9
  • ???jsp.display-item.citation.isi??? 6
social impact