ERNE-BS5: Aligning BS-treated sequences by multiple hits on a 5-letters alphabet

PREZZA, Nicola; DEL FABBRO, Cristian; DE PAOLI, Emanuele; POLICRITI, Alberto
2012

Abstract

Cytosine methylation is a DNA modification that strongly affects the regulation of gene expression and has important implications for the biology and health of many organisms, including humans. Bisulfite conversion followed by next-generation sequencing (BS-seq) of DNA is the gold-standard technique for detecting DNA methylation at single-base resolution on a genome scale through the identification of 5-methylcytosine (5-mC). However, because unmethylated cytosines are read as thymines after conversion, BS-seq poses computational challenges for read alignment and aggravates the problem of multiple hits, owing to the ambiguity introduced by the reduced sequence complexity. Here we present ERNE-BS5 (Extended Randomized Numerical alignEr - BiSulfite 5), an alignment program developed to efficiently map BS-treated reads against large genomes (e.g., human). To achieve this goal we implemented three ideas: (i) a 5-letter alphabet for storing methylation information, (ii) a weighted, context-aware Hamming distance to identify a T originating from an unmethylated C context, and (iii) an iterative process that positions multiple-hit reads starting from a preliminary map built from single-hit alignments; the map is corrected and extended at each cycle using the alignments added in the previous iteration. ERNE-BS5 is based on a new, improved version of the rNA [20] alignment software with a more efficient core. ERNE (Extended Randomized Numerical alignEr) is a short-string alignment package whose goal is to provide an all-inclusive set of tools for handling short reads. ERNE comprises ERNE-MAP, ERNE-DMAP, ERNE-FILTER, ERNE-VISUAL, and now ERNE-BS5. ERNE is free software distributed under an open-source license (GPL v3) and can be downloaded from http://erne.sourceforge.net
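
The alignment difficulty described in the abstract can be illustrated with a small Python sketch. It is not ERNE-BS5's implementation: the function names, the `ct_cost` weight, and the example sequences are hypothetical, and the context-aware part of the distance is not modeled. The sketch only shows (a) how in-silico conversion makes distinct reference sequences indistinguishable, and (b) how a Hamming-style distance might charge a read T over a reference C less than an ordinary mismatch.

```python
def bisulfite_convert(seq, methylated=frozenset()):
    """Simulate BS treatment of one strand (illustrative only).

    Every C whose index is NOT in `methylated` is read as T;
    methylated Cs are protected and stay C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def bs_hamming(read, ref, ct_cost=0.0, mismatch_cost=1.0):
    """Hamming-style distance tolerant of C->T conversion.

    A read T aligned over a reference C is charged `ct_cost`
    (a hypothetical weight, not taken from the paper); any other
    mismatch is charged `mismatch_cost`.
    """
    if len(read) != len(ref):
        raise ValueError("read and reference window must have equal length")
    total = 0.0
    for r, g in zip(read, ref):
        if r == g:
            continue
        total += ct_cost if (r == "T" and g == "C") else mismatch_cost
    return total


ref = "ACGTCCGA"
read = bisulfite_convert(ref, methylated={1})  # the C at index 1 is methylated
print(read)                                    # ACGTTTGA
print(bs_hamming(read, ref))                   # 0.0
print(bs_hamming(read, ref, ct_cost=0.25))     # 0.5

# Reduced complexity: two different reference windows collapse to one string.
print(bisulfite_convert("ACCT") == bisulfite_convert("ATCT"))  # True
```

The last line shows why multiple hits become more frequent: once unmethylated Cs read as Ts, a converted read can match several genomic positions equally well, which is the ambiguity the iterative multiple-hit placement is meant to address.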
ISBN: 9781450316705
Use this identifier to cite or link to this document: http://hdl.handle.net/11390/991346