Exemplary multiplex bisulfite amplicon data used to demonstrate the utility of Methpat

Wong NC, Pope BJ, Candiloro I, Korbie D, Trau M, Wong SQ, Mikeska T, Van Denderen BJW, Thompson EW, Eggers S, Doyle SR, Dobrovic A. GigaScience (2015).

Abstract

Background: DNA methylation is a complex epigenetic marker that can be analyzed using a wide variety of methods. Interpretation and visualization of DNA methylation data can mask complexity in terms of methylation status at each CpG site, cellular heterogeneity of samples and allelic DNA methylation patterns within a given DNA strand. Bisulfite sequencing is considered the gold standard, but visualization of massively parallel sequencing results remains a significant challenge.

Findings: We created a program called Methpat that facilitates visualization and interpretation of bisulfite sequencing data generated by massively parallel sequencing. To demonstrate this, we performed multiplex PCR that targeted 48 regions of interest across 86 human samples. The regions selected included known gene promoters associated with cancer, repetitive elements, known imprinted regions and mitochondrial genomic sequences. We interrogated a range of samples including human cell lines, primary tumours and primary tissue samples. Methpat generates two forms of output: a tab-delimited text file for each sample that summarizes DNA methylation patterns and their read counts for each amplicon, and an HTML file that summarizes these data visually. Methpat can be used with publicly available whole genome bisulfite sequencing and reduced representation bisulfite sequencing datasets with sufficient read depths.

Conclusions: Using Methpat, complex DNA methylation data derived from massively parallel sequencing can be summarized and visualized for biological interpretation. By accounting for allelic DNA methylation states and their abundance in a sample, Methpat can unmask the complexity of DNA methylation and yield further biological insight from existing datasets.
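
To illustrate what summarizing methylation patterns and their read counts per amplicon can look like, the following is a minimal Python sketch. It is not Methpat's implementation or file format: the function names, the amplicon identifiers, the example reads and the encoding of a pattern as a string of '1' (methylated) and '0' (unmethylated) characters, one per CpG site, are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def count_patterns(reads):
    """Tally how often each methylation pattern occurs per amplicon.

    `reads` is an iterable of (amplicon_id, pattern) pairs. Each pattern is a
    string with one character per CpG site: '1' = methylated,
    '0' = unmethylated (a hypothetical encoding, chosen for this sketch).
    """
    counts = defaultdict(Counter)
    for amplicon_id, pattern in reads:
        counts[amplicon_id][pattern] += 1
    return counts


def to_tsv_rows(counts):
    """Format the tallies as tab-delimited rows: amplicon, pattern, count."""
    rows = []
    for amplicon_id in sorted(counts):
        # Most abundant pattern first within each amplicon.
        for pattern, n in counts[amplicon_id].most_common():
            rows.append("\t".join([amplicon_id, pattern, str(n)]))
    return rows


# Hypothetical reads: two amplicons, each read reduced to its CpG pattern.
example_reads = [
    ("amplicon_A", "1110"),
    ("amplicon_A", "1110"),
    ("amplicon_A", "0000"),
    ("amplicon_B", "01"),
]

pattern_counts = count_patterns(example_reads)
for row in to_tsv_rows(pattern_counts):
    print(row)
```

Counting whole per-read patterns, rather than averaging methylation at each CpG site, is what keeps distinct allelic states and their relative abundance visible in the summary.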

Data availability

Project name: Methpat

Project home page: http://bjpop.github.io/methpat/

Operating system(s): any POSIX-like operating system (e.g. Linux, OS X)

Programming language: Python 2.7, HTML and JavaScript

Other requirements: a web browser to view the visualization output (HTML file); suggested browsers include Firefox, Chrome or Safari. Methpat requires output files produced by Bismark (http://www.bioinformatics.babraham.ac.uk/projects/bismark/) and its bismark_methylation_extractor command. Methpat can be accessed directly from http://bjpop.github.io/methpat/, where further instructions can be found.

License: 3-clause BSD License

Any restrictions to use by non-academics: None

A flow diagram of analytical requirements and files can be found in Fig. 1.

BAM files, bismark_methylation_extractor output files and Methpat output files for each sample analyzed in this paper are available in the GigaScience GigaDB repository [12].