Methylome Resource

The Methylome Resource was established by applying RnBeads to some of the largest public reference datasets that are currently available for whole genome bisulfite sequencing (WGBS), for reduced representation bisulfite sequencing (RRBS) and for the Illumina Infinium HumanMethylation450 assay. This resource provides a reference for large-scale DNA methylation analyses that can be used in complementary ways:

  • Researchers can browse the reports online, explore biological hypotheses and load relevant data points into R or other software tools for visual inspection or custom data analysis. For instance, the links in the "Tracks and Tables" reports allow the tracks to be visualized in various genome browsers.
    ↳ To explore the Methylome Resource, please click any of the "View analysis report" links below.
  • Researchers can download the data and configuration files, add their own DNA methylation data and then run RnBeads in order to analyze their data in the context of methylome datasets that span a broad set of tissue types.
    ↳ To rerun the Methylome Resource analyses, please download the data and configuration files from the table below. Each dataset can be run either in full or on a representative subset of samples to reduce runtime. A more detailed explanation of how to run these analyses is available on the FAQ page.
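
As a rough illustration of the second workflow, the sketch below stacks a reference sample annotation table with a user's own table before an analysis is configured. It is written in Python purely for illustration and is not part of RnBeads. The column names (`sample_id`, `tissue`, `filename`, `batch`), the added `origin` column and both helper functions are hypothetical and do not describe the actual files in this resource.

```python
import csv
import io


def merge_sample_sheets(reference_text, own_text):
    """Stack two CSV sample sheets, keeping only the columns they share."""
    reference = list(csv.DictReader(io.StringIO(reference_text)))
    own = list(csv.DictReader(io.StringIO(own_text)))
    if not reference or not own:
        raise ValueError("both sample sheets need at least one sample")
    # Preserve the reference column order, restricted to shared columns.
    columns = [c for c in reference[0] if c in own[0]]
    merged = []
    for origin, rows in (("reference", reference), ("own", own)):
        for row in rows:
            record = {c: row[c] for c in columns}
            record["origin"] = origin  # hypothetical bookkeeping column
            merged.append(record)
    return columns + ["origin"], merged


def to_csv(columns, rows):
    """Serialize the merged rows back to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Toy annotation tables with made-up column names and values.
reference_sheet = (
    "sample_id,tissue,filename\n"
    "ref1,liver,ref1.bed\n"
    "ref2,blood,ref2.bed\n"
)
own_sheet = (
    "sample_id,filename,tissue,batch\n"
    "mine1,mine1.bed,blood,A\n"
)

columns, rows = merge_sample_sheets(reference_sheet, own_sheet)
merged_csv = to_csv(columns, rows)
print(merged_csv)
```

Keeping an `origin` column makes it easy to separate reference samples from user samples later, for example when checking whether the two groups cluster by tissue type or by data source.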

Resource | Data Source | Data Archive | Sample Annotation Files | RnBeads Configuration
--- | --- | --- | --- | ---
Genome-scale RRBS data for 216 tissues and cell lines | ENCODE Project Website | data.zip (3 GB) | samples.csv (all samples); samples.csv (17 untreated samples) | analysis.xml
Genome-wide WGBS data for 41 tissues and cell lines | Gene Expression Omnibus | data.zip (10 GB) | samples.csv (all 41 samples); samples.csv (10 adult primary tissues) | analysis.xml
Infinium 450k data for 4034 cancer and normal samples | TCGA data portal | data.zip (34 GB) | samples.csv (all samples); samples.csv (40 samples from 10 primary tumors) | analysis.xml
WGBS data for BLUEPRINT methylomes (2015 release) | BLUEPRINT data portal | data.zip (18 GB) | samples.tsv (81 samples) | analysis.xml
WGBS and NOMe-seq data for T cell memory formation | DEEP data portal | import_RnBSet.zip (2 GB; RnBSet object) | samples.tsv (13 samples) | analysis.xml
[Use case 1] Infinium 450k data for whole blood and sorted blood cells | GEO, GEO | data.zip (6.3 GB) | samples.csv (792 samples) | analysis.xml
[Use case 2] WGBS data for BLUEPRINT methylomes (2016 release) | BLUEPRINT data portal | data.zip (45 GB) | samples.tsv (195 samples) | analysis.xml
[Use case 3] RRBS data for Ewing Sarcoma (188 samples) | Gene Expression Omnibus | data.zip (5.7 GB) | samples.csv (188 samples) | analysis.xml; preanalysis.R
[Use case 4] Cross-platform data set (450k, EPIC, WGBS) from an EPIC evaluation study | GEO | combined RnBSet (136 MB) | | analysis.xml

Resource 1: Genome-scale RRBS data for 216 tissues and cell lines

In the context of the ENCODE project, Varley et al. established genome-scale DNA methylation maps for various tissue samples and cell lines using reduced representation bisulfite sequencing (RRBS). This RnBeads analysis of 216 samples shows that cells from different germ layers are clearly distinguished by their DNA methylation profiles, and it identifies characteristic loci that can be used for classifying samples according to their tissue type. Including parts or all of this dataset in custom RnBeads analyses provides a useful reference for quality control, analysis and interpretation of user-generated DNA methylation datasets.


View analysis reports

Resource 2: Genome-wide WGBS data for 41 tissues and cell lines

Ziller et al. compiled whole genome bisulfite sequencing (WGBS) data for 41 tissues and cell lines, comprising data from the Roadmap Epigenomics Project and other sources. The RnBeads analysis identified a clear grouping of samples according to tissue type. Furthermore, sample type is strongly correlated with the laboratory performing the experiment. The analysis also identified outliers in terms of experimental quality as well as long-term cell culturing. This analysis illustrates how this methylome resource can be used for identifying both technical and biological outliers in large-scale DNA methylation datasets.


View analysis reports

Resource 3: Infinium 450k data for 4034 cancer samples

In the context of The Cancer Genome Atlas (TCGA), researchers have generated a large collection of cancer methylomes using the Infinium 450k assay. While an integrative analysis of the full dataset has not been published yet, Weisenberger recently summarized and reviewed the activities that led to the creation of this dataset. This RnBeads analysis of 4034 blood, breast, intestinal and brain cancer samples provides an extensive and interactively browsable view of the dataset. It can be used as a starting point for more targeted investigation and hypothesis testing, and also as a reference for interpreting DNA methylation aberrations observed in other cancer cohorts.


View analysis reports

Resource 4: Analysis of BLUEPRINT methylomes (WGBS, 2015 release)

In the context of the BLUEPRINT project, whole genome bisulfite sequencing data have been generated for 81 blood-related samples. Among others, primary monocyte and neutrophil samples from healthy donors were profiled. The RnBeads analysis shows that these two closely related cell types have similar methylomes, but also significant DNA methylation differences that are enriched for important biological processes of the immune system.


View analysis reports

Resource 5: DNA methylation reprogramming in memory formation of human T cells (WGBS, NOMe-seq)

In the context of the DEEP project, whole genome bisulfite sequencing and NOMe-seq data have been generated for multiple stages of memory formation in human CD4+ T cells (Durek et al., 2016). The data were analyzed in RnBeads. The results show a progressive loss of DNA methylation during memory formation, particularly in putative regulatory regions of the genome. Two technologies (WGBS and NOMe-seq) were employed to assess CpG methylation levels. Overall, the agreement between the two methods is high.
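
One simple way to quantify agreement of this kind is to compare methylation levels at the CpGs covered by both assays. The Python sketch below does this with a Pearson correlation and a mean absolute difference. The values and CpG keys are made up, the helper functions are hypothetical, and this is not the comparison performed in the study or by RnBeads.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def mean_abs_diff(x, y):
    """Average absolute difference between paired values."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Made-up methylation levels (0 = unmethylated, 1 = fully methylated).
wgbs = {"chr1:100": 0.90, "chr1:250": 0.10, "chr1:400": 0.50,
        "chr2:120": 0.80, "chr2:300": 0.20, "chr3:50": 0.70}
nome = {"chr1:100": 0.85, "chr1:250": 0.15, "chr1:400": 0.55,
        "chr2:120": 0.75, "chr2:300": 0.25, "chr4:10": 0.40}

# Only CpGs measured by both assays can be compared.
shared = sorted(set(wgbs) & set(nome))
x = [wgbs[cpg] for cpg in shared]
y = [nome[cpg] for cpg in shared]

r = pearson(x, y)
mad = mean_abs_diff(x, y)
print(f"{len(shared)} shared CpGs, r = {r:.3f}, mean |diff| = {mad:.3f}")
```

Reporting both numbers is useful because a high correlation alone can hide a systematic offset between two assays.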


View analysis reports

Resource 6 [use case 1]: Infinium 450k data for 792 whole blood and sorted blood cells

This resource combines the cohorts of two studies: (1) 732 whole blood samples from an age study by Johansson et al., and (2) 60 samples of isolated cell types, peripheral blood and whole blood from a cell type composition study by Reinius et al. The IDAT files and sample annotations were obtained from the Gene Expression Omnibus, data series GSE87571 and GSE35069.


View analysis reports

Resource 7 [use case 2]: Analysis of BLUEPRINT methylomes (WGBS, 2016 release)

The final release of the BLUEPRINT project contained whole genome bisulfite sequencing profiles for healthy and malignant blood cell types. The RnBeads analysis of 195 samples identified cell-type-specific variability in methylation patterns and indicated epigenetic similarity between cell types that share a common differentiation history.


View analysis reports

Resource 8 [use case 3]: DNA methylation variability associated with Ewing Sarcoma (RRBS, 188 samples)

This analysis focuses on DNA methylation associated with Ewing Sarcoma, a bone cancer primarily affecting children and young adults. In addition to Ewing tissue samples, healthy Mesenchymal Stem Cells (MSCs), MSCs affected with Ewing Sarcoma and Ewing cell lines are part of the data set (Sheffield et al., 2017). Using RnBeads' differential methylation module, only a few differences between the groups could be detected. However, higher methylation variability was detected in normal MSCs than in MSCs affected with Ewing Sarcoma, contradicting the assumption that Ewing Sarcoma increases heterogeneity among individual samples.
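
To make the notion of methylation variability concrete, the Python sketch below computes the per-site variance across the samples of each group and compares the group averages. This is only one simple way to express variability; it is not the procedure used by RnBeads or by Sheffield et al., and the group names and values are made up.

```python
from statistics import mean, variance


def per_site_variance(samples):
    """Sample variance at each CpG site across a group of samples.

    `samples` is a list of per-sample methylation vectors that all list
    the same CpG sites in the same order.
    """
    return [variance(site_values) for site_values in zip(*samples)]


# Made-up methylation levels: three samples per group, three CpG sites.
group_a = [
    [0.2, 0.5, 0.8],
    [0.4, 0.5, 0.6],
    [0.6, 0.5, 0.4],
]
group_b = [
    [0.4, 0.5, 0.6],
    [0.4, 0.5, 0.6],
    [0.5, 0.5, 0.5],
]

var_a = per_site_variance(group_a)
var_b = per_site_variance(group_b)

print("mean per-site variance, group A:", round(mean(var_a), 4))
print("mean per-site variance, group B:", round(mean(var_b), 4))
```

In this toy input, group A varies more between samples than group B at the first and third sites, while the second site is constant in both groups.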


View analysis reports

Resource 9 [use case 4]: Analysis of a cross-platform data set (450k, EPIC, WGBS) from an EPIC evaluation study

Pidsley et al. performed a critical evaluation of the EPIC array using a selection of prostate cancer samples, healthy control tissues and prostate cell lines (GEO: GSE86831). They compared the results of profiling with the EPIC array to those of its predecessor, the Infinium 450k array, as well as to whole-genome methylomes of similar samples. Using the recently extended RnBSet merging functionality, we processed all these data jointly in one RnBeads analysis. The results showed that, although there are clear technical differences between the methylation values obtained with the three platforms, biological variability could still break through the platform-specific biases. Combined differential methylation analysis in mixed groups of samples measured with different platforms often led to meaningful functional annotation enrichments.


View analysis reports