Introduction

This short vignette demonstrates a cross-platform analysis of DNA methylation data that involves methylation calls from both Infinium microarray and bisulfite sequencing experiments.

Example data set

For this tutorial we used data of primary prostate cancer samples and related cell lines from the EPIC array evaluation study (Pidslet et al., Genome Biology, 2016) publicly available at Gene Expression Omnibus (GEO)

We have loaded the data sets and make them available as RnBSet objects on our website:

Loading the individual data sets

Let us say, we have downloaded the ZIP files with RnBeads objects into a directory:

DIR_DATASETS<-"~/cross_platform/datasets"

Let us now load each individual RnBSet object:

PLATFORMS<-c("450k", "EPIC", "WGBS")

rnb.sets<-list()
for(pl in PLATFORMS){
    
    dfile<-sprintf("dataset-%s.zip", pl)
    
    rnb.sets[[pl]]<-load.rnb.set(file.path(DIR_DATASETS, dfile))
    
}

Combining the RnBSet objects

Method combine() has now been adapted to allow the merger of any two RnBeads data container objects. It has an important parameter type that specifies the set operation for merging the methylation sites:

Let us combine all data sets and perform a sequence of joint cross-platform analyses of all data. First combine the two array-base data sets:

#rnb.set.combined<-Reduce("combine", rnb.sets)
rnb.set.arrays<-combine(rnb.sets[["450k"]], rnb.sets[["EPIC"]], type="common")

Then combine the resulting object with that of the WGBS data set:

rnb.set.combined<-combine(rnb.set.arrays,rnb.sets[["WGBS"]], type="common")

Now we add a column with information about the platform that was used to profile each sample:

rnb.set.combined<-addPheno(
  rnb.set.combined, unlist(sapply(names(rnb.sets), function(nn) rep(nn, length(samples(rnb.sets[[nn]]))))), 
  "Platform")

Next, adjust the column with sample identifiers and tell RnBeads to use it:

rnb.set.combined<-addPheno(rnb.set.combined, paste(pheno(rnb.set.combined)[["title"]], pheno(rnb.set.combined)[["Platform"]], sep=" "), "SampleID")

rnb.options(identifiers.column="SampleID")

Finally, save the resulting object for the later use:

save.rnb.set(rnb.set.combined, path=file.path(DIR_DATASETS,"dataset-all"))

Joint analysis of the data

One can run the complete exploratory analysis module on the combined data set:

rnb.run.exploratory(rnb.set.combined, dir.report="~/cross_platform/exploratory/")

Further, we can perform differential methylation analysis for several target characteristics. First adjust several options:

rnb.options(differential.enrichment.go=TRUE, differential.enrichment.lola=FALSE)

First, a simple analysis to compare samples of different platforms and types:

rnb.options(differential.comparison.columns=c("Platform","tissue"))
res<-rnb.run.differential(rnb.set.combined, dir.report="~/cross_platform/differential_prec")

Next, we create a new column of the sample annotation to define the biological replicates of the primary prostate cancer samples:

prec_comb<-rep(NA, nrow(pheno(rnb.set.combined)))
prec_comb[grepl("LNCaP", pheno(rnb.set.combined)[["tissue"]]) & as.logical(seq(32)%% 2) ] <- "Rep1"
prec_comb[grepl("LNCaP", pheno(rnb.set.combined)[["tissue"]]) & !as.logical(seq(32)%% 2) ] <- "Rep2"
rnb.set.combined<-addPheno(rnb.set.combined, prec_comb, "PrEC_comparison")

We use the obtained annotation column for another DMR search:

rnb.options(differential.comparison.columns=c("PrEC_comparison"))
res<-rnb.run.differential(rnb.set.combined, dir.report="~/cross_platform/differential_prec")

Similarly, we can compare LNCaP and PrEC:

lncap_prec_comb<-rep(NA, nrow(pheno(rnb.set.combined)))
lncap_prec_comb[grepl("LNCaP", pheno(rnb.set.combined)[["tissue"]]) ] <- "LNCaP"
lncap_prec_comb[grepl("PrEC", pheno(rnb.set.combined)[["tissue"]]) ] <- "PrEC"

rnb.set.combined<-addPheno(rnb.set.combined, lncap_prec_comb, "LNCaP_PrEC_comp")
rnb.options(differential.comparison.columns=c("LNCaP_PrEC_comp"))

res<-rnb.run.differential(rnb.set.combined, dir.report="~/cross_platform/differential_lncap_prec")

To directly demonstrate consistency between measurements done with different platforms, we extract the resutls of the differential analysis report,

df2p<-read.csv("~/cross_platform/differential_prec/differential_methylation_data/diffMethTable_site_cmp1.csv")
df2p$isDMP<-df2p$combinedRank < 896

and create a new density scatter plot to compare the two replicates:

pp <- create.densityScatter(df2p[,c("mean.Rep1","mean.Rep2")],
           is.special=df2p$isDMP, dens.subsample=TRUE, sparse.points=0.01, add.text.cor=TRUE) +
labs(x=paste("mean.beta","PrEC, Rep1",sep=", "),y=paste("mean.beta","PrEC, Rep2",sep=", ")) + 
coord_fixed()

figName <- "~/cross_platform/differential_prec/diffMeth_sites"
report.plot <- createReportGgPlot(pp,figName, NULL,create.pdf=TRUE,high.png=200)
report.plot <- off(report.plot,handle.errors=TRUE)

pdf(paste0(figName, ".pdf"))
print(pp + ggpubr:::theme_pubr())
dev.off()

Conclusion

RnBeads enables a combined analysis of DNA methylation data from different genome scale assays. Exploratory analysis using PCA showed that platform-specific differences were not large enough to mask biologically meaningful variation.