This short vignette demonstrates a cross-platform analysis of DNA methylation data that involves methylation calls from both Infinium microarray and bisulfite sequencing experiments.
For this tutorial we used data of primary prostate cancer samples and related cell lines from the EPIC array evaluation study (Pidslet et al., Genome Biology, 2016) publicly available at Gene Expression Omnibus (GEO)
We have loaded the data sets and make them available as RnBSet objects on our website:
Let us say, we have downloaded the ZIP files with RnBeads objects into a directory:
DIR_DATASETS<-"~/cross_platform/datasets"
Let us now load each individual RnBSet object:
PLATFORMS<-c("450k", "EPIC", "WGBS")
rnb.sets<-list()
for(pl in PLATFORMS){
dfile<-sprintf("dataset-%s.zip", pl)
rnb.sets[[pl]]<-load.rnb.set(file.path(DIR_DATASETS, dfile))
}
Method combine()
has now been adapted to allow the merger of any two RnBeads data container objects. It has an important parameter type
that specifies the set operation for merging the methylation sites:
type="all"
will include all sites present in both objectstype="all.x"
will retain all sites present in the first objecttype="all.y"
will retain all sites present in the second objecttype="common"
will retain only sites found in both objectsLet us combine all data sets and perform a sequence of joint cross-platform analyses of all data. First combine the two array-base data sets:
#rnb.set.combined<-Reduce("combine", rnb.sets)
rnb.set.arrays<-combine(rnb.sets[["450k"]], rnb.sets[["EPIC"]], type="common")
Then combine the resulting object with that of the WGBS data set:
rnb.set.combined<-combine(rnb.set.arrays,rnb.sets[["WGBS"]], type="common")
Now we add a column with information about the platform that was used to profile each sample:
rnb.set.combined<-addPheno(
rnb.set.combined, unlist(sapply(names(rnb.sets), function(nn) rep(nn, length(samples(rnb.sets[[nn]]))))),
"Platform")
Next, adjust the column with sample identifiers and tell RnBeads to use it:
rnb.set.combined<-addPheno(rnb.set.combined, paste(pheno(rnb.set.combined)[["title"]], pheno(rnb.set.combined)[["Platform"]], sep=" "), "SampleID")
rnb.options(identifiers.column="SampleID")
Finally, save the resulting object for the later use:
save.rnb.set(rnb.set.combined, path=file.path(DIR_DATASETS,"dataset-all"))
One can run the complete exploratory analysis module on the combined data set:
rnb.run.exploratory(rnb.set.combined, dir.report="~/cross_platform/exploratory/")
Further, we can perform differential methylation analysis for several target characteristics. First adjust several options:
rnb.options(differential.enrichment.go=TRUE, differential.enrichment.lola=FALSE)
First, a simple analysis to compare samples of different platforms and types:
rnb.options(differential.comparison.columns=c("Platform","tissue"))
res<-rnb.run.differential(rnb.set.combined, dir.report="~/cross_platform/differential_prec")
Next, we create a new column of the sample annotation to define the biological replicates of the primary prostate cancer samples:
prec_comb<-rep(NA, nrow(pheno(rnb.set.combined)))
prec_comb[grepl("LNCaP", pheno(rnb.set.combined)[["tissue"]]) & as.logical(seq(32)%% 2) ] <- "Rep1"
prec_comb[grepl("LNCaP", pheno(rnb.set.combined)[["tissue"]]) & !as.logical(seq(32)%% 2) ] <- "Rep2"
rnb.set.combined<-addPheno(rnb.set.combined, prec_comb, "PrEC_comparison")
We use the obtained annotation column for another DMR search:
rnb.options(differential.comparison.columns=c("PrEC_comparison"))
res<-rnb.run.differential(rnb.set.combined, dir.report="~/cross_platform/differential_prec")
Similarly, we can compare LNCaP and PrEC:
lncap_prec_comb<-rep(NA, nrow(pheno(rnb.set.combined)))
lncap_prec_comb[grepl("LNCaP", pheno(rnb.set.combined)[["tissue"]]) ] <- "LNCaP"
lncap_prec_comb[grepl("PrEC", pheno(rnb.set.combined)[["tissue"]]) ] <- "PrEC"
rnb.set.combined<-addPheno(rnb.set.combined, lncap_prec_comb, "LNCaP_PrEC_comp")
rnb.options(differential.comparison.columns=c("LNCaP_PrEC_comp"))
res<-rnb.run.differential(rnb.set.combined, dir.report="~/cross_platform/differential_lncap_prec")
To directly demonstrate consistency between measurements done with different platforms, we extract the resutls of the differential analysis report,
df2p<-read.csv("~/cross_platform/differential_prec/differential_methylation_data/diffMethTable_site_cmp1.csv")
df2p$isDMP<-df2p$combinedRank < 896
and create a new density scatter plot to compare the two replicates:
pp <- create.densityScatter(df2p[,c("mean.Rep1","mean.Rep2")],
is.special=df2p$isDMP, dens.subsample=TRUE, sparse.points=0.01, add.text.cor=TRUE) +
labs(x=paste("mean.beta","PrEC, Rep1",sep=", "),y=paste("mean.beta","PrEC, Rep2",sep=", ")) +
coord_fixed()
figName <- "~/cross_platform/differential_prec/diffMeth_sites"
report.plot <- createReportGgPlot(pp,figName, NULL,create.pdf=TRUE,high.png=200)
report.plot <- off(report.plot,handle.errors=TRUE)
pdf(paste0(figName, ".pdf"))
print(pp + ggpubr:::theme_pubr())
dev.off()
RnBeads enables a combined analysis of DNA methylation data from different genome scale assays. Exploratory analysis using PCA showed that platform-specific differences were not large enough to mask biologically meaningful variation.