This document explains the functionalities available in the daVis package.
daVis contains utility functions to visualize the output from
differential expression analysis. The input data can be a model, a
list of top tables, or a combination of these two. The model can be of class
MArrayLM (limma), DGELRT (edgeR), or DESeqResults (DESeq2).
Download the package from Bioconductor:
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("daVis")
library(daVis)
Each visualization function takes as input a model of class MArrayLM, DGELRT or DESeqResults, a list of top tables, or a combination of both.
tmpDir <- tempfile(); dir.create(tmpDir)
exampleData <- createExampleData(
path = tmpDir,
output = c("limma", "edgeR", "deseq2", "topTable")
)
## calcNormFactors has been renamed to normLibSizes
res.limma <- exampleData$limma
res.edger <- exampleData$edgeR
res.deseq <- exampleData$DESeq2
topTableList <- exampleData$topTable
The user should provide a named list of top tables, one per contrast/coefficient. Here, the list contains 4 top tables from 4 comparisons. Each top table must contain the following columns: 'logFC', 'P.Value', 'adj.P.Val'. Additionally, each table can contain columns with gene identifiers and the expression averaged across all samples ('AveExpr' or 'logCPM').
length(topTableList)
## [1] 4
The names of the list are linked to the comparisons:
names(topTableList)
## [1] "B.LvsP" "L.LvsP" "B.PvsV" "L.PvsV"
Below is a subset of an example top table:
| ENTREZID | SYMBOL | logFC | AveExpr | P.Value | adj.P.Val |
|---|---|---|---|---|---|
| 24117 | Wif1 | -1.82 | 2.975 | 1.279e-10 | 9.387e-07 |
| 381290 | Atp2b4 | 2.144 | 3.944 | 1.852e-10 | 9.387e-07 |
| 226101 | Myof | 2.33 | 6.223 | 2.827e-10 | 9.387e-07 |
| 78896 | Ecrg4 | -2.807 | 3.036 | 3.011e-10 | 9.387e-07 |
| 231830 | Micall2 | -2.253 | 4.761 | 3.527e-10 | 9.387e-07 |
| 16012 | Igfbp6 | 2.896 | 1.978 | 2.931e-10 | 9.387e-07 |
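The column requirements listed above can be checked before plotting. The following is a minimal sketch in base R; the helper `checkTopTables` and the toy data are hypothetical and are not part of daVis.

```r
# Hypothetical helper (not part of daVis): report, for each top table in a
# named list, which of the required columns are missing.
checkTopTables <- function(topTables,
                           required = c("logFC", "P.Value", "adj.P.Val")) {
  nms <- names(topTables)
  if (is.null(nms) || anyNA(nms) || any(nms == "")) {
    stop("The list of top tables must be fully named.")
  }
  # For each table, collect the required columns that are absent
  missingCols <- lapply(topTables, function(tt) setdiff(required, colnames(tt)))
  # Keep only the tables with at least one missing column
  missingCols[lengths(missingCols) > 0]
}

# Toy data invented for illustration only
toyTables <- list(
  contrastA = data.frame(
    logFC = c(1.2, -0.8), P.Value = c(0.001, 0.2), adj.P.Val = c(0.01, 0.4)
  ),
  contrastB = data.frame(
    logFC = c(0.3, 2.1), P.Value = c(0.5, 0.0001)
  )
)
problems <- checkTopTables(toyTables)
print(problems)
```

An empty list returned by the helper means every table carries the three required columns.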
The object res.limma is of class:
class(res.limma)
## [1] "MArrayLM"
## attr(,"package")
## [1] "limma"
The object res.edger is of class:
class(res.edger)
## [1] "DGELRT"
## attr(,"package")
## [1] "edgeR"
The object res.deseq is of class:
class(res.deseq)
## [1] "DESeqResults"
## attr(,"package")
## [1] "DESeq2"
A volcano plot enables a quick visual identification of the size and significance of the feature expression effects (top left/right). The significance of the effect is represented by the raw p-value on the y axis, so highly significant features are at the top of the plot. The size of the effect is represented by the log of the fold change (negative/positive for down/up-regulation), so features with high effects are at the right/left side of the plot. Below, the color scale is used for adjusted p-values (corrected for multiple testing across genes).
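To make the axes concrete, the sketch below computes volcano coordinates for a toy top table in base R. It assumes the y axis uses a -log10 transformation of the raw p-value, which is the usual convention but is not stated explicitly here; the data are invented for illustration.

```r
# Toy top table invented for illustration; column names follow the text
toyTable <- data.frame(
  SYMBOL = c("g1", "g2", "g3", "g4"),
  logFC = c(-2.5, 0.1, 1.8, 0.4),
  P.Value = c(1e-8, 0.6, 1e-5, 0.04),
  adj.P.Val = c(4e-8, 0.6, 2e-5, 0.053),
  stringsAsFactors = FALSE
)

# Assumption: significance is displayed as -log10(raw p-value), so that
# smaller p-values end up higher in the plot
toyTable$negLog10P <- -log10(toyTable$P.Value)

# Features in the top-left corner: strongly down-regulated and significant
topLeft <- toyTable$SYMBOL[toyTable$logFC < 0 & toyTable$negLog10P > 5]
# Features in the top-right corner: strongly up-regulated and significant
topRight <- toyTable$SYMBOL[toyTable$logFC > 0 & toyTable$negLog10P > 4]

print(toyTable[order(-toyTable$negLog10P), ])
```

The cut-offs used to define the corners (5 and 4) are arbitrary values chosen for this toy example.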
The function documentation, which describes each parameter, can be displayed with:
help("daVolcanoPlot", "daVis")
The function daVolcanoPlot() creates a volcano plot for the provided
model or list of top tables, or their combination.
The feature identifier can be specified by
featuresIdVar parameter. If empty (by default), row names of the input
are used as feature identifiers.
The colorVar, shapeVar, alphaVar and/or sizeVar
parameters can be used to customize the plot. Here, the colorVar
parameter is used to color the points by adjusted p-value.
The topGenes parameter represents the number of top genes, ranked by logFC
or significance, to highlight in the plot for each considered coefficient
(0 by default). These features are labeled using the topGenesVar parameter.
If empty, row names of the input are used.
Additionally, a set of genes of interest can be highlighted in red.
These features are specified using the genesToHighlight parameter, and
genesToHighlightVar indicates the identifier used to label
them. The values of genesToHighlight should match the row names of the
input data.
coefs <- c("B.LvsP", "L.LvsP")
genesOfInterest <- c("497097", "20671", "239273", "14862", "27395", "76408")
daVolcanoPlot(
input = res.limma,
coef = coefs,
coefLabel = c("A", "B"),
topGenes = 5,
topGenesVar = "SYMBOL",
genesToHighlight = genesOfInterest,
genesToHighlightVar = "SYMBOL",
colorVar = "adj.P.Val",
facetNCol = 2
)
## Loading required namespace: ggrepel
coefs <- c("B.LvsP", "L.LvsP")
daVolcanoPlot(
input = res.limma,
coef = coefs,
coefLabel = c("A", "B"),
facetNCol = 2,
additionalThresholdsAdjPValue = 0.1,
colorVar = "adj.P.Val"
)
An interactive volcano plot is created by setting the typePlot
parameter to "interactive".
coefs <- c("B.LvsP", "L.LvsP")
daVolcanoPlot(
input = res.limma,
coef = coefs,
coefLabel = c("A", "B"),
typePlot = "interactive"
)
A log-ratio plot represents the differential effect (e.g., treatment versus control) for several conditions (e.g., compounds or concentrations) of the experiment (logFC scale). This makes it possible to visualize a larger subset of genes. The significance of genes can be represented via colored rows, e.g., red denotes significant genes, while grey indicates non-significant genes.
The function documentation, which describes each parameter, can be displayed with:
help("daLogRatioPlot", "daVis")
The function daLogRatioPlot() creates a log-ratio plot for the provided
model, top table, or list of those.
The features should be specified using the features
parameter, and the feature identifier can be specified using the
featuresIdVar parameter. If the features parameter is not provided, the
plot will display the top 20 features. The feature labels can be colored
using the featuresColor parameter.
Here, the features are colored based on the significance for the
coefficient. Additionally, the features can be labeled by providing the
featuresVar parameter indicating the column names in the top table.
coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")
features <- c(
"72243", "66704", "11781", "226101", "14620", "381290", "16012",
"100504225", "231830", "78896", "103889", "231991", "77034", "19687",
"100043805", "16669", "226162", "24117", "55987", "11419"
)
dt <- topTableList[['B.LvsP']]
adjPVal <- dt[match(features, dt$ENTREZID), "adj.P.Val"]
signFeature <- ifelse(adjPVal <= 0.05, "red", "grey")
daLogRatioPlot(
input = topTableList,
features = features,
featuresIdVar = "ENTREZID",
featuresVar = c("SYMBOL", "GENENAME"),
featuresMaxNChar = 35,
coef = coefs,
coefLabel = c("A", "B", "C", "D"),
facetNCol = 4,
featuresColor = signFeature,
errorBars = FALSE
)
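The color vector above is built with match(), which returns NA for any feature that is absent from the top table, so ifelse() would then yield NA instead of a color. The sketch below shows one way to guard against this, under the assumption that missing features should be drawn as non-significant; the toy data are invented for illustration.

```r
# Toy top table invented for illustration
toyTable <- data.frame(
  ENTREZID = c("101", "102", "103"),
  adj.P.Val = c(0.01, 0.20, 0.04),
  stringsAsFactors = FALSE
)
# "999" is deliberately absent from the toy table
selected <- c("103", "999", "102")

# match() keeps the order of 'selected' and returns NA for unknown features
adjP <- toyTable$adj.P.Val[match(selected, toyTable$ENTREZID)]

# Assumption: features missing from the table are treated as non-significant
labelColor <- ifelse(!is.na(adjP) & adjP <= 0.05, "red", "grey")
names(labelColor) <- selected
print(labelColor)
```

The resulting vector has one color per requested feature, in the same order as the features themselves.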
The coefficients can be grouped by specifying multiple sets of labels to the
coefLabel parameter.
The colors of the bars can be changed via the color parameter.
coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")
coefsLabel <- list(
sub("(.+)\\.(.+)", "\\2", coefs),
sub("(.+)\\.(.+)", "\\1", coefs)
)
colorPalette <- c(
`B.PvsV` = "darkgreen", `L.PvsV` = "lightgreen",
`B.LvsP` = "darkblue", `L.LvsP` = "lightblue"
)
daLogRatioPlot(
input = topTableList,
featuresIdVar = "ENTREZID",
coef = coefs,
coefLabel = coefsLabel,
facetNCol = 4,
errorBars = FALSE,
color = colorPalette
)
# Note: using coef labels as names of the color palette also works
# (if coefLabel is NOT a list) - for back-compatibility
# colorPalette <- c(
# C = "darkgreen", D = "lightgreen",
# A = "darkblue", B = "lightblue"
# )
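The two sub() calls used to build the grouped labels split each coefficient name at the dot. The short, self-contained sketch below shows what each call returns; the variable names `suffix` and `prefix` are chosen here for illustration.

```r
coefNames <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")

# "\\2" keeps the part after the dot (the comparison)
suffix <- sub("(.+)\\.(.+)", "\\2", coefNames)
# "\\1" keeps the part before the dot
prefix <- sub("(.+)\\.(.+)", "\\1", coefNames)

print(suffix)  # "LvsP" "LvsP" "PvsV" "PvsV"
print(prefix)  # "B" "L" "B" "L"

# The list of label sets, in the same shape as coefsLabel above
labelSets <- list(suffix, prefix)
```

Each element of the list has one label per coefficient, so coefficients sharing a label in the first set fall into the same group.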
The log ratio plot can be created for a (mixed) list of top table(s) and model(s).
coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV", "A")
daLogRatioPlot(
input = list(res.limma, A = topTableList[["B.LvsP"]]),
featuresIdVar = "ENTREZID",
coef = coefs,
facetNCol = 5,
errorBars = TRUE
)
The features can be sorted with the featuresOrder parameter.
coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")
daLogRatioPlot(
input = res.limma,
featuresIdVar = "ENTREZID",
features = features, featuresOrder = "significance",
coef = coefs,
facetNCol = 4,
errorBars = TRUE
)
Text can be displayed in the log ratio plot via the text parameter.
This can be a column of the top table or a function that formats such column(s).
If the text doesn’t fit within the axes limits, the x-axis can be expanded
via the xexpand parameter.
coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")
# Note: the format.pval function can be used to format the p-value as a text
getSignifStar <- function(topTable){
with(topTable, as.character(
stats::symnum(
x = adj.P.Val,
cutpoints = c(0, .001, .01, .05, .1, 1),
symbols = c("***","**","*","."," "),
corr = FALSE
)
))
}
daLogRatioPlot(
input = res.limma,
coef = coefs,
coefLabel = c("A", "B", "C", "D"),
color = colorPalette,
facetNCol = 4,
text = getSignifStar
)
daLogRatioPlot(
input = res.limma,
coef = coefs,
coefLabel = c("A", "B", "C", "D"),
color = colorPalette,
facetNCol = 4,
text = function(topTable)
with(topTable, round(logFC, digits = 1)),
textCex = 3
)
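To see what a formatting function such as getSignifStar returns, the same stats::symnum() call can be run on a few adjusted p-values. This is a standalone sketch with invented values; it only illustrates the mapping from p-value to symbol.

```r
# Adjusted p-values invented for illustration, one per significance class
toyTable <- data.frame(adj.P.Val = c(0.0005, 0.005, 0.03, 0.07, 0.5))

# Same cut points and symbols as in getSignifStar above
stars <- with(toyTable, as.character(
  stats::symnum(
    x = adj.P.Val,
    cutpoints = c(0, .001, .01, .05, .1, 1),
    symbols = c("***", "**", "*", ".", " "),
    corr = FALSE
  )
))
print(stars)  # "***" "**" "*" "." " "
```

The function returns one character string per row of the top table, which is the shape expected of a text-formatting function here.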
A heatmap represents the differential effect (e.g., treatment versus control) for several conditions (e.g., compounds or concentrations) of the experiment. It is a graphical representation of the individual values contained in a matrix as colors. This makes it possible to visualize a larger subset of genes. The gene labels can be colored to indicate, for example, the significance of genes, e.g., black denotes significant genes, while grey represents non-significant genes.
The function documentation, which describes each parameter, can be displayed with:
help("daHeatmapLogFC", "daVis")
The function daHeatmapLogFC() creates a heatmap for the provided model or
list of top tables. The features should be specified using the features
parameter, and the feature identifier can be specified using featuresIdVar.
The features are labeled by row names of input (or featuresIdVar)
by default. Different labels are possible by using the featuresVar parameter.
coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")
daHeatmapLogFC(
input = topTableList,
features = features,
featuresIdVar = "ENTREZID",
featuresVar = c("SYMBOL", "GENENAME"),
featuresMaxNChar = 35,
coef = coefs,
coefLabel = c("A", "B", "C", "D"),
featuresColor = signFeature
)
The coefficients can be grouped by specifying multiple sets of labels to the
coefLabel parameter.
coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")
coefsLabel <- list(
sub("(.+)\\.(.+)", "\\2", coefs),
sub("(.+)\\.(.+)", "\\1", coefs)
)
daHeatmapLogFC(
input = res.limma,
features = features,
featuresIdVar = "ENTREZID",
featuresVar = c("SYMBOL", "GENENAME"),
featuresMaxNChar = 35,
coef = coefs,
coefLabel = coefsLabel
)
An upset plot represents the overlap (intersection) or difference between the sets of significant genes, with down- and up-regulated genes considered separately, across different differential effects. The shades of blue indicate the number of differential effects sharing these up- or down-regulated genes.
The function documentation, which describes each parameter, can be displayed with:
help("daUpset", "daVis")
The function daUpset() creates an upset plot for the provided model or
list of top tables. The sets are created based on the row names of input or
feature identifiers specified using the featuresIdVar parameter.
daUpset(
input = topTableList,
coef = coefs,
featuresIdVar = "SYMBOL",
fdr = 0.05,
dir = "down",
ylab = paste("Number of (shared) significantly \n",
"down-regulated genes"),
xlab = paste("Number of significantly \n",
"down-regulated genes")
)
If returnAnalysis is set to TRUE, the output of the daUpset()
function is a list. The slot sets contains
all the overlapping sets between specified coefficients. The sets contain
identifiers based on featuresIdVar. The slot plot contains the plot object.
out <- daUpset(
input = topTableList,
coef = coefs,
featuresIdVar = "SYMBOL",
fdr = 0.05,
dir = "down",
ylab = paste("Number of (shared) significantly \n",
"down-regulated genes"),
xlab = paste("Number of significantly \n",
"down-regulated genes"),
returnAnalysis = TRUE
)
out$sets
out$plot
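The kind of sets shown in an upset plot can be sketched in base R. The example below assumes that a gene counts as significantly down-regulated when its adjusted p-value is at most the fdr threshold and its logFC is negative; this is an illustrative assumption, not a description of the internals of daUpset(). The helper `getDownSet` and the toy data are hypothetical.

```r
# Toy top tables invented for illustration
toyTables <- list(
  A = data.frame(
    SYMBOL = c("g1", "g2", "g3", "g4"),
    logFC = c(-2, -1.5, 1, -0.5),
    adj.P.Val = c(0.01, 0.03, 0.01, 0.5),
    stringsAsFactors = FALSE
  ),
  B = data.frame(
    SYMBOL = c("g1", "g2", "g3", "g4"),
    logFC = c(-1.8, 0.9, -1.2, -2),
    adj.P.Val = c(0.02, 0.01, 0.04, 0.2),
    stringsAsFactors = FALSE
  )
)

# Hypothetical helper: identifiers of significantly down-regulated genes
getDownSet <- function(tt, fdr = 0.05) {
  tt$SYMBOL[tt$adj.P.Val <= fdr & tt$logFC < 0]
}
downSets <- lapply(toyTables, getDownSet)

# Genes shared by all comparisons, and genes specific to each one
shared <- Reduce(intersect, downSets)
onlyA <- setdiff(downSets$A, downSets$B)
onlyB <- setdiff(downSets$B, downSets$A)
print(list(shared = shared, onlyA = onlyA, onlyB = onlyB))
```

With more than two comparisons, the same idea extends to every combination of sets, which is what the bars of an upset plot summarize.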
A scatter plot visualizes the comparison of the logFC for different differential effects.
The function documentation, which describes each parameter, can be displayed with:
help("daScatterPlot", "daVis")
The function daScatterPlot() creates a scatter plot for the provided
model or list of top tables.
The features of interest can be specified using the genesToHighlight and
genesToHighlightVar parameters. The genes with the highest logFC and
significance can also be colored by setting topGenes. These can be labeled
using topGenesVar.
coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")
genesOfInterest <- c("497097", "20671", "239273", "14862", "27395", "76408")
daScatterPlot(
input = topTableList,
coef = coefs,
coefLabel = c("A", "B", "C", "D"),
genesToHighlight = genesOfInterest,
genesToHighlightVar = "SYMBOL",
topGenes = 5,
topGenesVar = "SYMBOL",
facetNCol = 3
)
An interactive scatter plot is created by setting the typePlot
parameter to "interactive". Additional feature annotation can be shown when
hovering over the points. The featuresVar parameter specifies the column names
that should be used to annotate the features.
coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")
daScatterPlot(
input = topTableList,
coef = coefs,
coefLabel = c("A", "B", "C", "D"),
featuresIdVar = "ENTREZID",
featuresVar = c("SYMBOL", "GENENAME"),
facetNCol = 3,
typePlot = "interactive"
)
The Pearson correlation value can be shown in the plot by setting the
correlation parameter to TRUE.
coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")
daScatterPlot(
input = topTableList,
coef = coefs[c(1, 2)],
coefLabel = c("A", "B"),
featuresIdVar = "ENTREZID",
correlation = TRUE
)
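The Pearson correlation between the logFC values of two comparisons can also be computed by hand, for instance as a sanity check. This is a minimal base R sketch with invented data; it matches the features by identifier before computing the correlation, and does not claim to reproduce how daScatterPlot() derives the displayed value.

```r
# Toy top tables invented for illustration; rows are in different orders
tableA <- data.frame(
  ENTREZID = c("1", "2", "3", "4"),
  logFC = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)
tableB <- data.frame(
  ENTREZID = c("4", "3", "2", "1"),
  logFC = c(8.5, 6, 4.5, 2),
  stringsAsFactors = FALSE
)

# Align the two tables on the feature identifier
merged <- merge(tableA, tableB, by = "ENTREZID", suffixes = c(".A", ".B"))

# Pearson correlation of the logFC values across the shared features
r <- cor(merged$logFC.A, merged$logFC.B, method = "pearson")
print(round(r, 3))
```

Merging on the identifier matters because the rows of two top tables are typically sorted by significance and therefore not in the same order.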
A waterfall plot visualizes the logFC for different differential effects colored by adjusted p-value.
The function documentation, which describes each parameter, can be displayed with:
help("daWaterfallPlot", "daVis")
The function daWaterfallPlot() creates a bar plot for the provided
model or list of top tables.
When more than one coefficient is specified, multiple plots are created
side by side.
coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")
features <- c(
"24117", "381290", "226101", "78896", "231830", "16012", "16669", "55987",
"231991", "14620", "20317", "74747", "11636", "20482", "194126", "270150",
"17131", "16878", "20564", "73847"
)
daWaterfallPlot(
input = res.limma,
coef = coefs,
coefLabel = c("A", "B", "C", "D"),
featuresIdVar = "ENTREZID",
featuresVar = "SYMBOL",
features = features,
facetNCol = 4,
colorVar = "adj.P.Val",
color = c("maroon", "orange"),
fillVar = "adj.P.Val",
fill = c("maroon", "orange"),
typePlot = "static"
)
An MA plot visualizes the logFC versus log2 mean expression.
The function documentation, which describes each parameter, can be displayed with:
help("daMAplot", "daVis")
The function daMAplot() creates a scatter plot (logFC versus mean
expression) for the provided model or list of top tables.
The top genes with highest absolute logFC and significance can be labeled.
The number of top genes can be specified using the topGenes parameter.
Top genes are labeled by feature names by default. This can be changed with
topGenesVar.
The genes of interest are indicated in green. Additionally, if coef
indicates only one coefficient, the legend shows the number of significantly
up- or down-regulated genes, or the number of non-significant genes. The points
can be colored by direction (significant up- and down-regulated genes) if
direction is set to TRUE. Customized colors can be used with color.
coefs <- c("B.LvsP", "L.LvsP")
daMAplot(
input = res.limma,
coef = coefs[1],
coefLabel = "A",
featuresIdVar = "ENTREZID",
topGenes = 5,
topGenesVar = "SYMBOL",
genesToHighlight = genesOfInterest,
genesToHighlightVar = "SYMBOL",
direction = TRUE,
color = c("steelblue", "firebrick", "grey")
)
A barplot visualizes the number of significant genes per comparison.
The function documentation, which describes each parameter, can be displayed with:
help("daSignificantGenesBarplot", "daVis")
The function daSignificantGenesBarplot() creates a barplot indicating the
number of significant (up- and down-regulated) genes for the provided model or
list of top tables. The coefficients can be grouped by specifying multiple sets
of labels to the coefLabel parameter.
coefs <- c("B.LvsP", "L.LvsP", "B.PvsV", "L.PvsV")
coefsLabel <- list(
sub("(.+)\\.(.+)", "\\2", coefs),
sub("(.+)\\.(.+)", "\\1", coefs)
)
daSignificantGenesBarplot(
input = res.limma,
coef = coefs,
coefLabel = coefsLabel,
addPercentage = TRUE
)
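The counts summarized by such a bar plot can be approximated in base R. The sketch below assumes a gene is significant when its adjusted p-value is at most 0.05, and that percentages are taken relative to the total number of genes in each top table; both are illustrative assumptions rather than a description of daSignificantGenesBarplot(). The helper `countSignificant` and the toy data are hypothetical.

```r
# Toy top tables invented for illustration
toyTables <- list(
  A = data.frame(
    logFC = c(-2, 1.5, 0.2, 3),
    adj.P.Val = c(0.01, 0.03, 0.6, 0.2)
  ),
  B = data.frame(
    logFC = c(1, 2, -1, -3),
    adj.P.Val = c(0.001, 0.04, 0.02, 0.9)
  )
)

# Hypothetical helper: number of significant up- and down-regulated genes
countSignificant <- function(tt, fdr = 0.05) {
  isSignif <- tt$adj.P.Val <= fdr
  c(
    up = sum(isSignif & tt$logFC > 0),
    down = sum(isSignif & tt$logFC < 0),
    total = nrow(tt)
  )
}

# One row per comparison, one column per count
counts <- t(sapply(toyTables, countSignificant))
# Assumption: percentages are relative to all genes in the table
percentages <- round(100 * counts[, c("up", "down")] / counts[, "total"], 1)
print(counts)
print(percentages)
```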
sessionInfo()
## R version 4.6.0 RC (2026-04-17 r89917)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.4 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.24-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] pander_0.6.6 daVis_0.99.4 BiocStyle_2.41.0
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.2.1 dplyr_1.2.1
## [3] farver_2.1.2 blob_1.3.0
## [5] Biostrings_2.81.3 S7_0.2.2
## [7] fastmap_1.2.0 digest_0.6.39
## [9] lifecycle_1.0.5 statmod_1.5.2
## [11] KEGGREST_1.53.0 RSQLite_3.53.1
## [13] magrittr_2.0.5 compiler_4.6.0
## [15] rlang_1.2.0 sass_0.4.10
## [17] tools_4.6.0 yaml_2.3.12
## [19] knitr_1.51 labeling_0.4.3
## [21] S4Arrays_1.13.0 bit_4.6.0
## [23] DelayedArray_0.39.3 xml2_1.5.2
## [25] plyr_1.8.9 RColorBrewer_1.1-3
## [27] abind_1.4-8 BiocParallel_1.47.0
## [29] withr_3.0.2 BiocGenerics_0.59.7
## [31] grid_4.6.0 ggh4x_0.3.1
## [33] stats4_4.6.0 edgeR_4.11.1
## [35] ggplot2_4.0.3 scales_1.4.0
## [37] tinytex_0.59 dichromat_2.0-0.1
## [39] SummarizedExperiment_1.43.0 cli_3.6.6
## [41] UpSetR_1.4.1 rmarkdown_2.31
## [43] crayon_1.5.3 generics_0.1.4
## [45] otel_0.2.0 httr_1.4.8
## [47] commonmark_2.0.0 DBI_1.3.0
## [49] cachem_1.1.0 legendry_0.3.0
## [51] stringr_1.6.0 parallel_4.6.0
## [53] AnnotationDbi_1.75.0 BiocManager_1.30.27
## [55] XVector_0.53.0 matrixStats_1.5.0
## [57] vctrs_0.7.3 Matrix_1.7-5
## [59] jsonlite_2.0.0 litedown_0.9
## [61] bookdown_0.46 IRanges_2.47.2
## [63] S4Vectors_0.51.3 bit64_4.8.2
## [65] ggrepel_0.9.8 magick_2.9.1
## [67] locfit_1.5-9.12 limma_3.69.2
## [69] jquerylib_0.1.4 glue_1.8.1
## [71] org.Mm.eg.db_3.23.0 codetools_0.2-20
## [73] ggtext_0.1.2 stringi_1.8.7
## [75] gtable_0.3.6 GenomicRanges_1.65.0
## [77] tibble_3.3.1 pillar_1.11.1
## [79] htmltools_0.5.9 Seqinfo_1.3.0
## [81] R6_2.6.1 evaluate_1.0.5
## [83] lattice_0.22-9 Biobase_2.73.1
## [85] markdown_2.0 png_0.1-9
## [87] gridtext_0.1.6 memoise_2.0.1
## [89] bslib_0.11.0 Rcpp_1.1.1-1.1
## [91] gridExtra_2.3 SparseArray_1.13.2
## [93] DESeq2_1.53.0 xfun_0.58
## [95] MatrixGenerics_1.25.0 pkgconfig_2.0.3