TreeSummarizedExperiment 2.17.0
TreeSummarizedExperiment objects
Multiple TreeSummarizedExperiment (TSE) objects can be combined using
rbind or cbind. Here, we create a toy TreeSummarizedExperiment object
using makeTSE() (see ?makeTSE). Because the tree in the row/column tree slot is
generated randomly by ape::rtree(), set.seed() is used to make the results
reproducible.
library(TreeSummarizedExperiment)
set.seed(1)
# TSE: without the column tree
(tse_a <- makeTSE(include.colTree = FALSE))
## class: TreeSummarizedExperiment
## dim: 10 4
## metadata(0):
## assays(1): ''
## rownames(10): entity1 entity2 ... entity9 entity10
## rowData names(2): var1 var2
## colnames(4): sample1 sample2 sample3 sample4
## colData names(2): ID group
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## rowLinks: a LinkDataFrame (10 rows)
## rowTree: 1 phylo tree(s) (10 leaves)
## colLinks: NULL
## colTree: NULL
# combine two TSEs by row
(tse_aa <- rbind(tse_a, tse_a))
## class: TreeSummarizedExperiment
## dim: 20 4
## metadata(0):
## assays(1): ''
## rownames(20): entity1 entity2 ... entity9 entity10
## rowData names(2): var1 var2
## colnames(4): sample1 sample2 sample3 sample4
## colData names(2): ID group
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## rowLinks: a LinkDataFrame (20 rows)
## rowTree: 1 phylo tree(s) (10 leaves)
## colLinks: NULL
## colTree: NULL
The resulting tse_aa has 20 rows, twice as many as tse_a. The row tree in tse_aa is the same as that in tse_a.
identical(rowTree(tse_aa), rowTree(tse_a))
## [1] TRUE
If we rbind two TSEs (e.g., tse_a and tse_b) that have different row trees, the obtained TSE (e.g., tse_ab) will have two row trees.
set.seed(2)
tse_b <- makeTSE(include.colTree = FALSE)
# different row trees
identical(rowTree(tse_a), rowTree(tse_b))
## [1] FALSE
# 2 phylo tree(s) in rowTree
(tse_ab <- rbind(tse_a, tse_b))
## class: TreeSummarizedExperiment
## dim: 20 4
## metadata(0):
## assays(1): ''
## rownames(20): entity1 entity2 ... entity9 entity10
## rowData names(2): var1 var2
## colnames(4): sample1 sample2 sample3 sample4
## colData names(2): ID group
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## rowLinks: a LinkDataFrame (20 rows)
## rowTree: 2 phylo tree(s) (20 leaves)
## colLinks: NULL
## colTree: NULL
In the row link data, the whichTree column indicates which tree each row is mapped to.
For tse_aa, there is only one tree, named phylo. For tse_ab, there are two trees (phylo and phylo.1).
rowLinks(tse_aa)
## LinkDataFrame with 20 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity1 entity1 alias_1 1 TRUE phylo
## entity2 entity2 alias_2 2 TRUE phylo
## entity3 entity3 alias_3 3 TRUE phylo
## entity4 entity4 alias_4 4 TRUE phylo
## entity5 entity5 alias_5 5 TRUE phylo
## ... ... ... ... ... ...
## entity6 entity6 alias_6 6 TRUE phylo
## entity7 entity7 alias_7 7 TRUE phylo
## entity8 entity8 alias_8 8 TRUE phylo
## entity9 entity9 alias_9 9 TRUE phylo
## entity10 entity10 alias_10 10 TRUE phylo
rowLinks(tse_ab)
## LinkDataFrame with 20 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity1 entity1 alias_1 1 TRUE phylo
## entity2 entity2 alias_2 2 TRUE phylo
## entity3 entity3 alias_3 3 TRUE phylo
## entity4 entity4 alias_4 4 TRUE phylo
## entity5 entity5 alias_5 5 TRUE phylo
## ... ... ... ... ... ...
## entity6 entity6 alias_6 6 TRUE phylo.1
## entity7 entity7 alias_7 7 TRUE phylo.1
## entity8 entity8 alias_8 8 TRUE phylo.1
## entity9 entity9 alias_9 9 TRUE phylo.1
## entity10 entity10 alias_10 10 TRUE phylo.1
The names of the trees can be accessed using rowTreeNames. If the input TSEs use the same name for their trees, rbind automatically creates valid and unique tree names using make.names. tse_a and tse_b both use phylo as the name of their row trees. In tse_ab, the row tree that originates from tse_b is named phylo.1 instead.
rowTreeNames(tse_aa)
## [1] "phylo"
rowTreeNames(tse_ab)
## [1] "phylo" "phylo.1"
# The original tree names in the input TSEs
rowTreeNames(tse_a)
## [1] "phylo"
rowTreeNames(tse_b)
## [1] "phylo"
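The renaming rule can be seen in isolation with base R. The sketch below only calls make.names on a duplicated name vector. It assumes, as the text above states, that rbind relies on make.names; the unique = TRUE argument and the variable names are assumptions made for illustration.

```r
# Two input trees that share the same name
tree_names <- c("phylo", "phylo")

# make.names() with unique = TRUE appends a numeric suffix to duplicates
unique_names <- make.names(tree_names, unique = TRUE)
print(unique_names)
```

The first occurrence keeps its name and only the later duplicate is suffixed, which matches the phylo and phylo.1 names seen in tse_ab.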
Once the tree names are changed, the whichTree column in rowLinks() is updated accordingly.
rowTreeNames(tse_ab) <- paste0("tree", 1:2)
rowLinks(tse_ab)
## LinkDataFrame with 20 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity1 entity1 alias_1 1 TRUE tree1
## entity2 entity2 alias_2 2 TRUE tree1
## entity3 entity3 alias_3 3 TRUE tree1
## entity4 entity4 alias_4 4 TRUE tree1
## entity5 entity5 alias_5 5 TRUE tree1
## ... ... ... ... ... ...
## entity6 entity6 alias_6 6 TRUE tree2
## entity7 entity7 alias_7 7 TRUE tree2
## entity8 entity8 alias_8 8 TRUE tree2
## entity9 entity9 alias_9 9 TRUE tree2
## entity10 entity10 alias_10 10 TRUE tree2
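As a quick check on the renamed object, the rows mapped to each tree can be counted from the whichTree column. This is a minimal sketch that rebuilds tse_ab from scratch so that it runs on its own; rows_per_tree is an illustrative name, not part of the package.

```r
library(TreeSummarizedExperiment)

# Rebuild the two toy objects with different row trees
set.seed(1)
tse_a <- makeTSE(include.colTree = FALSE)
set.seed(2)
tse_b <- makeTSE(include.colTree = FALSE)

# Combine by row and rename the two row trees
tse_ab <- rbind(tse_a, tse_b)
rowTreeNames(tse_ab) <- paste0("tree", 1:2)

# Count how many rows are linked to each tree
rows_per_tree <- table(rowLinks(tse_ab)$whichTree)
print(rows_per_tree)
```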
To run cbind, the TSEs must agree in the row dimension. If the TSEs differ only in the row tree, cbind still succeeds, but the row tree and the row link data are dropped with a warning.
cbind(tse_a, tse_a)
## class: TreeSummarizedExperiment
## dim: 10 8
## metadata(0):
## assays(1): ''
## rownames(10): entity1 entity2 ... entity9 entity10
## rowData names(2): var1 var2
## colnames(8): sample1 sample2 ... sample3 sample4
## colData names(2): ID group
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## rowLinks: a LinkDataFrame (10 rows)
## rowTree: 1 phylo tree(s) (10 leaves)
## colLinks: NULL
## colTree: NULL
cbind(tse_a, tse_b)
## Warning in cbind(...): rowTree & rowLinks differ in the provided TSEs.
## rowTree & rowLinks are dropped after 'cbind'
## class: TreeSummarizedExperiment
## dim: 10 8
## metadata(0):
## assays(1): ''
## rownames(10): entity1 entity2 ... entity9 entity10
## rowData names(2): var1 var2
## colnames(8): sample1 sample2 ... sample3 sample4
## colData names(2): ID group
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## rowLinks: NULL
## rowTree: NULL
## colLinks: NULL
## colTree: NULL
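Because differing row trees are dropped with only a warning, it may be worth checking for agreement before calling cbind. The following is a minimal sketch that reuses the identical() comparison shown earlier; same_row_tree and combined are illustrative names.

```r
library(TreeSummarizedExperiment)

set.seed(1)
tse_a <- makeTSE(include.colTree = FALSE)
set.seed(2)
tse_b <- makeTSE(include.colTree = FALSE)

# Compare the row trees up front instead of relying on the warning
same_row_tree <- identical(rowTree(tse_a), rowTree(tse_b))
if (!same_row_tree) {
    message("Row trees differ: cbind() will drop rowTree and rowLinks.")
}

# Combine by column; the warning is silenced because it was handled above
combined <- suppressWarnings(cbind(tse_a, tse_b))
dim(combined)
```

If the tree information must be kept, one option is to stop at the check rather than proceed with the combination.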
We obtain a subset of tse_ab by extracting rows 11:15. These rows are all mapped to the same tree, tree2 (formerly phylo.1), so the rowTree slot of sse has only one tree.
(sse <- tse_ab[11:15, ])
## class: TreeSummarizedExperiment
## dim: 5 4
## metadata(0):
## assays(1): ''
## rownames(5): entity1 entity2 entity3 entity4 entity5
## rowData names(2): var1 var2
## colnames(4): sample1 sample2 sample3 sample4
## colData names(2): ID group
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## rowLinks: a LinkDataFrame (5 rows)
## rowTree: 1 phylo tree(s) (10 leaves)
## colLinks: NULL
## colTree: NULL
rowLinks(sse)
## LinkDataFrame with 5 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity1 entity1 alias_1 1 TRUE tree2
## entity2 entity2 alias_2 2 TRUE tree2
## entity3 entity3 alias_3 3 TRUE tree2
## entity4 entity4 alias_4 4 TRUE tree2
## entity5 entity5 alias_5 5 TRUE tree2
[ works not only as a getter but also as a setter to replace a subset of sse.
set.seed(3)
tse_c <- makeTSE(include.colTree = FALSE)
rowTreeNames(tse_c) <- "new_tree"
# the first two rows are from tse_c, and are mapped to 'new_tree'
sse[1:2, ] <- tse_c[5:6, ]
rowLinks(sse)
## LinkDataFrame with 5 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity5 entity5 alias_5 5 TRUE new_tree
## entity6 entity6 alias_6 6 TRUE new_tree
## entity3 entity3 alias_3 3 TRUE tree2
## entity4 entity4 alias_4 4 TRUE tree2
## entity5 entity5 alias_5 5 TRUE tree2
A TSE object can also be subset by nodes and/or trees using subsetByNode.
# by tree
sse_a <- subsetByNode(x = sse, whichRowTree = "new_tree")
rowLinks(sse_a)
## LinkDataFrame with 2 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity5 entity5 alias_5 5 TRUE new_tree
## entity6 entity6 alias_6 6 TRUE new_tree
# by node
sse_b <- subsetByNode(x = sse, rowNode = 5)
rowLinks(sse_b)
## LinkDataFrame with 2 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity5 entity5 alias_5 5 TRUE new_tree
## entity5 entity5 alias_5 5 TRUE tree2
# by tree and node
sse_c <- subsetByNode(x = sse, rowNode = 5, whichRowTree = "tree2")
rowLinks(sse_c)
## LinkDataFrame with 1 row and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity5 entity5 alias_5 5 TRUE tree2
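Subsetting by node can also be pictured as ordinary row indexing on the link data. The sketch below assumes that subsetByNode with a numeric rowNode keeps the rows whose nodeNum matches, and compares the two approaches on a fresh toy object with a single row tree. The names by_node, keep and by_index are illustrative.

```r
library(TreeSummarizedExperiment)

set.seed(1)
tse <- makeTSE(include.colTree = FALSE)

# Subset with the dedicated helper
by_node <- subsetByNode(x = tse, rowNode = c(1, 3))

# Subset by indexing on the nodeNum column of the row links
keep <- rowLinks(tse)$nodeNum %in% c(1, 3)
by_index <- tse[keep, ]

# Both approaches are expected to select the same rows
setequal(rownames(by_node), rownames(by_index))
```

With several row trees, node numbers are not unique across trees, so the index-based version would also need a condition on whichTree.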
Using colTree as a setter, we can add a column tree to sse, which previously had none.
colTree(sse)
## NULL
library(ape)
set.seed(1)
col_tree <- rtree(ncol(sse))
# To use 'colTree' as a setter, the input tree must have node labels that match
# the column names of the TSE.
col_tree$tip.label <- colnames(sse)
colTree(sse) <- col_tree
colTree(sse)
##
## Phylogenetic tree with 4 tips and 3 internal nodes.
##
## Tip labels:
## sample1, sample2, sample3, sample4
##
## Rooted; includes branch length(s).
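After a column tree is attached, the column link data can be inspected with colLinks, in the same way that rowLinks is used for rows. This is a minimal sketch on a fresh toy object rather than on sse, so that it runs on its own; tse and col_tree are illustrative names.

```r
library(TreeSummarizedExperiment)
library(ape)

set.seed(1)
tse <- makeTSE(include.colTree = FALSE)

# Build a random tree whose tip labels match the column names
col_tree <- rtree(ncol(tse))
col_tree$tip.label <- colnames(tse)

# Attach the tree, then look at how columns map to its nodes
colTree(tse) <- col_tree
colLinks(tse)
```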
sse now has two row trees. We can replace one of them with a new tree by
specifying whichTree in the rowTree setter.
# the original row links
rowLinks(sse)
## LinkDataFrame with 5 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity5 entity5 alias_5 5 TRUE new_tree
## entity6 entity6 alias_6 6 TRUE new_tree
## entity3 entity3 alias_3 3 TRUE tree2
## entity4 entity4 alias_4 4 TRUE tree2
## entity5 entity5 alias_5 5 TRUE tree2
# the new row tree
set.seed(1)
row_tree <- rtree(4)
# rtree(4) has four tips, so four tip labels are needed
row_tree$tip.label <- paste0("entity", 5:8)
# replace the tree named as the 'new_tree'
nse <- sse
rowTree(nse, whichTree = "new_tree") <- row_tree
rowLinks(nse)
## LinkDataFrame with 5 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity5 entity5 alias_1 1 TRUE new_tree
## entity6 entity6 alias_2 2 TRUE new_tree
## entity3 entity3 alias_3 3 TRUE tree2
## entity4 entity4 alias_4 4 TRUE tree2
## entity5 entity5 alias_5 5 TRUE tree2
In the row links, the first two rows now have new values in nodeNum and
nodeLab_alias. The name in whichTree is unchanged, but the underlying tree has
been replaced.
# FALSE is expected
identical(rowTree(sse, whichTree = "new_tree"),
rowTree(nse, whichTree = "new_tree"))
## [1] FALSE
# TRUE is expected
identical(rowTree(nse, whichTree = "new_tree"),
row_tree)
## [1] TRUE
If the nodes of the input tree and the rows of the TSE are named differently,
users can match rows to nodes via changeTree by providing rowNodeLab.
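A minimal sketch of this case follows. It assumes a toy tree whose tip labels (the defaults generated by rtree) do not match the row names of the TSE, and it maps row i to tip i purely for illustration. The names new_tree, node_lab and tse_new are hypothetical.

```r
library(TreeSummarizedExperiment)
library(ape)

set.seed(1)
tse <- makeTSE(include.colTree = FALSE)

# A new tree whose tip labels differ from the row names of 'tse'
new_tree <- rtree(nrow(tse))

# One node label per row: here row i is assigned to tip i (an assumption
# made for this example; real data would use a known row-to-node mapping)
node_lab <- new_tree$tip.label

# Replace the row tree and state explicitly which node each row maps to
tse_new <- changeTree(x = tse, rowTree = new_tree, rowNodeLab = node_lab)
rowLinks(tse_new)
```

The row names of the TSE are not used for matching here; the nodeLab column of the row links reflects the labels passed in rowNodeLab.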
sessionInfo()
## R version 4.5.0 Patched (2025-04-21 r88169)
## Platform: aarch64-apple-darwin20
## Running under: macOS Ventura 13.7.1
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
##
## locale:
## [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: America/New_York
## tzcode source: internal
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] ggplot2_3.5.2 ggtree_3.17.0
## [3] ape_5.8-1 TreeSummarizedExperiment_2.17.0
## [5] Biostrings_2.77.0 XVector_0.49.0
## [7] SingleCellExperiment_1.31.0 SummarizedExperiment_1.39.0
## [9] Biobase_2.69.0 GenomicRanges_1.61.0
## [11] GenomeInfoDb_1.45.0 IRanges_2.43.0
## [13] S4Vectors_0.47.0 BiocGenerics_0.55.0
## [15] generics_0.1.3 MatrixGenerics_1.21.0
## [17] matrixStats_1.5.0 BiocStyle_2.37.0
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.6 xfun_0.52 bslib_0.9.0
## [4] lattice_0.22-7 vctrs_0.6.5 tools_4.5.0
## [7] yulab.utils_0.2.0 parallel_4.5.0 tibble_3.2.1
## [10] pkgconfig_2.0.3 Matrix_1.7-3 ggplotify_0.1.2
## [13] RColorBrewer_1.1-3 lifecycle_1.0.4 GenomeInfoDbData_1.2.14
## [16] farver_2.1.2 compiler_4.5.0 treeio_1.33.0
## [19] tinytex_0.57 codetools_0.2-20 ggfun_0.1.8
## [22] htmltools_0.5.8.1 sass_0.4.10 yaml_2.3.10
## [25] lazyeval_0.2.2 pillar_1.10.2 crayon_1.5.3
## [28] jquerylib_0.1.4 tidyr_1.3.1 BiocParallel_1.43.0
## [31] DelayedArray_0.35.1 cachem_1.1.0 magick_2.8.6
## [34] abind_1.4-8 nlme_3.1-168 aplot_0.2.5
## [37] tidyselect_1.2.1 digest_0.6.37 dplyr_1.1.4
## [40] purrr_1.0.4 bookdown_0.43 labeling_0.4.3
## [43] fastmap_1.2.0 grid_4.5.0 cli_3.6.5
## [46] SparseArray_1.9.0 magrittr_2.0.3 patchwork_1.3.0
## [49] S4Arrays_1.9.0 dichromat_2.0-0.1 withr_3.0.2
## [52] scales_1.4.0 UCSC.utils_1.5.0 rmarkdown_2.29
## [55] httr_1.4.7 evaluate_1.0.3 knitr_1.50
## [58] gridGraphics_0.5-1 rlang_1.1.6 Rcpp_1.0.14
## [61] glue_1.8.0 tidytree_0.4.6 BiocManager_1.30.25
## [64] jsonlite_2.0.0 R6_2.6.1 fs_1.6.6