scLang 0.99.3
scLang is a suite for package development for scRNA-seq analysis.
It offers functions that can operate on both Seurat and
SingleCellExperiment objects. These functions are primarily
aimed to help developers build tools compatible with both types of input.
To install scLang, run the following commands in an R session:
if (!require("BiocManager", quietly=TRUE))
install.packages("BiocManager")
BiocManager::install("scLang")
This tutorial uses an scRNA-seq human pancreas dataset. After loading the
required packages, download the dataset using the BaronPancreasData function
from scRNAseq. The dataset will be stored as a SingleCellExperiment object.
library(scLang)
library(scRNAseq)
library(scater)
library(Seurat)
sceObj <- BaronPancreasData('human')
Next, we will normalize and log-transform the data using the logNormCounts
function from scuttle (loaded automatically with scater):
sceObj <- logNormCounts(sceObj)
We will also need PCA and UMAP dimensions. These can be computed using the
runPCA and runUMAP functions from scater:
sceObj <- runPCA(sceObj)
sceObj <- runUMAP(sceObj)
Now we will convert the dataset to a Seurat object:
seuratObj <- as.Seurat(sceObj)
The scExpMat function extracts the expression matrix from a Seurat or
SingleCellExperiment object:
mat1 <- scExpMat(sceObj)
dim(mat1)
#> [1] 20125 8569
mat2 <- scExpMat(seuratObj)
identical(mat1, mat2)
#> [1] TRUE
Note: By default, this function extracts normalized and log-transformed
data, looking for the data assay for a Seurat object, and for the
logcounts assay for a SingleCellExperiment object. This behavior
can be changed using the dataType parameter.
scExpMat can also take a matrix as an argument. This option is useful when
building functions allowing users to use either a single-cell expression matrix
or an object of a dedicated class (Seurat, SingleCellExperiment) as input:
mat2 <- scExpMat(mat1)
identical(mat1, mat2)
#> [1] TRUE
By default, scExpMatconverts the expression matrix to a dense matrix.
If this behavior is not desired, conversion can be skipped by setting densify
to FALSE:
is(mat1)[1]
#> [1] "matrix"
mat2 <- scExpMat(sceObj, densify=FALSE)
is(mat2)[2]
#> [1] "CsparseMatrix"
scExpMatcan also extract the expression data only for selected genes:
mat2 <- scExpMat(sceObj, genes=rownames(sceObj)[seq(30, 80)])
dim(mat2)
#> [1] 51 8569
The scCol function extracts a column from the metadata of the Seurat
object or the coldata of the SingleCellExperiment object:
col1 <- scCol(seuratObj, 'label')
col2 <- scCol(sceObj, 'label')
identical(col1, col2)
#> [1] TRUE
head(col1)
#> [1] "acinar" "acinar" "acinar" "acinar" "acinar" "acinar"
It can also be used to insert a new column. Here, we just make a modified
copy of the label column for the Seurat object:
scCol(seuratObj, 'labelCopy') <- paste0(scCol(seuratObj, 'label'), '_copy')
head(seuratObj[['labelCopy']])
#> labelCopy
#> human1_lib1.final_cell_0001 acinar_copy
#> human1_lib1.final_cell_0002 acinar_copy
#> human1_lib1.final_cell_0003 acinar_copy
#> human1_lib1.final_cell_0004 acinar_copy
#> human1_lib1.final_cell_0005 acinar_copy
#> human1_lib1.final_cell_0006 acinar_copy
The metadataDF function extracts the metadata/coldata data frame from a
Seurat or SingleCellExpression object:
df1 <- metadataDF(seuratObj)
df2 <- metadataDF(sceObj)
identical(df1, df2)
#> [1] FALSE
head(df1)[, c(1, 2)]
#> orig.ident nCount_originalexp
#> human1_lib1.final_cell_0001 human1 22412
#> human1_lib1.final_cell_0002 human1 27953
#> human1_lib1.final_cell_0003 human1 16895
#> human1_lib1.final_cell_0004 human1 19300
#> human1_lib1.final_cell_0005 human1 15067
#> human1_lib1.final_cell_0006 human1 15747
The metadataNames function extracts the column names of the metadata/coldata
data frame from a Seurat or SingleCellExpression object:
colNames1 <- metadataNames(seuratObj)
colNames2 <- metadataNames(sceObj)
identical(colNames1, colNames2)
#> [1] FALSE
head(colNames1)
#> [1] "orig.ident" "nCount_originalexp" "nFeature_originalexp"
#> [4] "donor" "label" "sizeFactor"
The scColCounts and scColPairCounts functions are wrappers around
dplyr::count and enable counting frequencies of elements from one or two
categorical columns in a Seurat or SingleCellExpression object:
freq1 <- scColCounts(sceObj, 'donor')
freq2 <- scColCounts(seuratObj, 'donor')
identical(freq1, freq2)
#> [1] TRUE
head(freq1)
#> GSM2230757 GSM2230758 GSM2230759 GSM2230760
#> 1937 1724 3605 1303
freq1 <- scColPairCounts(sceObj, 'donor', 'label')
freq2 <- scColPairCounts(seuratObj, 'donor', 'label')
identical(freq1, freq2)
#> [1] TRUE
head(freq1)
#> donor label n
#> 1 GSM2230757 acinar 110
#> 2 GSM2230757 activated_stellate 51
#> 3 GSM2230757 alpha 236
#> 4 GSM2230757 beta 872
#> 5 GSM2230757 delta 214
#> 6 GSM2230757 ductal 120
scLang includes three visualization functions that adapt Seurat visualization
tools, extending their usage to SingleCellExpression objects in addition to
Seurat objects.
The dimPlot function mimics the essential behavior of the DimPlot function
from Seurat:
dimPlot(sceObj, groupBy='label')
#> Found more than one class "package_version" in cache; using the first, from namespace 'SeuratObject'
#> Also defined by 'alabaster.base'
#> Found more than one class "package_version" in cache; using the first, from namespace 'SeuratObject'
#> Also defined by 'alabaster.base'
#> Found more than one class "package_version" in cache; using the first, from namespace 'SeuratObject'
#> Also defined by 'alabaster.base'
#> Found more than one class "package_version" in cache; using the first, from namespace 'SeuratObject'
#> Also defined by 'alabaster.base'
The featurePlot functions mimics the FeaturePlot function from Seurat
(though using a different color scheme):
featurePlot(sceObj, 'SOX4')
#> Found more than one class "package_version" in cache; using the first, from namespace 'SeuratObject'
#> Also defined by 'alabaster.base'
#> Found more than one class "package_version" in cache; using the first, from namespace 'SeuratObject'
#> Also defined by 'alabaster.base'
#> Found more than one class "package_version" in cache; using the first, from namespace 'SeuratObject'
#> Also defined by 'alabaster.base'
#> Found more than one class "package_version" in cache; using the first, from namespace 'SeuratObject'
#> Also defined by 'alabaster.base'
The violinPlot functions mimics the ViolinPlot function from Seurat:
violinPlot(sceObj, 'SOX4', groupBy='label')
sessionInfo()
#> R version 4.6.0 alpha (2026-04-05 r89794)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_GB LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: America/New_York
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] Seurat_5.4.0 SeuratObject_5.3.0
#> [3] sp_2.2-1 scater_1.39.4
#> [5] ggplot2_4.0.2 scuttle_1.21.5
#> [7] scRNAseq_2.25.0 SingleCellExperiment_1.33.2
#> [9] SummarizedExperiment_1.41.1 Biobase_2.71.0
#> [11] GenomicRanges_1.63.2 Seqinfo_1.1.0
#> [13] IRanges_2.45.0 S4Vectors_0.49.1
#> [15] BiocGenerics_0.57.0 generics_0.1.4
#> [17] MatrixGenerics_1.23.0 matrixStats_1.5.0
#> [19] scLang_0.99.3 BiocStyle_2.39.0
#>
#> loaded via a namespace (and not attached):
#> [1] spatstat.sparse_3.1-0 ProtGenerics_1.43.0 bitops_1.0-9
#> [4] httr_1.4.8 RColorBrewer_1.1-3 sctransform_0.4.3
#> [7] tools_4.6.0 alabaster.base_1.11.4 R6_2.6.1
#> [10] HDF5Array_1.39.1 uwot_0.2.4 lazyeval_0.2.3
#> [13] rhdf5filters_1.23.3 withr_3.0.2 gridExtra_2.3
#> [16] progressr_0.19.0 cli_3.6.5 spatstat.explore_3.8-0
#> [19] fastDummies_1.7.5 labeling_0.4.3 prismatic_1.1.2
#> [22] alabaster.se_1.11.0 sass_0.4.10 spatstat.data_3.1-9
#> [25] S7_0.2.1 ggridges_0.5.7 pbapply_1.7-4
#> [28] Rsamtools_2.27.1 dichromat_2.0-0.1 parallelly_1.46.1
#> [31] RSQLite_2.4.6 BiocIO_1.21.0 spatstat.random_3.4-5
#> [34] ica_1.0-3 dplyr_1.2.1 Matrix_1.7-5
#> [37] ggbeeswarm_0.7.3 abind_1.4-8 lifecycle_1.0.5
#> [40] yaml_2.3.12 rhdf5_2.55.16 SparseArray_1.11.13
#> [43] BiocFileCache_3.1.0 Rtsne_0.17 paletteer_1.7.0
#> [46] grid_4.6.0 blob_1.3.0 promises_1.5.0
#> [49] ExperimentHub_3.1.0 crayon_1.5.3 miniUI_0.1.2
#> [52] lattice_0.22-9 beachmat_2.27.4 cowplot_1.2.0
#> [55] GenomicFeatures_1.63.2 cigarillo_1.1.0 KEGGREST_1.51.1
#> [58] magick_2.9.1 pillar_1.11.1 knitr_1.51
#> [61] abdiv_0.2.0 rjson_0.2.23 future.apply_1.20.2
#> [64] codetools_0.2-20 glue_1.8.0 spatstat.univar_3.1-7
#> [67] data.table_1.18.2.1 vctrs_0.7.2 png_0.1-9
#> [70] gypsum_1.7.0 spam_2.11-3 gtable_0.3.6
#> [73] rematch2_2.1.2 cachem_1.1.0 xfun_0.57
#> [76] S4Arrays_1.11.1 mime_0.13 tidygraph_1.3.1
#> [79] survival_3.8-6 tinytex_0.59 fitdistrplus_1.2-6
#> [82] ROCR_1.0-12 liver_1.28 nlme_3.1-169
#> [85] bit64_4.6.0-1 alabaster.ranges_1.11.0 filelock_1.0.3
#> [88] RcppAnnoy_0.0.23 GenomeInfoDb_1.47.2 bslib_0.10.0
#> [91] irlba_2.3.7 vipor_0.4.7 KernSmooth_2.23-26
#> [94] otel_0.2.0 DBI_1.3.0 tidyselect_1.2.1
#> [97] bit_4.6.0 compiler_4.6.0 curl_7.0.0
#> [100] httr2_1.2.2 BiocNeighbors_2.5.4 h5mread_1.3.3
#> [103] DelayedArray_0.37.1 plotly_4.12.0 bookdown_0.46
#> [106] rtracklayer_1.71.3 scales_1.4.0 lmtest_0.9-40
#> [109] ggeasy_0.1.6 rappdirs_0.3.4 goftest_1.2-3
#> [112] stringr_1.6.0 digest_0.6.39 spatstat.utils_3.2-2
#> [115] alabaster.matrix_1.11.0 rmarkdown_2.31 XVector_0.51.0
#> [118] htmltools_0.5.9 pkgconfig_2.0.3 dbplyr_2.5.2
#> [121] fastmap_1.2.0 ensembldb_2.35.0 rlang_1.2.0
#> [124] htmlwidgets_1.6.4 UCSC.utils_1.7.1 shiny_1.13.0
#> [127] farver_2.1.2 jquerylib_0.1.4 zoo_1.8-15
#> [130] jsonlite_2.0.0 BiocParallel_1.45.0 BiocSingular_1.27.1
#> [133] RCurl_1.98-1.18 magrittr_2.0.5 dotCall64_1.2
#> [136] patchwork_1.3.2 Rhdf5lib_1.33.6 Rcpp_1.1.1
#> [139] ggnewscale_0.5.2 viridis_0.6.5 reticulate_1.45.0
#> [142] stringi_1.8.7 alabaster.schemas_1.11.0 ggalluvial_0.12.6
#> [145] ggraph_2.2.2 MASS_7.3-65 AnnotationHub_4.1.0
#> [148] plyr_1.8.9 parallel_4.6.0 listenv_0.10.1
#> [151] ggrepel_0.9.8 deldir_2.0-4 Biostrings_2.79.5
#> [154] graphlayouts_1.2.3 splines_4.6.0 tensor_1.5.1
#> [157] igraph_2.2.3 spatstat.geom_3.7-3 RcppHNSW_0.6.0
#> [160] reshape2_1.4.5 ScaledMatrix_1.19.0 BiocVersion_3.23.1
#> [163] XML_3.99-0.23 evaluate_1.0.5 henna_0.7.5
#> [166] BiocManager_1.30.27 tweenr_2.0.3 httpuv_1.6.17
#> [169] RANN_2.6.2 tidyr_1.3.2 purrr_1.2.1
#> [172] polyclip_1.10-7 future_1.70.0 scattermore_1.2
#> [175] alabaster.sce_1.11.0 ggforce_0.5.0 rsvd_1.0.5
#> [178] xtable_1.8-8 restfulr_0.0.16 AnnotationFilter_1.35.0
#> [181] RSpectra_0.16-2 later_1.4.8 viridisLite_0.4.3
#> [184] tibble_3.3.1 memoise_2.0.1 beeswarm_0.4.0
#> [187] AnnotationDbi_1.73.1 GenomicAlignments_1.47.0 cluster_2.1.8.2
#> [190] globals_0.19.1