Remove ambient RNA with CellBender
Mariano Ruz Jurado
Goethe University
Despite advances in optimizing and standardizing droplet-based single-cell omics protocols such as single-cell and single-nucleus RNA sequencing (sc/snRNA-seq), these experiments still suffer from systematic biases and background noise. In particular, ambient RNA in snRNA-seq can lead to overestimated expression levels for certain genes. Computational tools such as CellBender have been developed to correct for this ambient RNA contamination.
The DOtools package includes a wrapper function, DO.CellBender, that runs CellBender from R. The current implementation supports samples processed with Cell Ranger.
Ambient removal
base <- DOtools:::.example_10x()
#> 📥 Downloading data to /tmp/RtmpaqGbvK/dotools_datasets_1664bd1abc7377
#> ⬇️ Downloading healthy filtered to /tmp/RtmpaqGbvK/dotools_datasets_1664bd1abc7377/healthy/outs/filtered_feature_bc_matrix.h5
#> ⬇️ Downloading healthy raw to /tmp/RtmpaqGbvK/dotools_datasets_1664bd1abc7377/healthy/outs/raw_feature_bc_matrix.h5
#> ⬇️ Downloading disease filtered to /tmp/RtmpaqGbvK/dotools_datasets_1664bd1abc7377/disease/outs/filtered_feature_bc_matrix.h5
#> ⬇️ Downloading disease raw to /tmp/RtmpaqGbvK/dotools_datasets_1664bd1abc7377/disease/outs/raw_feature_bc_matrix.h5
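The download log above shows the layout of the example data: one folder per sample, each with an outs/ directory that holds raw_feature_bc_matrix.h5 and filtered_feature_bc_matrix.h5. When pointing the wrapper at your own data, a quick check along these lines may catch a missing file before a long run starts. This is a minimal sketch in base R; check_cellranger_layout is a hypothetical helper, not part of DOtools, and it assumes your data follows the same layout as the example.

```r
# Hypothetical helper (not part of DOtools): report which of the two
# Cell Ranger matrices are present for each sample under `base`.
check_cellranger_layout <- function(base, samplenames) {
  needed <- c("raw_feature_bc_matrix.h5", "filtered_feature_bc_matrix.h5")
  status <- lapply(samplenames, function(s) {
    file.exists(file.path(base, s, "outs", needed))
  })
  out <- as.data.frame(do.call(rbind, status))
  colnames(out) <- c("raw", "filtered")
  cbind(sample = samplenames, out)
}

# Demonstration on a mock directory tree, so the sketch runs on its own.
demo_base <- file.path(tempdir(), "layout_demo")
unlink(demo_base, recursive = TRUE)
for (s in c("healthy", "disease")) {
  dir.create(file.path(demo_base, s, "outs"), recursive = TRUE)
}
# "healthy" gets both files; "disease" deliberately lacks the raw matrix.
file.create(file.path(demo_base, "healthy", "outs",
                      c("raw_feature_bc_matrix.h5",
                        "filtered_feature_bc_matrix.h5")))
file.create(file.path(demo_base, "disease", "outs",
                      "filtered_feature_bc_matrix.h5"))

layout <- check_cellranger_layout(demo_base, c("healthy", "disease"))
print(layout)
```

With real data, the same call would take the Cell Ranger directory and sample names that are later passed as cellranger_path and samplenames.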
# Create the output folder next to the sample folders
dir.create(file.path(base, "cellbender"))
DO.CellBender(cellranger_path = base,
              output_path = file.path(base, "cellbender"),
              samplenames = c("healthy", "disease"),
              cuda = TRUE,
              BarcodeRanking = FALSE,
              cpu_threads = 38,
              epochs = 5) # reduced for this example; 150 is default
#> 2025-06-26 17:20:03 - Using existing conda environment at: /home/mariano/.venv/cellbender
#> 2025-06-26 17:20:03 - Running CellBender for 2 sample(s)
#> 2025-06-26 17:20:17 - Running CellBender for sample: healthy
#> 2025-06-26 17:35:33 - Running CellBender for sample: disease
#> 2025-06-26 17:46:32 - Finished running CellBender.
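The cuda and cpu_threads values in the call above reflect the machine this vignette was built on. On other hardware, values could be derived at run time, as in the following sketch. pick_resources is a hypothetical helper, not part of DOtools, and the nvidia-smi lookup is only a heuristic assumption: it shows that the NVIDIA driver tool is on the PATH, not that the CellBender environment can actually use the GPU.

```r
# Hypothetical helper (not part of DOtools): derive machine-appropriate
# values for the cpu_threads and cuda arguments.
pick_resources <- function(reserve = 2L) {
  cores <- parallel::detectCores()
  if (is.na(cores)) cores <- 1L  # detectCores() can return NA
  list(
    # Leave `reserve` cores free for the rest of the system, but use at least one.
    cpu_threads = max(1L, cores - reserve),
    # Heuristic: treat a visible nvidia-smi binary as a sign of a CUDA GPU.
    cuda = unname(nzchar(Sys.which("nvidia-smi")))
  )
}

res <- pick_resources()
str(res)
```

The result could then be passed on as cpu_threads = res$cpu_threads and cuda = res$cuda.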
After the run finishes, several files are saved in the folder given as output_path, including a summary report for checking whether any issues occurred during CellBender execution, an individual log file for each sample, and a commands_Cellbender.txt file with the exact command used. The corrected .h5 files can now be used for downstream analysis.
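To pick up the corrected matrices for downstream work, one option is to collect the .h5 files from the output folder. The sketch below uses only base R. The exact file names written by the wrapper are not listed here, so it matches on the .h5 extension rather than assuming a naming scheme; find_corrected_h5 and the mock file names are hypothetical and exist only to make the example runnable.

```r
# Hypothetical helper (not part of DOtools): collect all .h5 files
# found under the output folder, including subfolders.
find_corrected_h5 <- function(output_path) {
  list.files(output_path, pattern = "\\.h5$",
             recursive = TRUE, full.names = TRUE)
}

# Mock output folder with placeholder file names, for demonstration only.
demo_out <- file.path(tempdir(), "cellbender_demo")
unlink(demo_out, recursive = TRUE)
dir.create(demo_out)
file.create(file.path(demo_out, c("sampleA.h5", "sampleB.h5", "run.log")))

h5_files <- find_corrected_h5(demo_out)
basename(h5_files)
```

Whether a given reader accepts these files directly can depend on the CellBender version, so it is worth verifying the import step on one sample before processing all of them.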
Session information
#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.5.1 (2025-06-13)
#> os Ubuntu 24.04.2 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language en
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Berlin
#> date 2025-06-26
#> pandoc 3.4 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/x86_64/ (via rmarkdown)
#> quarto 1.6.42 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/quarto
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> abind 1.4-8 2024-09-12 [2] CRAN (R 4.5.0)
#> backports 1.5.0 2024-05-23 [2] CRAN (R 4.5.0)
#> basilisk 1.20.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> basilisk.utils 1.20.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> beachmat 2.24.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> Biobase 2.68.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> BiocGenerics 0.54.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> BiocManager 1.30.26 2025-06-05 [2] CRAN (R 4.5.0)
#> BiocParallel 1.42.1 2025-06-01 [2] Bioconductor 3.21 (R 4.5.0)
#> BiocStyle * 2.36.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> bookdown 0.43 2025-04-15 [2] CRAN (R 4.5.0)
#> broom 1.0.8 2025-03-28 [2] CRAN (R 4.5.0)
#> bslib 0.9.0 2025-01-30 [2] CRAN (R 4.5.0)
#> cachem 1.1.0 2024-05-16 [2] CRAN (R 4.5.0)
#> car 3.1-3 2024-09-27 [2] CRAN (R 4.5.0)
#> carData 3.0-5 2022-01-06 [2] CRAN (R 4.5.0)
#> cli 3.6.5 2025-04-23 [2] CRAN (R 4.5.0)
#> cluster 2.1.8.1 2025-03-12 [5] CRAN (R 4.4.3)
#> codetools 0.2-20 2024-03-31 [5] CRAN (R 4.4.0)
#> cowplot 1.1.3 2024-01-22 [2] CRAN (R 4.5.0)
#> crayon 1.5.3 2024-06-20 [2] CRAN (R 4.5.0)
#> curl 6.3.0 2025-06-06 [2] CRAN (R 4.5.0)
#> data.table 1.17.4 2025-05-26 [2] CRAN (R 4.5.0)
#> DelayedArray 0.34.1 2025-04-17 [2] Bioconductor 3.21 (R 4.5.0)
#> DelayedMatrixStats 1.30.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> deldir 2.0-4 2024-02-28 [2] CRAN (R 4.5.0)
#> desc 1.4.3 2023-12-10 [2] CRAN (R 4.5.0)
#> DESeq2 1.48.1 2025-05-11 [2] Bioconductor 3.21 (R 4.5.0)
#> digest 0.6.37 2024-08-19 [2] CRAN (R 4.5.0)
#> dir.expiry 1.16.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> dotCall64 1.2 2024-10-04 [2] CRAN (R 4.5.0)
#> DOtools * 0.4.0 2025-06-26 [1] Bioconductor
#> dplyr 1.1.4 2023-11-17 [2] CRAN (R 4.5.0)
#> dqrng 0.4.1 2024-05-28 [2] CRAN (R 4.5.0)
#> DropletUtils 1.28.0 2025-04-17 [2] Bioconductor 3.21 (R 4.5.0)
#> edgeR 4.6.2 2025-05-07 [2] Bioconductor 3.21 (R 4.5.0)
#> enrichR 3.4 2025-02-02 [2] CRAN (R 4.5.0)
#> evaluate 1.0.3 2025-01-10 [2] CRAN (R 4.5.0)
#> farver 2.1.2 2024-05-13 [2] CRAN (R 4.5.0)
#> fastDummies 1.7.5 2025-01-20 [2] CRAN (R 4.5.0)
#> fastmap 1.2.0 2024-05-15 [2] CRAN (R 4.5.0)
#> filelock 1.0.3 2023-12-11 [2] CRAN (R 4.5.0)
#> fitdistrplus 1.2-2 2025-01-07 [2] CRAN (R 4.5.0)
#> Formula 1.2-5 2023-02-24 [2] CRAN (R 4.5.0)
#> fs 1.6.6 2025-04-12 [2] CRAN (R 4.5.0)
#> future 1.58.0 2025-06-05 [2] CRAN (R 4.5.0)
#> future.apply 1.20.0 2025-06-06 [2] CRAN (R 4.5.0)
#> generics 0.1.4 2025-05-09 [2] CRAN (R 4.5.0)
#> GenomeInfoDb 1.44.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> GenomeInfoDbData 1.2.14 2025-05-13 [2] Bioconductor
#> GenomicRanges 1.60.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> ggalluvial 0.12.5 2023-02-22 [2] CRAN (R 4.5.0)
#> ggcorrplot 0.1.4.1 2023-09-05 [2] CRAN (R 4.5.1)
#> ggplot2 3.5.2 2025-04-09 [2] CRAN (R 4.5.0)
#> ggpubr 0.6.0 2023-02-10 [2] CRAN (R 4.5.0)
#> ggrepel 0.9.6 2024-09-07 [2] CRAN (R 4.5.0)
#> ggridges 0.5.6 2024-01-23 [2] CRAN (R 4.5.0)
#> ggsignif 0.6.4 2022-10-13 [2] CRAN (R 4.5.0)
#> ggtext 0.1.2 2022-09-16 [2] CRAN (R 4.5.0)
#> globals 0.18.0 2025-05-08 [2] CRAN (R 4.5.0)
#> glue 1.8.0 2024-09-30 [2] CRAN (R 4.5.0)
#> goftest 1.2-3 2021-10-07 [2] CRAN (R 4.5.0)
#> gridExtra 2.3 2017-09-09 [2] CRAN (R 4.5.0)
#> gridtext 0.1.5 2022-09-16 [2] CRAN (R 4.5.0)
#> gtable 0.3.6 2024-10-25 [2] CRAN (R 4.5.0)
#> h5mread 1.0.1 2025-05-21 [2] Bioconductor 3.21 (R 4.5.0)
#> HDF5Array 1.36.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> hms 1.1.3 2023-03-21 [2] CRAN (R 4.5.0)
#> htmltools 0.5.8.1 2024-04-04 [2] CRAN (R 4.5.0)
#> htmlwidgets 1.6.4 2023-12-06 [2] CRAN (R 4.5.0)
#> httpuv 1.6.16 2025-04-16 [2] CRAN (R 4.5.0)
#> httr 1.4.7 2023-08-15 [2] CRAN (R 4.5.0)
#> ica 1.0-3 2022-07-08 [2] CRAN (R 4.5.0)
#> igraph 2.1.4 2025-01-23 [2] CRAN (R 4.5.0)
#> IRanges 2.42.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> irlba 2.3.5.1 2022-10-03 [2] CRAN (R 4.5.0)
#> jquerylib 0.1.4 2021-04-26 [2] CRAN (R 4.5.0)
#> jsonlite 2.0.0 2025-03-27 [2] CRAN (R 4.5.0)
#> KernSmooth 2.23-26 2025-01-01 [5] CRAN (R 4.4.2)
#> knitr 1.50 2025-03-16 [2] CRAN (R 4.5.0)
#> later 1.4.2 2025-04-08 [2] CRAN (R 4.5.0)
#> lattice 0.22-5 2023-10-24 [5] CRAN (R 4.3.3)
#> lazyeval 0.2.2 2019-03-15 [2] CRAN (R 4.5.0)
#> lifecycle 1.0.4 2023-11-07 [2] CRAN (R 4.5.0)
#> limma 3.64.1 2025-05-25 [2] Bioconductor 3.21 (R 4.5.0)
#> listenv 0.9.1 2024-01-29 [2] CRAN (R 4.5.0)
#> lmtest 0.9-40 2022-03-21 [2] CRAN (R 4.5.0)
#> locfit 1.5-9.12 2025-03-05 [2] CRAN (R 4.5.0)
#> magrittr 2.0.3 2022-03-30 [2] CRAN (R 4.5.0)
#> MASS 7.3-65 2025-02-28 [5] CRAN (R 4.4.3)
#> Matrix 1.7-3 2025-03-11 [5] CRAN (R 4.4.3)
#> MatrixGenerics 1.20.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> matrixStats 1.5.0 2025-01-07 [2] CRAN (R 4.5.0)
#> mime 0.13 2025-03-17 [2] CRAN (R 4.5.0)
#> miniUI 0.1.2 2025-04-17 [2] CRAN (R 4.5.0)
#> nlme 3.1-168 2025-03-31 [5] CRAN (R 4.4.3)
#> openxlsx 4.2.8 2025-01-25 [2] CRAN (R 4.5.0)
#> parallelly 1.45.0 2025-06-02 [2] CRAN (R 4.5.0)
#> patchwork 1.3.0 2024-09-16 [2] CRAN (R 4.5.0)
#> pbapply 1.7-2 2023-06-27 [2] CRAN (R 4.5.0)
#> pillar 1.10.2 2025-04-05 [2] CRAN (R 4.5.0)
#> pkgconfig 2.0.3 2019-09-22 [2] CRAN (R 4.5.0)
#> pkgdown 2.1.3 2025-05-25 [2] CRAN (R 4.5.0)
#> plotly 4.10.4 2024-01-13 [2] CRAN (R 4.5.0)
#> plyr 1.8.9 2023-10-02 [2] CRAN (R 4.5.0)
#> png 0.1-8 2022-11-29 [2] CRAN (R 4.5.0)
#> polyclip 1.10-7 2024-07-23 [2] CRAN (R 4.5.0)
#> prettyunits 1.2.0 2023-09-24 [2] CRAN (R 4.5.0)
#> progress 1.2.3 2023-12-06 [2] CRAN (R 4.5.0)
#> progressr 0.15.1 2024-11-22 [2] CRAN (R 4.5.0)
#> promises 1.3.3 2025-05-29 [2] CRAN (R 4.5.0)
#> purrr 1.0.4 2025-02-05 [2] CRAN (R 4.5.0)
#> R.methodsS3 1.8.2 2022-06-13 [2] CRAN (R 4.5.0)
#> R.oo 1.27.1 2025-05-02 [2] CRAN (R 4.5.0)
#> R.utils 2.13.0 2025-02-24 [2] CRAN (R 4.5.0)
#> R6 2.6.1 2025-02-15 [2] CRAN (R 4.5.0)
#> ragg 1.4.0 2025-04-10 [2] CRAN (R 4.5.0)
#> RANN 2.6.2 2024-08-25 [2] CRAN (R 4.5.0)
#> RColorBrewer 1.1-3 2022-04-03 [2] CRAN (R 4.5.0)
#> Rcpp 1.0.14 2025-01-12 [2] CRAN (R 4.5.0)
#> RcppAnnoy 0.0.22 2024-01-23 [2] CRAN (R 4.5.0)
#> RcppHNSW 0.6.0 2024-02-04 [2] CRAN (R 4.5.0)
#> reshape2 1.4.4 2020-04-09 [2] CRAN (R 4.5.0)
#> reticulate 1.42.0 2025-03-25 [2] CRAN (R 4.5.0)
#> rhdf5 2.52.1 2025-06-08 [2] Bioconductor 3.21 (R 4.5.0)
#> rhdf5filters 1.20.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> Rhdf5lib 1.30.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> rjson 0.2.23 2024-09-16 [2] CRAN (R 4.5.0)
#> rlang 1.1.6 2025-04-11 [2] CRAN (R 4.5.0)
#> rmarkdown 2.29 2024-11-04 [2] CRAN (R 4.5.0)
#> ROCR 1.0-11 2020-05-02 [2] CRAN (R 4.5.0)
#> RSpectra 0.16-2 2024-07-18 [2] CRAN (R 4.5.0)
#> rstatix 0.7.2 2023-02-01 [2] CRAN (R 4.5.0)
#> rstudioapi 0.17.1 2024-10-22 [2] CRAN (R 4.5.0)
#> Rtsne 0.17 2023-12-07 [2] CRAN (R 4.5.0)
#> S4Arrays 1.8.1 2025-06-01 [2] Bioconductor 3.21 (R 4.5.0)
#> S4Vectors 0.46.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> sass 0.4.10 2025-04-11 [2] CRAN (R 4.5.0)
#> scales 1.4.0 2025-04-24 [2] CRAN (R 4.5.0)
#> scattermore 1.2 2023-06-12 [2] CRAN (R 4.5.0)
#> sctransform 0.4.2 2025-04-30 [2] CRAN (R 4.5.0)
#> scuttle 1.18.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> sessioninfo 1.2.3 2025-02-05 [2] CRAN (R 4.5.0)
#> Seurat 5.3.0 2025-04-23 [2] CRAN (R 4.5.0)
#> SeuratObject 5.1.0 2025-04-22 [2] CRAN (R 4.5.0)
#> shiny 1.10.0 2024-12-14 [2] CRAN (R 4.5.0)
#> SingleCellExperiment 1.30.1 2025-05-07 [2] Bioconductor 3.21 (R 4.5.0)
#> sp 2.2-0 2025-02-01 [2] CRAN (R 4.5.0)
#> spam 2.11-1 2025-01-20 [2] CRAN (R 4.5.0)
#> SparseArray 1.8.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> sparseMatrixStats 1.20.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> spatstat.data 3.1-6 2025-03-17 [2] CRAN (R 4.5.0)
#> spatstat.explore 3.4-3 2025-05-21 [2] CRAN (R 4.5.0)
#> spatstat.geom 3.4-1 2025-05-20 [2] CRAN (R 4.5.0)
#> spatstat.random 3.4-1 2025-05-20 [2] CRAN (R 4.5.0)
#> spatstat.sparse 3.1-0 2024-06-21 [2] CRAN (R 4.5.0)
#> spatstat.univar 3.1-3 2025-05-08 [2] CRAN (R 4.5.0)
#> spatstat.utils 3.1-4 2025-05-15 [2] CRAN (R 4.5.0)
#> statmod 1.5.0 2023-01-06 [2] CRAN (R 4.5.0)
#> stringi 1.8.7 2025-03-27 [2] CRAN (R 4.5.0)
#> stringr 1.5.1 2023-11-14 [2] CRAN (R 4.5.0)
#> SummarizedExperiment 1.38.1 2025-04-30 [2] Bioconductor 3.21 (R 4.5.0)
#> survival 3.8-3 2024-12-17 [5] CRAN (R 4.4.2)
#> systemfonts 1.2.3 2025-04-30 [2] CRAN (R 4.5.0)
#> tensor 1.5 2012-05-05 [2] CRAN (R 4.5.0)
#> textshaping 1.0.1 2025-05-01 [2] CRAN (R 4.5.0)
#> tibble 3.3.0 2025-06-08 [2] CRAN (R 4.5.0)
#> tidyr 1.3.1 2024-01-24 [2] CRAN (R 4.5.0)
#> tidyselect 1.2.1 2024-03-11 [2] CRAN (R 4.5.0)
#> tidyverse 2.0.0 2023-02-22 [2] CRAN (R 4.5.0)
#> UCSC.utils 1.4.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> uwot 0.2.3 2025-02-24 [2] CRAN (R 4.5.0)
#> vctrs 0.6.5 2023-12-01 [2] CRAN (R 4.5.0)
#> viridisLite 0.4.2 2023-05-02 [2] CRAN (R 4.5.0)
#> WriteXLS 6.8.0 2025-05-22 [2] CRAN (R 4.5.0)
#> xfun 0.52 2025-04-02 [2] CRAN (R 4.5.0)
#> xml2 1.3.8 2025-03-14 [2] CRAN (R 4.5.0)
#> xtable 1.8-4 2019-04-21 [2] CRAN (R 4.5.0)
#> XVector 0.48.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> yaml 2.3.10 2024-07-26 [2] CRAN (R 4.5.0)
#> zellkonverter 1.18.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> zip 2.3.3 2025-05-13 [2] CRAN (R 4.5.0)
#> zoo 1.8-14 2025-04-10 [2] CRAN (R 4.5.0)
#>
#> [1] /tmp/RtmpISVWTJ/temp_libpath14fb8d3fdbc303
#> [2] /home/mariano/R/x86_64-pc-linux-gnu-library/4.5
#> [3] /usr/local/lib/R/site-library
#> [4] /usr/lib/R/site-library
#> [5] /usr/lib/R/library
#> * ── Packages attached to the search path.
#>
#> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────