Parse_sn

1 Introduction

Comprehensive quality control (QC) of single-cell RNA-seq data was performed with the singleCellTK package. This report contains information about each QC tool and visualization of the QC metrics for each sample. For more information on running this pipeline and performing quality control, see the documentation. If you use the singleCellTK package for quality control, please include a reference in your publication.

2 Summary Statistics

2.1 SCTK-QC

Metric                                  All Samples
Number of Cells                         3499
Mean counts                             11333
Median counts                           6038
Mean features detected                  2981.8
Median features detected                2407
scDblFinder - Number of doublets        290
scDblFinder - Percentage of doublets    8.29
DecontX - Mean contamination            0.097
DecontX - Median contamination          0.0662

The summary statistics table reports QC metrics for the cell matrix: the mean and median UMI counts per cell, the mean and median number of features (genes) detected per cell, the number and percentage of doublets called by scDblFinder, and the mean and median ambient RNA contamination estimated by DecontX.
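As a rough illustration of how such a table can be derived from per-cell values, the following Python sketch aggregates four per-cell vectors into the same kind of summary. The function name summarize_qc, the toy values, and the "Singlet"/"Doublet" labels are assumptions made for this example and are not taken from singleCellTK. The last line re-derives the doublet percentage from the two counts reported in the table.

```python
from statistics import mean, median


def summarize_qc(total_counts, features_detected, doublet_calls, contamination):
    """Aggregate per-cell QC vectors into dataset-level summary statistics."""
    n_cells = len(total_counts)
    n_doublets = sum(1 for call in doublet_calls if call == "Doublet")
    return {
        "Number of Cells": n_cells,
        "Mean counts": mean(total_counts),
        "Median counts": median(total_counts),
        "Mean features detected": mean(features_detected),
        "Median features detected": median(features_detected),
        "Number of doublets": n_doublets,
        "Percentage of doublets": round(100 * n_doublets / n_cells, 2),
        "Mean contamination": mean(contamination),
        "Median contamination": median(contamination),
    }


# Toy data for four hypothetical cells (not taken from this report).
summary = summarize_qc(
    total_counts=[1200, 5400, 800, 9600],
    features_detected=[600, 2100, 450, 3000],
    doublet_calls=["Singlet", "Doublet", "Singlet", "Singlet"],
    contamination=[0.05, 0.10, 0.02, 0.20],
)

# Consistency check on the reported table: 290 doublets among 3499 cells.
reported_pct = round(100 * 290 / 3499, 2)
```

With the reported counts, reported_pct agrees with the 8.29 shown in the table.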

3 General quality control metrics

SingleCellTK uses the scater package to compute cell-level QC metrics. The wrapper function runPerCellQC can be run on its own to compute these metrics, and the wrapper function plotRunPerCellQCResults can be used to plot the general QC outputs. The QC outputs are sum, detected, and percent_top_X. sum contains the total number of counts for each cell. detected contains the number of features detected (with non-zero counts) in each cell. percent_top_X contains the percentage of the total counts that is accounted for by the X most highly expressed genes in each cell. The subsets_ columns contain the same kind of information for each gene list that was supplied. For instance, if a gene list of mitochondrial genes named mito was used, subsets_mito_sum would contain the total number of mitochondrial counts for each cell.
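To make the metric definitions concrete, here is a minimal Python sketch that computes them on a tiny genes-by-cells count matrix. It is not the scater implementation: the function per_cell_qc, the gene names, and the counts are hypothetical, and the subsets_mito_detected and subsets_mito_percent keys are assumed here to mirror the mitochondrial plots listed in this section.

```python
def per_cell_qc(counts, gene_names, subsets=None, top_n=(50,)):
    """Compute per-cell QC metrics from a genes x cells count matrix.

    counts: list of rows, one row per gene and one column per cell.
    subsets: dict mapping a subset name (e.g. "mito") to a list of gene names.
    """
    subsets = subsets or {}
    n_cells = len(counts[0])
    results = []
    for j in range(n_cells):
        column = [row[j] for row in counts]
        total = sum(column)
        metrics = {
            "sum": total,  # total counts in the cell
            "detected": sum(1 for value in column if value > 0),
        }
        ranked = sorted(column, reverse=True)
        for n in top_n:
            top = sum(ranked[:n])
            metrics[f"percent_top_{n}"] = 100 * top / total if total else 0.0
        for name, genes in subsets.items():
            wanted = set(genes)
            values = [v for g, v in zip(gene_names, column) if g in wanted]
            subset_sum = sum(values)
            metrics[f"subsets_{name}_sum"] = subset_sum
            metrics[f"subsets_{name}_detected"] = sum(1 for v in values if v > 0)
            metrics[f"subsets_{name}_percent"] = (
                100 * subset_sum / total if total else 0.0
            )
        results.append(metrics)
    return results


# Hypothetical toy matrix: four genes (rows) by two cells (columns).
genes = ["MT-CO1", "MT-ND1", "ACTB", "GAPDH"]
counts = [
    [5, 0],   # MT-CO1
    [5, 2],   # MT-ND1
    [60, 8],  # ACTB
    [30, 0],  # GAPDH
]
qc = per_cell_qc(counts, genes, subsets={"mito": ["MT-CO1", "MT-ND1"]}, top_n=(2,))
```

In this toy input the first cell has 100 counts in total, of which 10 are mitochondrial, so its mitochondrial percentage is 10.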

3.1 Total Counts

3.2 Total Features

3.3 Percentage of Library Size Occupied by Top 50 Expressed Features

3.4 Total Mitochondrial Counts

3.5 Total Mitochondrial Features

3.6 Percentage of Mitochondrial Counts

3.7 Parameters

In runPerCellQC, the inSCE parameter is the input SingleCellExperiment object, and the useAssay parameter is the name of the assay within that object to use.

4 Doublet Detection

4.1 Doublet Detection Summary

4.1.1 scDblFinder

4.2 ScDblFinder

scDblFinder is a doublet detection algorithm provided by the scDblFinder package. It aims to detect doublets by simulating artificial doublets from existing cells and projecting them into the same PCA space as the real cells. The wrapper function runScDblFinder can be used to run the scDblFinder algorithm on its own. The wrapper function plotScDblFinderResults can be used to plot the QC outputs from the scDblFinder algorithm. The outputs of scDblFinder are scDblFinder_doublet_score and scDblFinder_doublet_call. The doublet score of a droplet is higher when the droplet is deemed more likely to be a doublet.
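The idea of simulating doublets and comparing real cells against them can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the scDblFinder algorithm: it skips the PCA step, works on raw proportions, and scores each cell by the share of simulated doublets among its nearest neighbors. All function names, the n_doublets and n_neighbors arguments, and the toy profiles are assumptions made for this example.

```python
import math
import random


def simulate_doublets(cells, n_doublets, rng):
    """Create artificial doublets by summing the profiles of two random cells."""
    doublets = []
    for _ in range(n_doublets):
        a, b = rng.sample(range(len(cells)), 2)
        doublets.append([x + y for x, y in zip(cells[a], cells[b])])
    return doublets


def normalize(profile):
    """Scale a profile to proportions so library size does not dominate distances."""
    total = sum(profile)
    return [value / total for value in profile] if total else list(profile)


def doublet_scores(cells, n_doublets=20, n_neighbors=3, seed=0):
    """Score each real cell by the share of simulated doublets among its neighbors."""
    rng = random.Random(seed)
    simulated = simulate_doublets(cells, n_doublets, rng)
    real = [normalize(c) for c in cells]
    artificial = [normalize(d) for d in simulated]
    # Label every point: False = real cell, True = simulated doublet.
    pool = [(p, False) for p in real] + [(p, True) for p in artificial]
    scores = []
    for i, cell in enumerate(real):
        # Distances to every other point, excluding the cell itself.
        others = [
            (math.dist(cell, p), is_sim)
            for k, (p, is_sim) in enumerate(pool)
            if k != i
        ]
        others.sort(key=lambda pair: pair[0])
        nearest = others[:n_neighbors]
        scores.append(sum(1 for _, is_sim in nearest if is_sim) / n_neighbors)
    return scores


# Hypothetical two-gene profiles: three cells of one type, three of another,
# and one cell that expresses both genes equally.
cells = [[10, 0], [20, 0], [30, 0], [0, 10], [0, 20], [0, 30], [15, 15]]
scores = doublet_scores(cells, n_doublets=20, n_neighbors=3, seed=0)
```

Each score lies between 0 and 1, and a higher value means that more of a cell's nearest neighbors are simulated doublets. Fixing the seed makes the simulated doublets, and therefore the scores, reproducible.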

4.2.1 sample_47053

4.2.1.1 ScDblFinder Doublet Assignment

4.2.1.2 ScDblFinder Doublet Score

4.2.1.3 Density Score

4.2.1.4 Violin Score

4.2.1.5 Parameters

The nNeighbors parameter is the number of nearest neighbors used to calculate the density for doublet detection. The simDoublets parameter sets the number of simulated doublets used for doublet detection.

5 Ambient RNA Detection

5.1 Ambient RNA Detection Summary

5.1.1 decontX

5.2 DecontX

In droplet-based single-cell technologies, ambient RNA released from apoptotic or damaged cells may be incorporated into other droplets and lead to contamination. decontX, available from the celda package, is a Bayesian method for estimating the level of contamination in each cell. The wrapper function runDecontX can be used to run the DecontX algorithm on its own. The wrapper function plotDecontXResults can be used to plot the QC outputs from the DecontX algorithm. The outputs of runDecontX are decontX_contamination and decontX_clusters. decontX_contamination is a numeric vector that gives the estimated level of contamination in each cell. Clustering is performed as part of the runDecontX algorithm. decontX_clusters is the resulting cluster assignment, which can also be labeled on the plot.
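The notion of a per-cell contamination level can be illustrated with a two-component mixture: a cell's observed counts are treated as a blend of a native expression profile and an ambient profile, mixed with an unknown fraction theta. The Python sketch below estimates theta by a simple grid search over a multinomial likelihood. It is not how decontX performs its inference, and it assumes both profiles are already known, which is not the case in practice. The function names and the toy profiles are hypothetical.

```python
import math


def mixture_log_likelihood(counts, native, ambient, theta):
    """Multinomial log-likelihood (up to a constant) under a two-part mixture."""
    total = 0.0
    for count, p_native, p_ambient in zip(counts, native, ambient):
        p = (1 - theta) * p_native + theta * p_ambient
        if count > 0:
            if p <= 0:
                # Observed counts for a gene the mixture cannot produce.
                return float("-inf")
            total += count * math.log(p)
    return total


def estimate_contamination(counts, native, ambient, steps=1000):
    """Grid-search the contamination fraction theta in [0, 1]."""
    best_theta, best_ll = 0.0, float("-inf")
    for i in range(steps + 1):
        theta = i / steps
        ll = mixture_log_likelihood(counts, native, ambient, theta)
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta


# Hypothetical three-gene profiles (each sums to 1).
native = [0.8, 0.2, 0.0]
ambient = [0.2, 0.3, 0.5]

# Counts built so that 10% of the expression comes from the ambient profile.
contaminated = [740, 210, 50]
# Counts that follow the native profile exactly.
clean = [80, 20, 0]

theta_contaminated = estimate_contamination(contaminated, native, ambient)
theta_clean = estimate_contamination(clean, native, ambient)
```

In this toy setting the third gene is absent from the native profile, so any counts observed for it can only be explained by a non-zero contamination fraction.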

5.2.1 sample_47053

5.2.1.1 DecontX Contamination Score

5.2.1.2 DecontX Clusters

5.2.1.3 Density Score

5.2.1.4 Violin Score

5.2.1.5 Parameters

Parameter         Value
z                 NULL
maxIter           500
delta             10 10
estimateDelta     TRUE
convergence       0.001
varGenes          5000
dbscanEps         1
logfile           NULL
verbose           TRUE
packageVersion    1.18.1




6 Session Information

## R version 4.3.0 (2023-04-21)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: CentOS Linux 7 (Core)
## 
## Matrix products: default
## BLAS/LAPACK: /common/software/install/migrated/openblas/0.3.5_gcc8.2.0_multiarch/lib/libopenblasp-r0.3.5.so;  LAPACK version 3.8.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: US/Central
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] cowplot_1.1.3               dplyr_1.1.4                
##  [3] ggplot2_3.5.0               singleCellTK_2.12.2        
##  [5] DelayedArray_0.28.0         SparseArray_1.2.4          
##  [7] S4Arrays_1.2.1              abind_1.4-5                
##  [9] Matrix_1.6-5                SingleCellExperiment_1.24.0
## [11] SummarizedExperiment_1.32.0 Biobase_2.62.0             
## [13] GenomicRanges_1.54.1        GenomeInfoDb_1.38.8        
## [15] IRanges_2.36.0              S4Vectors_0.40.2           
## [17] BiocGenerics_0.48.1         MatrixGenerics_1.14.0      
## [19] matrixStats_1.2.0          
## 
## loaded via a namespace (and not attached):
##  [1] bitops_1.0-7              gridExtra_2.3            
##  [3] rlang_1.1.3               magrittr_2.0.3           
##  [5] scater_1.30.1             compiler_4.3.0           
##  [7] DelayedMatrixStats_1.24.0 systemfonts_1.0.5        
##  [9] png_0.1-8                 vctrs_0.6.5              
## [11] reshape2_1.4.4            stringr_1.5.1            
## [13] pkgconfig_2.0.3           crayon_1.5.2             
## [15] fastmap_1.1.1             XVector_0.42.0           
## [17] labeling_0.4.3            scuttle_1.12.0           
## [19] utf8_1.2.4                rmarkdown_2.26           
## [21] ggbeeswarm_0.7.2          xfun_0.43                
## [23] zlibbioc_1.48.2           cachem_1.0.8             
## [25] beachmat_2.18.1           jsonlite_1.8.8           
## [27] highr_0.10                rhdf5filters_1.14.1      
## [29] Rhdf5lib_1.24.2           BiocParallel_1.36.0      
## [31] irlba_2.3.5.1             parallel_4.3.0           
## [33] R6_2.5.1                  bslib_0.7.0              
## [35] stringi_1.8.3             limma_3.58.1             
## [37] reticulate_1.35.0         jquerylib_0.1.4          
## [39] Rcpp_1.0.12               knitr_1.45               
## [41] R.utils_2.12.3            FNN_1.1.4                
## [43] eds_1.4.0                 tidyselect_1.2.1         
## [45] rstudioapi_0.15.0         yaml_2.3.8               
## [47] viridis_0.6.5             codetools_0.2-19         
## [49] lattice_0.21-8            tibble_3.2.1             
## [51] plyr_1.8.9                withr_3.0.0              
## [53] evaluate_0.23             GSVAdata_1.38.0          
## [55] xml2_1.3.5                pillar_1.9.0             
## [57] generics_0.1.3            RCurl_1.98-1.14          
## [59] sparseMatrixStats_1.14.0  munsell_0.5.1            
## [61] scales_1.3.0              glue_1.7.0               
## [63] tools_4.3.0               BiocNeighbors_1.20.2     
## [65] ScaledMatrix_1.10.0       locfit_1.5-9.9           
## [67] rhdf5_2.46.1              grid_4.3.0               
## [69] DropletUtils_1.22.0       edgeR_4.0.16             
## [71] colorspace_2.1-0          GenomeInfoDbData_1.2.11  
## [73] beeswarm_0.4.0            BiocSingular_1.18.0      
## [75] HDF5Array_1.30.1          vipor_0.4.7              
## [77] cli_3.6.2                 rsvd_1.0.5               
## [79] kableExtra_1.4.0          fansi_1.0.6              
## [81] viridisLite_0.4.2         svglite_2.1.3            
## [83] uwot_0.1.16               gtable_0.3.4             
## [85] R.methodsS3_1.8.2         sass_0.4.9               
## [87] digest_0.6.35             ggrepel_0.9.5            
## [89] dqrng_0.3.2               farver_2.1.1             
## [91] htmltools_0.5.8.1         R.oo_1.26.0              
## [93] lifecycle_1.0.4           statmod_1.5.0