Python module: scrublet — perform.scrublet • IBRAP

Identifies predicted doublets with the Python module scrublet and removes them from the counts matrix.

perform.scrublet(counts = counts, expected_doublet_rate = 0.025)

Arguments

counts: Counts matrix
total_counts: Total UMI counts per cell. NULL = calculated automatically from the counts matrix.
sim_doublet_ratio: Number of doublets to simulate relative to the number of observed cells
n_neighbors: Number of nearest neighbours used to build the KNN graph of observed cells and simulated doublets
expected_doublet_rate: Expected fraction of doublets in the dataset (e.g. 0.025 = 2.5%)
stdev_doublet_rate: Uncertainty in expected doublet rate
random_state: Random state for doublet simulation, approximate nearest neighbour search, and PCA/Truncated PCA
synthetic_doublet_umi_subsampling: Sampling rate for UMIs in a cell when synthesising doublets
use_approx_neighbors: Use the approximate nearest neighbour method (annoy) for the KNN classifier
distance_metric: Define distance metric for nearest neighbour calculation: 'angular', 'euclidean', 'manhattan', 'hamming', 'dot'.
get_doublet_neighbor_parents: Return the transcriptomes of the parent cells for simulated doublets
min_counts: Minimum counts per cell
min_cells: Minimum number of cells per gene
min_gene_variability_pctl: Percentile cutoff on gene variability used to select highly variable genes prior to PCA reduction
log_transform: Should the data be log-transformed
mean_center: Should each gene be centred around its mean
normalize_variance: Should each gene be scaled to a variance of 1
n_prin_comps: Number of principal components to retain
svd_solver: Character. Which SVD solver to use: 'auto', 'full', 'arpack', 'randomized'.
print.plot: Logical. Should doublet plots be printed? Default = FALSE
verbose: Logical. Should function information be printed to the terminal? Default = FALSE
seed: Numerical. Which seed should be set? Default = 1234
save.plot: Boolean. Should the automatically generated plot be saved? Default = TRUE
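
To make sim_doublet_ratio and synthetic_doublet_umi_subsampling more concrete, the base R sketch below mimics the general idea behind doublet simulation: pairs of observed cells are drawn at random and their counts are added, optionally followed by UMI subsampling. It is a toy illustration, not the code that perform.scrublet or scrublet runs. The function name simulate_doublets, its argument umi_subsampling, and the genes-in-rows, cells-in-columns orientation are assumptions made for this example.

```r
# Toy illustration only; simulate_doublets() is a hypothetical helper,
# not part of IBRAP or scrublet.
simulate_doublets <- function(counts, sim_doublet_ratio = 2,
                              umi_subsampling = 1, seed = 1234) {
  set.seed(seed)
  n_obs <- ncol(counts)  # assumption: genes in rows, cells in columns
  n_sim <- as.integer(round(n_obs * sim_doublet_ratio))

  # Draw two random parent cells for every synthetic doublet.
  parents <- cbind(sample.int(n_obs, n_sim, replace = TRUE),
                   sample.int(n_obs, n_sim, replace = TRUE))

  # A synthetic doublet is the sum of its two parents' counts.
  doublets <- counts[, parents[, 1], drop = FALSE] +
    counts[, parents[, 2], drop = FALSE]

  if (umi_subsampling < 1) {
    # Keep each UMI independently with probability umi_subsampling.
    doublets[] <- rbinom(length(doublets), size = doublets,
                         prob = umi_subsampling)
  }
  list(doublets = doublets, parents = parents)
}

# 20 genes x 10 cells of Poisson-distributed toy counts.
toy_counts <- matrix(rpois(20 * 10, lambda = 2), nrow = 20, ncol = 10)
sim <- simulate_doublets(toy_counts, sim_doublet_ratio = 2)
dim(sim$doublets)  # 20 genes x 20 simulated doublets
```

With sim_doublet_ratio = 2 the sketch produces twice as many synthetic doublets as there are observed cells; lowering umi_subsampling below 1 thins the summed counts.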

Value

Doublet-omitted sparse matrix
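
As a rough picture of what "doublet-omitted" means, the sketch below drops the cells flagged as doublets from a toy matrix. The object predicted_doublet is a hypothetical per-cell flag invented for this example, and the cells-in-columns orientation is an assumption; how perform.scrublet stores or applies its calls internally is not documented here.

```r
# Hypothetical inputs: a toy counts matrix and one doublet flag per cell.
counts <- matrix(1:12, nrow = 3, ncol = 4,
                 dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4)))
predicted_doublet <- c(FALSE, TRUE, FALSE, FALSE)

# Keep only the cells that were not called as doublets.
singlets <- counts[, !predicted_doublet, drop = FALSE]
colnames(singlets)
#> [1] "cell1" "cell3" "cell4"
```

The same column indexing also works on sparse matrices from the Matrix package.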

Examples


object <- perform.scrublet(counts = counts)
#> Error in is(object = counts, class2 = "matrix"): object 'counts' not found
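
The error above arises because no object named counts exists when the example is run. A minimal sketch of preparing an input first is given below. The toy matrix, its dimensions, and the genes-in-rows, cells-in-columns orientation are assumptions for illustration; the final call is left commented out because it needs IBRAP and the Python module scrublet to be installed, and such a small random matrix may not suit the default settings.

```r
# Toy counts matrix; assumption: genes in rows, cells in columns.
set.seed(1234)
counts <- matrix(rpois(500 * 200, lambda = 1), nrow = 500, ncol = 200,
                 dimnames = list(paste0("gene", 1:500),
                                 paste0("cell", 1:200)))

# Basic sanity checks before handing the matrix over:
# a matrix of non-negative whole numbers.
stopifnot(is.matrix(counts),
          all(counts >= 0),
          all(counts == round(counts)))

# With IBRAP and scrublet available, the documented call would then be:
# object <- perform.scrublet(counts = counts, expected_doublet_rate = 0.025)
```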