Detects and removes doublets from a counts matrix using Scrublet.

perform.scrublet(counts = counts, expected_doublet_rate = 0.025)

Arguments

counts

Counts matrix

total_counts

Total number of cells. If NULL, this is counted automatically.

sim_doublet_ratio

Number of doublets to simulate, relative to the number of observed cells
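
As a rough illustration of what this ratio means, the sketch below simulates doublets by summing the counts of randomly chosen pairs of observed cells, which is the general idea behind Scrublet-style doublet simulation. The genes-by-cells orientation, the toy matrix, and all variable names are assumptions made for illustration; this is not the function's internal code.

```r
set.seed(1234)

# Toy counts matrix: 50 genes (rows) x 20 cells (columns).
# The genes-by-cells orientation is an assumption for this sketch.
n_genes <- 50
n_cells <- 20
counts <- matrix(rpois(n_genes * n_cells, lambda = 2),
                 nrow = n_genes, ncol = n_cells)

# With sim_doublet_ratio = 2, simulate twice as many doublets as observed cells
sim_doublet_ratio <- 2
n_sim <- as.integer(round(sim_doublet_ratio * n_cells))

# Pick two random parent cells for each simulated doublet
parent_a <- sample.int(n_cells, n_sim, replace = TRUE)
parent_b <- sample.int(n_cells, n_sim, replace = TRUE)

# A simulated doublet is the sum of its two parents' counts
sim_doublets <- counts[, parent_a, drop = FALSE] +
  counts[, parent_b, drop = FALSE]

dim(sim_doublets)  # 50 genes x 40 simulated doublets
```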

n_neighbors

Expected number of neighbours per cell

expected_doublet_rate

Expected fraction of doublets in the dataset (for example, 0.025 = 2.5%)

stdev_doublet_rate

Uncertainty in expected doublet rate

random_state

Random state for doublet simulation, approximate nearest neighbour search, and PCA/Truncated PCA

synthetic_doublet_umi_subsampling

Sampling rate for UMIs in a cell when synthesising doublets

use_approx_neighbors

Use the approximate nearest neighbor method (annoy) for the KNN classifier

distance_metric

Define distance metric for nearest neighbour calculation: 'angular', 'euclidean', 'manhattan', 'hamming', 'dot'.
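
To make the listed metrics concrete, the sketch below computes each one for a pair of toy vectors in base R. The formulas follow common definitions; the angular form, sqrt(2 * (1 - cosine)), follows the convention documented by Annoy, and 'dot' is an inner product, so larger values mean more similar rather than more distant. How this function forwards the metric internally is not shown here, and the vectors `u` and `v` are purely illustrative.

```r
# Two toy expression profiles (illustrative values only)
u <- c(1, 0, 2, 3)
v <- c(2, 1, 0, 3)

# Cosine similarity, used by the angular distance
cosine <- sum(u * v) / (sqrt(sum(u^2)) * sqrt(sum(v^2)))

distances <- c(
  angular   = sqrt(2 * (1 - cosine)),   # Euclidean distance between unit-normalised vectors
  euclidean = sqrt(sum((u - v)^2)),     # straight-line distance
  manhattan = sum(abs(u - v)),          # sum of absolute differences
  hamming   = sum(u != v),              # number of positions that differ
  dot       = sum(u * v)                # inner product (a similarity, not a distance)
)

distances
```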

get_doublet_neighbor_parents

Return the transcriptomes of the parent cells for simulated doublets

min_counts

Minimum counts per cell

min_cells

Minimum number of cells per gene

min_gene_variability_pctl

Variability cutoff when deducing highly variable genes prior to PCA reduction
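
A percentile cutoff of this kind can be sketched as follows. The example assumes the value is on a 0 to 100 percentile scale, as the parameter name suggests, and uses the Fano factor (variance divided by mean) as a simple stand-in for gene variability; the statistic actually used by the function may differ. The toy matrix and variable names are illustrative.

```r
set.seed(1234)

# Toy counts: 200 genes (rows) x 30 cells (columns), genes with different mean levels
gene_lambda <- runif(200, min = 0.5, max = 5)
counts <- matrix(rpois(200 * 30, lambda = gene_lambda), nrow = 200, ncol = 30)

# Per-gene variability, here the Fano factor (stand-in statistic, an assumption)
gene_mean <- rowMeans(counts)
gene_var <- apply(counts, 1, var)
fano <- ifelse(gene_mean > 0, gene_var / gene_mean, 0)

# Keep genes whose variability lies above the chosen percentile
min_gene_variability_pctl <- 85
cutoff <- quantile(fano, probs = min_gene_variability_pctl / 100)
highly_variable <- fano > cutoff

# Only the highly variable genes would be passed on to PCA
counts_hvg <- counts[highly_variable, , drop = FALSE]
nrow(counts_hvg)
```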

log_transform

Log transforms the data

mean_center

Should the dataset be centred around the mean?

normalize_variance

Should each gene be scaled to a variance of 1?
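
The three preprocessing switches (log_transform, mean_center, normalize_variance) can be pictured with the base R sketch below. It assumes a genes-by-cells matrix, a log(1 + x) transform, and per-gene centring and scaling; these are illustrative choices and not necessarily what the function does internally.

```r
set.seed(1234)

# Toy counts: 40 genes (rows) x 10 cells (columns); orientation is an assumption
counts <- matrix(rpois(40 * 10, lambda = 3), nrow = 40, ncol = 10)

log_transform <- TRUE
mean_center <- TRUE
normalize_variance <- TRUE

x <- counts
if (log_transform) {
  x <- log1p(x)  # log(1 + x); the pseudocount of 1 is an assumption
}

# scale() operates on columns, so transpose to cells x genes first.
# With center = TRUE and scale = TRUE, each gene ends up with mean 0 and
# standard deviation 1 across cells.
x <- scale(t(x), center = mean_center, scale = normalize_variance)

dim(x)  # 10 cells x 40 genes
```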

n_prin_comps

Number of principal components to retain

svd_solver

Character. Which SVD solver to use: 'auto', 'full', 'arpack', 'randomized'.

print.plot

Logical. Should doublet plots be printed? Default = FALSE

verbose

Logical. Should function information be printed to the terminal? Default = FALSE

seed

Numerical. Which seed should be set? Default = 1234

save.plot

Logical. Should the automatically generated plot be saved? Default = TRUE

Value

Doublet-omitted sparse matrix

Examples


object <- perform.scrublet(counts = counts)
#> Error in is(object = counts, class2 = "matrix"): object 'counts' not found
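
The error above arises because no object named `counts` exists when the example runs. A minimal sketch of defining one first is shown below; the toy matrix, its dimensions, and the gene and cell names are illustrative assumptions, and real single-cell data would be needed for meaningful doublet calls. The call is guarded so the snippet also runs where `perform.scrublet` is not loaded.

```r
set.seed(1234)

# Toy counts matrix: 100 genes (rows) x 50 cells (columns), illustrative only
counts <- matrix(
  rpois(100 * 50, lambda = 1),
  nrow = 100, ncol = 50,
  dimnames = list(paste0("gene", 1:100), paste0("cell", 1:50))
)

# The same class check that appears in the error message now succeeds
methods::is(object = counts, class2 = "matrix")

# Run doublet removal only if the function is available in this session
if (exists("perform.scrublet")) {
  object <- perform.scrublet(counts = counts, expected_doublet_rate = 0.025)
}
```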