API

Pre-build pipelines

The following wrapper function creates plots to diagnose the spatial kernel used in SpatialCorr’s statistical tests.

spatialcorr.kernel_diagnostics(adata, cond_key, bandwidth, contrib_thresh=10, row_key='row', col_key='col', dsize=12, fpath=None, fformat='pdf', dpi=150)

Create plot to visualize the spatial kernel used for SpatialCorr’s statistical analyses.

This function will plot the following analyses: Top left: The annotated regions/clusters Bottom left: The kernel weights at a randomly chosen spot (i.e., a row of the kernel matrix) Top middle: The effective number of samples used to estimate correlation at each spot (i.e., the sum of each row of the kernel matrix) Bottom middle: The spots that would be filtered when applying an effective spots threshold of contrib_thresh (shown in grey) Top right: A distribution of the effective number of samples used to estimate correlation at each spot across the entire slide. The red verticle line shows the effective samples threshold set by contrib_thresh

Parameters
adataAnnData

spatial gene expression dataset with spatial coordinates stored in adata.obs

cond_keystring

the name of the column in adata.obs storing the cluster assignments

bandwidthint

the kernel bandwidth used by the test

contrib_threshint, optional (default: 10)

threshold for the total weight of all samples contributing to the correlation estimate at each spot. Spots with total weight less than this value will be filtered prior to running the test

row_keystring, optional (default: ‘row’)

the name of the column in adata.obs storing the row coordinates of each spot

col_keystring, optional (default: ‘col’)

the name of the column in adata.obs storing the column coordinates of each spot

dsize: int, optional (default: 12)

the size of the dots in the scatterplot

fpath: string, optional (default: None)

Path to write figure image.

fformat: string, {‘pdf’, ‘png’} (default: ‘pdf’)

Format of the output figure file.

dpi: int (default: 150)

Resolution of output image.

Returns
None

It outputs the following multi-panel figure:

The following wrapper function implements a full analysis pipeline for investigating spatially varying correlation between a pair of genes.

spatialcorr.analysis_pipeline_pair(gene_1, gene_2, adata, bandwidth, cond_key, row_key='row', col_key='col', reject_thresh=0.05, dsize=12, max_perms=500, n_procs=5, contrib_thresh=10, verbose=1, fig_path=None, fig_format='pdf', dpi=150, only_stats=False)

Run a SpatialCorr analysis pipeline on a pair of genes.

This function will run the following analyses: 1. Compute spotwise kernel estimates of correlation 2. Compute confidence intervals (CIs) of correlation at each spot compute spots where CI does not overlap zero (i.e. putative regions with non-zero correlation) 3. For each cluster, compute a WR P-value 4. Remove all clusters with WR P-value < reject_thresh for BR-test and for remaining clusters, compute BR P-value testing for differential correlation between the two clusters

Parameters
gene_1: string

The first gene of the pair to analyze

gene_2: string

The second gene of the pair to analyze

adataAnnData

spatial gene expression dataset with spatial coordinates stored in adata.obs

bandwidthint

the kernel bandwidth used by the test

cond_keystring

the name of the column in adata.obs storing the cluster assignments

row_keystring, optional (default: ‘row’)

the name of the column in adata.obs storing the row coordinates of each spot

col_keystring, optional (default: ‘col’)

the name of the column in adata.obs storing the column coordinates of each spot

reject_thresh: float (default: 0.05)

P-value threshold used to reject the null hypothesis for each region’s WR-test as well as region-pairwise BR-tests.

dsize: int, optional (default: 12)

the size of the dots in the scatterplot

max_permsint, optional (default: 500)

Maximum number of permutations to compute for the permutation test

n_procsint, optional (default: 1)

number of processes to run in parallel

verboseint, optional (default: 1)

the verbosity. Higher verbosity will lead to more debugging information printed to standard output

contrib_threshint, optional (default: 10)

threshold for the total weight of all samples contributing to the correlation estimate at each spot. Spots with total weight less than this value will be filtered prior to running the test

fig_path: string, optional (default: None)

Path to write figure image.

fig_format: string, {‘pdf’, ‘png’} (default: ‘pdf’)

Format of the output figure file.

dpi: int (default: 150)

Resolution of output image.

Returns
None

It outputs the following multi-panel figure:

The following wrapper function implements a full analysis pipeline for investigating spatially varying correlation between a set of genes.

spatialcorr.analysis_pipeline_set(genes, adata, cond_key, row_key='row', col_key='col', reject_thresh=0.05, dsize=12, bandwidth=5, max_perms=500, n_procs=5, run_br=False, spot_to_neighbors=None, spot_to_neighbors_clust=None, contrib_thresh=10, verbose=1, fig_path=None, fig_format='pdf', dpi=150)

Run a SpatialCorr analysis pipeline on a set of genes.

This function will run the following analyses: 1. For each cluster, compute a WR P-value 2. Remove all clusters with WR P-value < reject_thresh for BR-test and for remaining clusters, compute BR P-value testing for differential correlation between the two clusters

Parameters
genes: List

List of genes in the gene set

adataAnnData

spatial gene expression dataset with spatial coordinates stored in adata.obs

bandwidthint

the kernel bandwidth used by the test

cond_keystring

the name of the column in adata.obs storing the cluster assignments

row_keystring, optional (default: ‘row’)

the name of the column in adata.obs storing the row coordinates of each spot

col_keystring, optional (default: ‘col’)

the name of the column in adata.obs storing the column coordinates of each spot

reject_thresh: float (default: 0.05)

P-value threshold used to reject the null hypothesis for each region’s WR-test as well as region-pairwise BR-tests.

dsize: int, optional (default: 12)

the size of the dots in the scatterplot

max_permsint, optional (default: 500)

Maximum number of permutations to compute for the permutation test

n_procsint, optional (default: 1)

number of processes to run in parallel

verboseint, optional (default: 1)

the verbosity. Higher verbosity will lead to more debugging information printed to standard output

contrib_threshint, optional (default: 10)

threshold for the total weight of all samples contributing to the correlation estimate at each spot. Spots with total weight less than this value will be filtered prior to running the test

fig_path: string, optional (default: None)

Path to write figure image.

fig_format: string, {‘pdf’, ‘png’} (default: ‘pdf’)

Format of the output figure file.

dpi: int (default: 150)

Resolution of output image.

Returns
None

It outputs the following multi-panel figure:

Statistical

spatialcorr.run_test(adata, test_genes, bandwidth, run_br=False, cond_key=None, contrib_thresh=10, row_key='row', col_key='col', precomputed_kernel=None, verbose=1, n_procs=1, compute_spotwise_pvals=True, standardize_var=False, max_perms=10000, mc_pvals=True, spot_to_neighbors=None, alpha=0.05, compute_gene_pair_pvals=False, gene_pair_perms=100)

Run the SpatialCorr statistical test to identify spatially varying correlation for a given set of genes.

Parameters
adataAnnData

Spatial gene expression dataset with spatial coordinates stored in adata.obs.

test_geneslist

List of gene names for which to test for spatially varying correlation.

bandwidthint

The kernel bandwidth used by the test.

run_br: boolean, default: False

If False, run the WHR-test. If True, run the BHR-test

cond_keystring

The name of the column in adata.obs storing the cluster assignments.

contrib_threshinteger, optional (default: 10)

Threshold for the total weight of all samples contributing to the correlation estimate at each spot. Spots with total weight less than this value will be filtered prior to running the test.

row_keystring, optional (default: ‘row’)

The name of the column in adata.obs storing the row coordinates of each spot.

col_keystring, optional (default: ‘col’)

The name of the column in adata.obs storing the column coordinates of each spot.

verboseint, optional (default: 1)

The verbosity. Higher verbosity will lead to more debugging information printed to standard output.

n_procsint, optional (default: 1)

Number of processes to run in parallel.

max_permsint, optional (default: 10000)

Maximum number of permutations to compute for the permutation test.,

mc_pvalsboolean, optional (default: True)

If True, use Sequential Monte Carlo P-values. If False, use max_perms number of permutations.

Returns
p_val: float

A permutation p-value for the log-likelihood ratio test.

additional: dict

A dictionary of additional information computed during the test. If run_br is False, the region-specific p-values are located in additional[‘region_to_p_val’]. The FDR-adjusted p-values (via Benjamini Hochberg) are stored in additional[‘region_to_adj_p_val’].

spatialcorr.run_test_between_region_pairs(adata, test_genes, bandwidth, cond_key, contrib_thresh=10, row_key='row', col_key='col', verbose=1, n_procs=1, standardize_var=False, max_perms=10000, mc_pvals=True, spot_to_neighbors=None, run_regions=None, clust_size_lim=0)

Run the SpatialCorr BR-test between very pair of regions on the slide.

Parameters
adataAnnData

Spatial gene expression dataset with spatial coordinates stored in adata.obs.

test_geneslist

List of gene names for which to test for spatially varying correlation.

bandwidthint

The kernel bandwidth used by the test.

cond_keystring

The name of the column in adata.obs storing the cluster assignments.

contrib_threshinteger, optional (default: 10)

Threshold for the total weight of all samples contributing to the correlation estimate at each spot. Spots with total weight less than this value will be filtered prior to running the test.

row_keystring, optional (default: ‘row’)

The name of the column in adata.obs storing the row coordinates of each spot.

col_keystring, optional (default: ‘col’)

The name of the column in adata.obs storing the column coordinates of each spot.

verboseint, optional (default: 1)

The verbosity. Higher verbosity will lead to more debugging information printed to standard output.

n_procsint, optional (default: 1)

number of processes to run in parallel

standardize_var: Boolean (default: False)

If true, standardize the variance between regions (in additon to the means) before running the BR-test.

max_permsint, optional (default: 10000)

Maximum number of permutations to compute for the permutation test.

mc_pvalsboolean, optional (default: True)

If True, use Sequential Monte Carlo P-values. If False, use max_perms number of permutations.

Returns
reg_to_reg_to_pval: dictionary

A dictionary of dictionaries mapping each region-pair to its pairwise BR-test p-value.

spatialcorr.est_corr_cis(gene_1, gene_2, adata, bandwidth, cond_key, precomputed_kernel=None, confidence_interval=0.95, spot_to_neighs=None, neigh_thresh=10, n_boots=100, row_key='row', col_key='col')

Compute approximate confidence intervals around the kernel estimates of spot wise correlation.

Parameters
gene_1: string

Name or id of first gene.

gene_2: string

Name or id of second gene.

adataAnnData

Spatial gene expression dataset with spatial coordinates stored in adata.obs.

bandwidthint

The kernel bandwidth used for the kernel estimates of correlation at each spot.

cond_keystring

The name of the column in adata.obs storing the cluster assignments.

precomputed_kernelArray (default: None)

An NxN array storing a precomputed kernel matrix, where N is the number of spots. If None a kernel will be computed using the bandwidth parameter and conditioning on cond_key.

confidence_intervalfloat (default: 0.95)

Confidence interval to compute for each spot.

spot_to_neighs: dict, optional (default: None)

A dictionary mapping each spot to a list of neighboring spots. If not provided, this will be computed automatically.

neigh_threshinteger, optional (default: 10)

Threshold for the total number of neighbors contributing to the correlation estimate at each spot. Spots with total neighbors less than this value will be filtered prior to running the test.

row_keystring, optional (default: ‘row’)

The name of the column in adata.obs storing the row coordinates of each spot.

col_keystring, optional (default: ‘col’)

The name of the column in adata.obs storing the column coordinates of each spot.

Returns
cis: list

A list of pairs, one for each kept spot after filtering, storing the confidence interval boundaries.

keep_inds: list

A list of kept indices after applying the effective-neighbors threshold. The confidence intervals in cis correspond to these spots.

Plotting

spatialcorr.plot.plot_correlation(adata, gene_1, gene_2, bandwidth=5, contrib_thresh=10, kernel_matrix=None, row_key='row', col_key='col', condition=None, cmap='RdBu_r', colorbar=True, ticks=True, ax=None, figure=None, dsize=10, estimate='local', title=None, spot_borders=False, border_color='black', border_size=0.3, fig_path=None, fig_format='pdf', fig_dpi=150)

Plot the slide with each spot colored by the correlation between two genes.

Parameters
adataAnnData

Spatial gene expression dataset with spatial coordinates stored in adata.obs.

gene_1string

The name or ID of the first gene.

gene_2string

The name or ID of the second gene.

estimatestring, optional (default

One of {‘local’, ‘regional’}. The estimation method used to estimate the correlation at each spot. If ‘local’, use Gaussian kernel estimation. If ‘regional’, use all of the spots in the given spot’s histological region.

kernel_matrixndarray, optional (default

NxN matrix representing the spatial kernel (i.e., pairwise weights between spatial locations). If not provided, one will be computed using the bandwidth and contrib_thresh arguments.

bandwidthint, optional (default

The kernel bandwidth used by the test. Only applied if estimate is set to ‘local’. Only applied if kernel_matrix is not provided.

contrib_threshinteger, optional (default: 10)

Threshold for the total weight of all samples contributing to the correlation estimate at each spot. Spots with total weight less than this value will be filtered. Only applied if estimate is set to ‘local’. Only applied if kernel_matrix is not provided.

row_keystring, optional (default

The name of the column in adata.obs storing the row coordinates of each spot.

col_keystring, optional (default

The name of the column in adata.obs storing the column coordinates of each spot.

conditionstring, optional (default

The name of the column in adata.obs storing the cluster assignments.

cmapstring (default

The colormap to use to color the spots.

colorbarboolean (default

If True, plot the colorbar next to the figure.

ticksboolean (default: True)

If True, show tickmarks along x and y axes indicated spatial coordinates.

dsizeint (default

The size of the dots in the scatterplot.

titlestring (default

The plot title.

spot_bordersboolean (default

If True, draw a border line around each spot.

border_colorstring (default

The color of the border line around each spot. Only used if spot_borders is True.

border_sizefloat (default

The thickness of the border line around each spot. Only used if spot_borders is True.

ticksboolean (default: True)

If True, show tickmarks along x and y axes indicated spatial coordinates.

fig_pathstring, optional (default

Path to save figure as file.

fig_formatstring, optional (default

File format to save figure.

fig_dpistring, optional (default

Resolution of figure.

Returns
None

spatialcorr.plot.plot_ci_overlap(adata, gene_1, gene_2, cond_key='cluster', kernel_matrix=None, bandwidth=5, row_key='row', col_key='col', title=None, ax=None, figure=None, ticks=False, dsize=12, colorticks=None, neigh_thresh=10, fig_path=None, fig_format='pdf', fig_dpi=150)

Plot the spots and color each spot whether the 95% confidence interval of the Guassian estimate of correlation overlaps zero (computed using the bootstrap with 100 hundred sampels). A spot is colored red if the CI lies entirely above zero, blue if the CI lies entirely below zero, and grey if the CI overlaps zero.

Parameters
adataAnnData

Spatial gene expression dataset with spatial coordinates stored in adata.obs.

gene_1string

The name or ID of the first gene.

gene_2string

The name or ID of the second gene.

kernel_matrixndarray, optional (default

NxN matrix representing the spatial kernel (i.e., pairwise weights between spatial locations)

bandwidthint, optional (default

The kernel bandwidth used by the test. Only applied if estimate is set to ‘local’. Only applied if kernel_matrix is set to None.

neigh_threshinteger, optional (default: 10)

Threshold for the total number of neighbors contributing to the correlation estimate at each spot. Spots with total neighbors less than this value will be filtered prior to running the test.

row_keystring, optional (default: ‘row’)

The name of the column in adata.obs storing the row coordinates of each spot.

col_keystring, optional (default: ‘col’)

The name of the column in adata.obs storing the column coordinates of each spot.

cond_keystring (default

The name of the column in adata.obs storing the cluster assignments.

ticksboolean (default: True)

If True, show tickmarks along x and y axes indicated spatial coordinates.

dsizeint (default

The size of the dots in the scatterplot.

titlestring (default

The plot title.

fig_pathstring, optional (default

Path to save figure as file.

fig_formatstring, optional (default

File format to save figure.

fig_dpistring, optional (default

Resolution of figure.

Returns
None

spatialcorr.plot.plot_local_scatter(adata, gene_1, gene_2, row, col, plot_vals, color_spots=None, condition=None, vmin=None, vmax=None, row_key='row', col_key='col', cmap='RdBu_r', neighb_color='black', plot_neigh=True, width=10, height=5, dsize=15, line_color='black', scatter_xlim=None, scatter_ylim=None, scatter_xlabel=None, scatter_ylabel=None, scatter_title=None, fig_path=None, fig_format='pdf', fig_dpi=150)

Plot the spots colored according to some specified values and, for a given spot, plot the expression scatterplot between two genes in the neighborhood of the given spot. Also draws an ordinary least squares regression line atop this scatterplot.

Parameters
adataAnnData

Spatial gene expression dataset with spatial coordinates stored in adata.obs.

gene_1string

The name or ID of the first gene.

gene_2string

The name or ID of the second gene.

rowint

The row-coordinate to center the neighborhood.

colint

The column-coordinate to center the neighborhood.

plot_valsndarray

An N-length array of values used to color each spot where N is the total number of spots (i.e., length of adata).

row_keystring, optional (default: ‘row’)

The name of the column in adata.obs storing the row coordinates of each spot.

col_keystring, optional (default: ‘col’)

The name of the column in adata.obs storing the column coordinates of each spot.

conditionstring, optional (default

The name of the column in adata.obs storing the cluster assignments.

vminfloat, optional (default

Minimum value used to color the spots (i.e., the lower limit of the colors).

vmaxfloat, optional (default

Maximum value used to color the spots (i.e., the lower limit of the colors).

cmapstring, optional (default

The colormap to use to color the spots.

plot_neighboolean, optional (default

If True, outline the spots that are included in the neighborhood.

neighb_colorstring (default

Color used to color the neighborhood of spots on the slide. Only applied if plot_neigh is True.

widthfloat, optional (default

Figure width.

heightfloat, optional (default

Figure height.

dsizefloat, optional (default

Size of each spot.

line_colorstring, optional (default

Color used for the regression line.

scatter_xlimfloat, optional (default

X-axis limits of regression plot.

scatter_ylimfloat, optional (default

Y-axis limits of regression plot.

scatter_xlabelstring, optional (default

X-axis label for regression plot.

scatter_ylabelstring, optional (default

Y-axis label for regression plot.

scatter_titlestring, optional (default

Title for regression plot.

fig_pathstring, optional (default

Path to save figure as file.

fig_formatstring, optional (default

File format to save figure.

fig_dpistring, optional (default

Resolution of figure.

Returns
None

spatialcorr.plot.region_scatterplots(gene_1, gene_2, adata, cond_key, row_key='row', col_key='col', xlim=None, ylim=None, fig_path=None, fig_format='png', fig_dpi=150)

For a given pair of genes, plot the scatterplot of expression values of these two genes for each histological region.

Parameters
adataAnnData

Spatial gene expression dataset with spatial coordinates stored in adata.obs.

gene_1string

The name or ID of the first gene.

gene_2string

The name or ID of the second gene.

cond_keystring, optional (default

The name of the column in adata.obs storing the cluster assignments.

row_keystring, optional (default

The name of the column in adata.obs storing the row coordinates of each spot.

col_keystring, optional (default

The name of the column in adata.obs storing the column coordinates of each spot.

cond_keystring, optional (default

The name of the column in adata.obs storing the cluster assignments.

xlimtuple, optional (default: None)

The x-axis limits for each scatterplot.

ylimtuple, optional (default: None)

The y-axis limits for each scatterplot.

fig_pathstring, optional (default

The path to the file to which to save the figure.

fig_formatstring, optional (default

File format to save figure.

fig_dpistring, optional (default

Resolution of figure.

Returns
——
None

spatialcorr.plot.mult_genes_plot_correlation(adata, plot_genes, cond_key, estimate='local', bandwidth=5, kernel_matrix=None, contrib_thresh=10, row_key='row', col_key='col', dsize=7, fig_path=None, fig_format='png', fig_dpi=150)

Create a grid of plots for displaying the correlations between pairs of genes across all spots. That is, each spot in the grid displays the spot-specific correlation between a given pair of genes.

Parameters
adataAnnData

Spatial gene expression dataset with spatial coordinates stored in adata.obs.

plot_geneslist

List of gene names or IDs. This function will consider the spot-specific correlation for every pair of genes in this list.

estimatestring, optional (default

One of {‘local’, ‘regional’, ‘local_ci’}. The estimation method used to estimate the correlation at each spot. If ‘local’, use Gaussian kernel estimation. If ‘regional’, use all of the spots in the given spot’s histological region. If ‘local_ci’ is used, then each spot will be colored based on whether the 95% confidence interval of the Gaussian kernel estimate overlaps zero.

kernel_matrixndarray, optional (default

NxN matrix representing the spatial kernel (i.e., pairwise weights between spatial locations). If not provided, one will be computed using the bandwidth and contrib_thresh arguments. Only applied if estimate is set to ‘local’ or ‘local_ci’.

bandwidthint, optional (default

The kernel bandwidth used by the test. Only applied if estimate is set to ‘local’. Only applied if kernel_matrix is not provided and estimate is set to ‘local’ or ‘local_ci’.

contrib_threshinteger, optional (default: 10)

Threshold for the total weight of all samples contributing to the correlation estimate at each spot. Spots with total weight less than this value will be filtered. Only applied if estimate is set to ‘local’. Only applied if kernel_matrix is not provided and estimate is set to ‘local’ or ‘local_ci’.

row_keystring, optional (default

The name of the column in adata.obs storing the row coordinates of each spot.

col_keystring, optional (default

The name of the column in adata.obs storing the column coordinates of each spot.

dsizeint, optional (default

The size of the dots in each plot.

fig_pathstring, optional (default

Path to save figure as file.

fig_formatstring, optional (default

File format to save figure.

fig_dpistring, optional (default

Resolution of figure.

Returns
None

spatialcorr.plot.cluster_pairwise_correlations(adata, plot_genes, cond_key, bandwidth=5, row_key='row', col_key='col', color_thresh=19, title=None, remove_y_ticks=False, fig_path=None, fig_size=(6, 4), fig_format='png', fig_dpi=150)

Cluster the patterns of correlations across all spots between pairs of genes. Plot a dendrogram of the clustering. Each leaf in the dendrogram represents a single pair of genes. Two pairs will cluster together if their pattern of correlation, across all of the spots, are similar.

Parameters
adataAnnData

Spatial gene expression dataset with spatial coordinates stored in adata.obs.

plot_geneslist

List of gene names or IDs. This function will consider the spot-specific correlation for every pair of genes in this list.

color_threshfloat, optional, default: 19

The value along the y-axis of the dendrogram to use as a threshold for coloring the subclusters. The sub-dendrograms below this threshold will be given unique colors. The part of the dendrogram lying above this threshold will be colored grey.

row_keystring, optional (default

The name of the column in adata.obs storing the row coordinates of each spot.

col_keystring, optional (default

The name of the column in adata.obs storing the column coordinates of each spot.

cond_keystring, optional (default

The name of the column in adata.obs storing the cluster assignments.

fig_pathstring, optional (default

The path to the file to which to save the figure.

fig_sizetuple, optional (default

Figure height and width.

fig_formatstring, optional (default

File format to save figure.

fig_dpistring, optional (default

Resolution of figure.

Returns
——
None

spatialcorr.plot.plot_filtered_spots(adata, kernel_matrix, contrib_thresh, row_key='row', col_key='col', ax=None, figure=None, dsize=37, ticks=True, fig_path=None, fig_format='pdf', fig_dpi=150)

Plot the slide with spots colored according to whether they would be filtered according to the effective-neighbors filter. The effective-neighbors filter removes spots for which the sum of the weights applied to neighboring spots, according to the Gaussian kernel, do not exceed a specified threshold.

Parameters
adataAnnData

Spatial gene expression dataset with spatial coordinates stored in adata.obs.

kernel_matrixndarray

NxN matrix representing the spatial kernel (i.e., pairwise weights between spatial locations)

contrib_threshinteger, optional (default: 10)

Threshold for the total weight of all samples contributing to the correlation estimate at each spot. Spots with total weight less than this value will be filtered.

row_keystring, optional (default: ‘row’)

The name of the column in adata.obs storing the row coordinates of each spot.

col_keystring, optional (default: ‘col’)

The name of the column in adata.obs storing the column coordinates of each spot.

axAxis (default: None)

Draw plot on provided Matplotlib Axis.

figureFigure (default

Draw plot on provided Matplotlib Figure.

dsizeint (default

The size of the dots in the scatterplot.

ticksboolean (default: True)

If True, show tickmarks along x and y axes indicated spatial coordinates.

fig_pathstring, optional (default

Path to save figure as file.

fig_formatstring, optional (default

File format to save figure.

fig_dpistring, optional (default

Resolution of figure.

Returns
None

spatialcorr.plot.plot_slide(df, values, cmap='viridis', colorbar=False, vmin=None, vmax=None, title=None, ax=None, figure=None, ticks=True, dsize=37, colorticks=None, row_key='row', col_key='col', cat_palette=None, spot_borders=False, border_color='black', border_size=0.3)

Plot the slide with each spot colored according to a specified set of values.

Parameters
dfDataFrame

A pandas DataFrame storing the coordinates for each spot.

valuesndarray

An N-length array of values, corresponding to the N spots, that should be used to color each spot.

row_keystring, optional (default: ‘row’)

The name of the column in adata.obs storing the row coordinates of each spot.

col_keystring, optional (default: ‘col’)

The name of the column in adata.obs storing the column coordinates of each spot.

cmapstring, optional (default

The colormap to use to color the spots. If the values array of values are discrete categories, then one can supply the argument categorical.

cat_palette, optional (default

A palette (list) of colors to use for coloring categorical values. Only applied if cmap is set to ‘categorical’.

colorbarboolean, optional (default

If True, plot the colorbar next to the figure.

ticksboolean (default: True)

If True, show tickmarks along x and y axes indicated spatial coordinates.

dsizeint (default

The size of the dots in the scatterplot.

titlestring (default

The plot title.

spot_bordersboolean (default

If True, draw a border line around each spot.

border_colorstring (default

The color of the border line around each spot. Only used if spot_borders is True.

border_sizefloat (default

The thickness of the border line around each spot. Only used if spot_borders is True.

Returns
None

Helper functions

spatialcorr.compute_local_correlation(adata, gene_1, gene_2, kernel_matrix=None, row_key='row', col_key='col', condition=None, bandwidth=5, contrib_thresh=10)

Calculate the correlation at each spot using Guassian kernel estimation for a pair of genes.

Parameters
adataAnnData

Spatial gene expression dataset with spatial coordinates stored in adata.obs.

gene_1string

The name or ID of the first gene.

gene_2string

The name or ID of the second gene.

kernel_matrixndarray

An NxN matrix, where N is the number of spots, storing the value of the Guassian kernel for each pair of spots.

row_keystring, optional (default

The name of the column in adata.obs storing the row coordinates of each spot.

col_keystring, optional (default

The name of the column in adata.obs storing the column coordinates of each spot.

conditionstring (default

The name of the column in adata.obs storing the histological region of each spot that should be conditioned on by the Gaussian kernel.

bandwidthint, optional (int

The kernel bandwidth used by the test.

contrib_threshinteger, optional (default

Threshold for the total weight of all samples contributing to the correlation estimate at each spot. Spots with total weight less than this value will be filtered prior to running the test (i.e., the effective-neighbors filter).

Returns
corrs: ndarray

An F-length array of correlation values storing the F spots kept after applying the effective-neighbors kernel.

keep_indsndarray

An F-lenght array of the indices of the original adata object that were kept after applying the effective-neighbors kernel. The values in corrs correspond to these spots.

spatialcorr.most_significant_pairs(additional)

Extract the most statistically significantly varying gene pairs from the results SpatialCorr run on a gene set.

Parameters
additional: dictionary

A dictionary storing the “additional” results from a SpatialCorr run on a gene set. Note, this dictionary must store the gene-pair test results (i.e., the results of the test run on each individual pair of genes within the gene set), which can be obtained by running spatialcorr.run_test, with the compute_gene_pair_pvals argument set to True.

Returns
df_top_pairs: DataFrame

A pandas DataFrame storing the gene-pairs ranked by their p-value under the SpatialCorr test.

spatialcorr.compute_kernel_matrix(df, bandwidth, region_key='cluster', condition_on_region=False, y_col='row', x_col='col', dist_matrix=None)

Compute the Gaussian kernel matrix between spots.

Parameters
df: DataFrame

A pandas DataFrame storing the coordinates of each spot.

bandwidth: float

The Gaussian kernel bandwidth parameter. Higher values increase the size of the kernel.

region_key: string, optional (default: ‘cluster’)

The column in df storing the region annotations for ensuring that the kernel conditions on regions/clusters. Only used if condition_on_region is True.

condition_on_region: boolean, optional (default: False)

If True, compute the kernel conditioned on regions stored in region_key.

y_col: string, optional (default: ‘row’)

The column in df storing the y-coordinates for each spot.

x_col: string, optional (default: ‘col’)

The column in `df’ storing the x-coordinates for each spot.

dist_matrix: ndarray, optional (default: None)

An NxN matrix storing the pairwise distances between spots to be used as input to the kernel. If None, Euclidean distances will be computed automatically.

Returns
kernel_matrix: ndarray

NxN array storing the pairwise weights between spots as computed by the Gaussian kernel.

spatialcorr.covariance_kernel_estimation(kernel_matrix, X)

Compute the kernel estimate of the covariance matrix at each spatial location.

Parameters
kernel_matrix: ndarray

NxN matrix representing the spatial kernel (i.e., pairwise weights between spatial locations)

X: ndarray

GxN expression matrix where G is number of genes and N is number of spots

Returns
all_covs: ndarray

NxGxG array storing the GxG covariance matrices at the N spots.

Datasets

spatialcorr.load_dataset(dataset_id)

Load a prepackaged spatial gene expression dataset.

Parameters
dataset_idstring, Options: {‘GSM4284326_P10_ST_rep2’}

The ID of the dataset to load.

Returns
adataAnnData

The spatial gene expression dataset. The rows and column coordinates are stored in adata.obs[‘row’] and adata.obs[‘col’] respectively. The clusters are stored in adata.obs[‘cluster’]. The gene expression matrix adata.X is in units of Dino normalized expression values.