Analyzing and visualizing cell-cell communication events#

In this tutorial, we demonstrate how HoloNet can be used to analyze and visualize cell-cell communication in spatial transcriptomics data.

HoloNet needs three inputs:

  1. Spatial transcriptomic data (with gene expression matrix and spatial information).
    • In this version of HoloNet, we support the spatial data based on AnnData loaded from Scanpy.

  2. Cell-type information.
    • Cell-type percentages from deconvolution methods prefer to be saved in adata.obsm['predicted_cell_type'].

    • Categorical cell-type labels prefer to be saved in adata.obs['cell_type'].

  3. Database with pairwise ligand and receptor genes.
    • A pandas dataframe, must contain two columns: ‘Ligand_gene_symbol’ and ‘Receptor_gene_symbol’.

Note

The tutorial mainly follows these steps:

  1. Load spatial transcriptomic data.

  2. Construct multi-view communication event (CE) network among single cells (each LR pair corresponds to one view).

  3. Visualize communication based on the multi-view network.

  4. Other analysis methods (such as clustering LR pairs).

import HoloNet as hn

import os
import pandas as pd
import numpy as np
import scanpy as sc
import matplotlib.pyplot as plt
import torch

import warnings
warnings.filterwarnings('ignore')
hn.set_figure_params(tex_fonts=False)
sc.settings.figdir = './figures/'

Data loading#

Loading example spatial transcriptomic data#

We prepare a example breast cancer dataset for users from the 10x Genomics website. We preprocessed the dataset, including filtering, normalization and cell-type annotation. Users can load the example dataset via HoloNet.preprocessing.load_brca_visium_10x().

adata = hn.pp.load_brca_visium_10x()

Visualize the cell-type percentages in each spot.

hn.pl.plot_cell_type_proportion(adata, plot_cell_type='stroma')
../_images/tutorial_CE_3_0.png

The figures outputed from HoloNet can be saved if adding fname parameter.

Note

The parameters of plotting functions in this tutorials are mainly inherited from two base plotting functions:

The cell-type label of each spot (the cell-type with maximum percentage in the spot)

sc.pl.spatial(adata, color=['cell_type'], size=1.4, alpha=0.7,
             palette=hn.brca_default_color_celltype)
../_images/tutorial_CE_2_0.png

Loading ligand-receptor database#

We provide a database with pairwise ligand and receptor genes for users. Load the database and filter the LR pairs, requiring both ligand and receptor genes to be expressed in a certain percentage of cells (or spots).

interaction_db, cofactor_db, complex_db = hn.pp.load_lr_df(human_or_mouse='human')
expressed_lr_df = hn.pp.get_expressed_lr_df(interaction_db, complex_db, adata)
expressed_lr_df.shape
(325, 12)

We also allow access to ligand-receptor pair databases based on the MITAB canonicalized nomenclature. We provide samples here so that users can import MITAB-compliant ligand-receptor databases and replace interaction_db

pd.set_option('display.max_columns', None)
interaction_db_MITAB = pd.read_csv('cellchatdb_human_interaction_example_with_MITAB.csv', index_col=0)
interaction_db_MITAB.head()
interaction_name pathway_name ligand receptor agonist antagonist co_A_receptor co_I_receptor evidence annotation interaction_name_2 LR_pair #ID Interactor A ID Interactor B Alt IDs Interactor A Alt IDs Interactor B Aliases Interactor A Aliases Interactor B Interaction Detection Method Publication 1st Author Publication Identifiers Taxid Interactor A Taxid Interactor B Interaction Types Source Database Interaction Identifiers Confidence Values
0 TGFA_EGFR EGF TGFA EGFR NaN NaN NaN NaN KEGG: hsa04012 Secreted Signaling TGFA - EGFR TGFA:EGFR entrez gene/locuslink:1956 entrez gene/locuslink:7039 biogrid:108276|entrez gene/locuslink:EGFR|unip... biogrid:112897|entrez gene/locuslink:TGFA|unip... entrez gene/locuslink:ERBB(gene name synonym)|... entrez gene/locuslink:TFGA(gene name synonym) psi-mi:"MI:0018"(two hybrid) Kotlyar M (2015) pubmed:25402006 taxid:9606 taxid:9606 psi-mi:"MI:0407"(direct interaction) psi-mi:"MI:0463"(biogrid) biogrid:1067440 -
1 OSM_OSMR_IL6ST OSM OSM OSMR_IL6ST NaN NaN NaN NaN KEGG: hsa04060 Secreted Signaling OSM - (OSMR+IL6ST) OSM:OSMR_IL6ST entrez gene/locuslink:5008 entrez gene/locuslink:3572 biogrid:111049|entrez gene/locuslink:OSM|unipr... biogrid:109786|entrez gene/locuslink:IL6ST|uni... - entrez gene/locuslink:CD130(gene name synonym)... psi-mi:"MI:0004"(affinity chromatography techn... Huttlin EL (2015) pubmed:26186194 taxid:9606 taxid:9606 psi-mi:"MI:0915"(physical association) psi-mi:"MI:0463"(biogrid) biogrid:1176790 score:0.999999936
2 TNFSF9_TNFRSF9 CD137 TNFSF9 TNFRSF9 NaN NaN NaN NaN KEGG: hsa04060 Secreted Signaling TNFSF9 - TNFRSF9 TNFSF9:TNFRSF9 entrez gene/locuslink:3604 entrez gene/locuslink:8744 biogrid:109817|entrez gene/locuslink:TNFRSF9|u... biogrid:114281|entrez gene/locuslink:TNFSF9|un... entrez gene/locuslink:4-1BB(gene name synonym)... entrez gene/locuslink:4-1BB-L(gene name synony... psi-mi:"MI:0004"(affinity chromatography techn... Huttlin EL (2015) pubmed:26186194 taxid:9606 taxid:9606 psi-mi:"MI:0915"(physical association) psi-mi:"MI:0463"(biogrid) biogrid:1177155 score:0.992349156
3 EFNB2_EPHB2 EPHB EFNB2 EPHB2 NaN NaN NaN NaN PMID: 15114347 Cell-Cell Contact EFNB2 - EPHB2 EFNB2:EPHB2 entrez gene/locuslink:1948 entrez gene/locuslink:2048 biogrid:108268|entrez gene/locuslink:EFNB2|ent... biogrid:108362|entrez gene/locuslink:EPHB2|uni... entrez gene/locuslink:EPLG5(gene name synonym)... entrez gene/locuslink:CAPB(gene name synonym)|... psi-mi:"MI:0004"(affinity chromatography techn... Huttlin EL (2015) pubmed:26186194 taxid:9606 taxid:9606 psi-mi:"MI:0915"(physical association) psi-mi:"MI:0463"(biogrid) biogrid:1177983 score:0.999999507
4 EFNB2_EPHB3 EPHB EFNB2 EPHB3 NaN NaN NaN NaN PMID: 15114347 Cell-Cell Contact EFNB2 - EPHB3 EFNB2:EPHB3 entrez gene/locuslink:1948 entrez gene/locuslink:2049 biogrid:108268|entrez gene/locuslink:EFNB2|ent... biogrid:108363|entrez gene/locuslink:EPHB3|uni... entrez gene/locuslink:EPLG5(gene name synonym)... entrez gene/locuslink:ETK2(gene name synonym)|... psi-mi:"MI:0004"(affinity chromatography techn... Huttlin EL (2015) pubmed:26186194 taxid:9606 taxid:9606 psi-mi:"MI:0915"(physical association) psi-mi:"MI:0463"(biogrid) biogrid:1177984 score:0.999990606
expressed_lr_df = hn.pp.get_expressed_lr_df(interaction_db_MITAB, complex_db, adata)

Constructing multi-view CE network#

Ligand molecules from a single source can only cover a certain region.

Before constructing multi-view communication network, we need to calculate the w_best to decide the region (‘how far is far’).

The parameters for culcalating w_best is shown in HoloNet.tools.default_w_visium().

w_best = hn.tl.default_w_visium(adata)
hn.pl.select_w(adata, w_best=w_best)
../_images/tutorial_CE_5_0.png

Based on w_best, we can build up the multi-view communication network.

We calculate the edge weights of the multi-view communication network. With the more attributions of ligands, HoloNet.tools.compute_ce_tensor() can set different w_best for secreted ligands and plasma-membrane-binding ligands.

Then we filter the edges with low specificities.

elements_expr_df_dict = hn.tl.elements_expr_df_calculate(expressed_lr_df, complex_db, cofactor_db, adata)
ce_tensor = hn.tl.compute_ce_tensor(expressed_lr_df, w_best, elements_expr_df_dict, adata)
filtered_ce_tensor = hn.tl.filter_ce_tensor(ce_tensor, adata, expressed_lr_df, elements_expr_df_dict, w_best)

Note

This step will consume a lot of memory. If you run out of memory, you can choose to only compute communication networks with fewer ligand-receptor pairs, and turn down the value of n_pairs parameter in HoloNet.tools.filter_ce_tensor() Also, you can set copy=False in hn.tl.filter_ce_tensor or down-sample the dataset to reduced memory consumption. When setting cut_porp as 2, we can run the whole tutorial on Macbook Pro (16G RAM, Apple M1 pro).

import random
cut_porp = 2
adata_subset = adata[random.sample(list(adata.obs_names), round(adata.shape[0] / cut_porp))]
elements_expr_df_dict = hn.tl.elements_expr_df_calculate(expressed_lr_df, complex_db, cofactor_db, adata_subset)
ce_tensor = hn.tl.compute_ce_tensor(expressed_lr_df, w_best, elements_expr_df_dict, adata_subset)
filtered_ce_tensor = hn.tl.filter_ce_tensor(ce_tensor, adata_subset, expressed_lr_df,
                                            elements_expr_df_dict, w_best, copy=False)

Visualizing CEs#

Based on the multi-view CE network, we provide two visualization methods:

  • CE hotspot plots for visualizing the centralities of spots. Provide two calculating methods:
    • Degree centrality: out-degree + in-degree, faithfully reflects the CE strength related to each spot.

    • Eigenvector centrality: reflects the core regions with active communication.

  • Cell-type-level CE network:
    • CE strengths among cell-types

CEs hotspot plots#

Degree centrality of each spot in the TGFB1:(TGFBR1+TGFBR2) CE network. Reflecting regions with active TGFB1:(TGFBR1+TGFBR2) communication.

hn.pl.ce_hotspot_plot(filtered_ce_tensor, adata,
                      lr_df=expressed_lr_df, plot_lr='TGFB1:(TGFBR1+TGFBR2)')
../_images/tutorial_CE_8_0.png

Hotspot plot based on eigenvector centrality. This plot better detects a clear center than the one based on degree centrality.

hn.pl.ce_hotspot_plot(filtered_ce_tensor, adata,
                      lr_df=expressed_lr_df, plot_lr='TGFB1:(TGFBR1+TGFBR2)',
                      centrality_measure='eigenvector')
../_images/tutorial_CE_9_1.png

Cell-type-level CE network#

Loading the cell-type percentage of each spot.

cell_type_mat, \
cell_type_names = hn.pr.get_continuous_cell_type_tensor(adata, continuous_cell_type_slot = 'predicted_cell_type',)

Plotting the cell-type-level CE network. The thickness of the edge represents the strength of TGFB1:(TGFBR1+TGFBR2) communication between the two cell types.

_ = hn.pl.ce_cell_type_network_plot(filtered_ce_tensor, cell_type_mat, cell_type_names,
                                    lr_df=expressed_lr_df, plot_lr='TGFB1:(TGFBR1+TGFBR2)', edge_thres=0.2,
                                    palette=hn.brca_default_color_celltype)
../_images/tutorial_CE_10_0.png

LR pair clustering#

Agglomerative Clustering the ligand-receptor pairs based on the centrality of each spot. The cluster label of each ligand-receptor pair saved in clustered_expressed_LR_df['cluster']. The number of clusters can be selected using n_clusters parameter in HoloNet.tools.cluster_lr_based_on_ce().

cell_cci_centrality = hn.tl.compute_ce_network_eigenvector_centrality(filtered_ce_tensor)
clustered_expressed_LR_df,_ = hn.tl.cluster_lr_based_on_ce(filtered_ce_tensor, adata, expressed_lr_df,
                                                         w_best=w_best, cell_cci_centrality=cell_cci_centrality)

Dendrogram for hierarchically clustering all ligand–receptor pairs.

hn.pl.lr_clustering_dendrogram(_, expressed_lr_df, ['TGFB1:(TGFBR1+TGFBR2)'],
                               dflt_col = '#333333',)
../_images/tutorial_CE_12_0.png

General CE hotspot of each ligand-receptor cluster (superimposing All CE hotspots of members in a cluster).

hn.pl.lr_cluster_ce_hotspot_plot(lr_df=clustered_expressed_LR_df,
                                 cell_cci_centrality=cell_cci_centrality,
                                 adata=adata)
../_images/tutorial_CE_13_0.png ../_images/tutorial_CE_13_1.png ../_images/tutorial_CE_13_2.png ../_images/tutorial_CE_13_3.png