Robust Single-cell Matching and Multi-modal Analysis Using Shared and Distinct Features Reveals Orchestrated Immune Responses

Zhu BK*, Chen S*, Bai Y, Chen H, Mukherjee N, Vazquez G, McIlwain DR, Tzankov A, Lee IT, Matter MS, Goltsev Y, Ma Z#, Nolan GP#, Jiang S#.


The ability to align individual cellular information from multiple experimental sources, techniques and systems is fundamental for a true systems-level understanding of biological processes. While single-cell transcriptomic studies have transformed our appreciation of the complexity and contributions of diverse cell types to disease, they are limited in their ability to assess phenotypic information at the protein level and beyond. Matching and integrating single-cell datasets that contain robust protein measurements across multiple modalities is therefore critical for a deeper understanding of cell states and signaling pathways, particularly within their native tissue context. Currently available tools are designed mainly for matching and integrating single-cell transcriptomic data, and generally rely on a large number of shared features across datasets for mutual nearest neighbor (mNN) matching. This approach is unsuitable for single-cell proteomic datasets because of the limited number of parameters assessed simultaneously and the lack of shared markers across experiments. Here, we introduce a novel cell matching algorithm, Matching with pARtIal Overlap (MARIO), that takes into account both shared and distinct features and incorporates vital filtering steps to avoid sub-optimal matching. MARIO accurately matches and integrates data from different single-cell proteomic and multi-modal methods, including spatial techniques, and has cross-species capabilities. MARIO robustly matched tissue macrophages identified from COVID-19 lung autopsies via CODEX imaging to macrophages recovered from COVID-19 bronchoalveolar lavage fluid via CITE-seq. This cross-platform integrative analysis enabled the identification of unique orchestrated immune responses of complement-expressing macrophages within the lung, and of their impact on the local tissue microenvironment. MARIO thus provides an analytical framework for the unified analysis of single-cell data, toward a comprehensive understanding of the underlying biological system.
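
For readers unfamiliar with the mutual nearest neighbor (mNN) matching that the abstract contrasts MARIO against, the following is a minimal sketch of generic mNN matching restricted to shared features. It is not the MARIO algorithm and is not taken from the authors' code; the function name `mutual_nearest_neighbors`, the use of squared Euclidean distance, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def mutual_nearest_neighbors(x, y):
    """Return index pairs (i, j) where x[i] and y[j] are each other's nearest neighbor.

    x: (n_x, p) array and y: (n_y, p) array, both restricted to the same
    p shared features (hypothetical inputs; distance metric is an assumption).
    """
    # Pairwise squared Euclidean distances, shape (n_x, n_y).
    d = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)
    nn_of_x = d.argmin(axis=1)  # index of the nearest y cell for each x cell
    nn_of_y = d.argmin(axis=0)  # index of the nearest x cell for each y cell
    # Keep a pair only when the nearest-neighbor relation holds in both directions.
    return [(i, int(j)) for i, j in enumerate(nn_of_x) if nn_of_y[j] == i]


# Toy example: three cells per dataset, measured on two shared features.
x = np.array([[0.0, 0.0], [5.0, 5.0], [9.0, 0.0]])
y = np.array([[0.1, -0.1], [5.2, 4.9], [20.0, 20.0]])
pairs = mutual_nearest_neighbors(x, y)
print(pairs)  # [(0, 0), (1, 1)]
```

In this toy case the third cell of each dataset is left unmatched because its nearest neighbor does not reciprocate. With only a handful of shared features, as is typical for antibody panels, such distances carry little information, which is the limitation the abstract describes.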