Ultra-Resolution Spatial Transcriptomics by Stereo-seq Reconstructs Tumor Microenvironment Architecture and Spatially Variable Gene Co-Expression Networks in Human Colorectal Carcinoma
Keywords:
Stereo-seq, spatial transcriptomics, tumor microenvironment, colorectal carcinoma, spatially variable genes, DNA nanoball arrays, STAGATE, BANKSY, GraphST, nnSVG, SPARK-X, SpatialDE, Cell2location, StereoSiTE, CellChat, SpaGRN, gene regulatory networks, Leiden clustering, Harmony integration, non-negative Tucker decomposition, epithelial–mesenchymal transition, single-cell resolution, spatial domain segmentation, ligand–receptor interaction, Gaussian process modeling, WGCNA, tertiary lymphoid structures, invasive tumor margin, transcription factor regulons, large-scale bioinformatics
Abstract
Background: Understanding the spatial organization of gene expression within the tumor microenvironment (TME) is essential for elucidating the molecular mechanisms that govern tumor progression, immune evasion, and therapeutic resistance. Conventional spatial transcriptomics platforms are constrained by multicellular resolution and limited field-of-view, which preclude the simultaneous capture of single-cell and subcellular transcriptomic dynamics across intact tissue sections. These technical limitations have impaired the accurate delineation of spatially restricted gene expression programs and the reconstruction of spatially resolved intercellular signaling networks, particularly in architecturally complex tumors such as colorectal carcinoma (CRC). A spatially resolved, high-resolution transcriptomic framework that integrates state-of-the-art computational algorithms is therefore urgently needed to comprehensively map the cellular and molecular landscape of the TME at biologically relevant scales.
Methods: We employed Stereo-seq (Spatial Enhanced Resolution Omics-sequencing), a next-generation spatial transcriptomics platform based on DNA nanoball (DNB)-patterned arrays with 500 nm center-to-center capture spot spacing, to profile fresh-frozen and formalin-fixed paraffin-embedded (FFPE) human colorectal carcinoma tissue sections obtained from eighteen patients spanning TNM stages I through IV. Raw sequencing outputs were processed using the SAW (Stereo-seq Analysis Workflow) pipeline for barcode demultiplexing, genomic alignment, and spatially indexed expression matrix construction. Single-cell boundary delineation was performed with StereoCell, a convolutional deep neural network model integrating DAPI nuclear staining co-registered with transcriptional signal arrays. Spatial domain segmentation was conducted by applying three independent graph-based algorithms, namely STAGATE (Spatial domain identification via an Adaptive Graph Attention auto-Encoder), BANKSY (Building Aggregates with a Neighborhood Kernel and Spatial Yardstick), and GraphST, followed by consensus clustering using the Leiden algorithm with silhouette score-guided resolution optimization. Spatially variable gene (SVG) identification was performed through an ensemble framework comprising nnSVG (nearest-neighbor Gaussian process modeling), SPARK-X (non-parametric covariance-based large-matrix testing), SpatialDE (Gaussian process regression with automatic relevance determination), and VISGP (variational sparse Gaussian process inference), with significance defined as a Benjamini–Hochberg-adjusted p-value below 0.01 together with a Moran's I spatial autocorrelation coefficient exceeding 0.30. Cell-type compositional deconvolution was accomplished using Cell2location integrated against a CRC-specific single-cell RNA-sequencing reference atlas.
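The dual SVG significance criterion stated above (Benjamini–Hochberg-adjusted p < 0.01 and Moran's I > 0.30) can be illustrated with a minimal standard-library sketch. This is not the study's pipeline: the function names (`benjamini_hochberg`, `knn_weights`, `morans_i`, `select_svgs`), the binary k-nearest-neighbour weighting, and the choice k = 4 are assumptions made for illustration only, and the raw p-values are taken as given from whichever SVG method produced them.

```python
from math import dist


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def knn_weights(coords, k=4):
    """Binary k-nearest-neighbour spatial weights (an assumed weighting scheme)."""
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nearest = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: dist(coords[i], coords[j]),
        )[:k]
        for j in nearest:
            weights[i][j] = 1.0
    return weights


def morans_i(values, weights):
    """Global Moran's I of one gene's expression over the spatial bins."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        return 0.0  # constant expression carries no spatial signal
    w_sum = sum(sum(row) for row in weights)
    num = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / w_sum) * (num / denom)


def select_svgs(expr, pvals, coords, alpha=0.01, min_i=0.30, k=4):
    """Keep genes passing both thresholds.

    expr:  dict gene -> list of expression values, one per bin
    pvals: dict gene -> raw p-value from an SVG test (taken as given)
    """
    genes = list(expr)
    adj = benjamini_hochberg([pvals[g] for g in genes])
    w = knn_weights(coords, k)
    return [
        g for g, q in zip(genes, adj)
        if q < alpha and morans_i(expr[g], w) > min_i
    ]
```

Requiring both conditions means a gene with a small adjusted p-value but weak or negative spatial autocorrelation is still excluded, which is the practical effect of pairing a significance test with an effect-size threshold.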
Spatial cell–cell communication was inferred using StereoSiTE, CellChat v2, and the graph attention network-based model CellNEST, enabling directional quantification of ligand–receptor interaction intensity across spatially defined cellular neighborhoods. Gene regulatory network reconstruction was performed using SpaGRN and CLARIFY, incorporating transcription factor binding motif databases from JASPAR and ENCODE. Multi-patient spatial expression tensors were decomposed via non-negative Tucker tensor factorization to extract consensus gene expression programs, with inter-sample batch effects corrected using the Harmony algorithm. Spatially informed weighted gene co-expression network analysis (WGCNA) was adapted for spatial data structures and applied to resolved SVG sets to identify co-regulated gene modules across tissue domains.
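For orientation, the core adjacency step of standard unsigned WGCNA raises the absolute Pearson correlation between two genes to a soft-thresholding power β. The sketch below shows only that standard step; the spatial adaptation used in this study is not described in the abstract and is not reproduced here. The function names and the default β = 6 are illustrative assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [v - mx for v in x]
    dy = [v - my for v in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0  # a constant vector has no defined correlation; treat as 0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned WGCNA-style adjacency: a_ij = |cor(x_i, x_j)| ** beta.

    expr: dict gene -> list of expression values across bins or samples.
    beta: soft-thresholding power (6 is an assumed illustrative value).
    """
    genes = list(expr)
    adj = {g: {} for g in genes}
    for a in genes:
        for b in genes:
            adj[a][b] = 1.0 if a == b else abs(pearson(expr[a], expr[b])) ** beta
    return adj


def connectivity(adj):
    """Per-gene connectivity: sum of adjacencies to all other genes."""
    return {
        g: sum(w for h, w in row.items() if h != g) for g, row in adj.items()
    }
```

Raising correlations to a power suppresses weak associations while preserving strong ones, so module detection is driven by tightly co-expressed gene pairs rather than by a hard correlation cutoff.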
Results: Stereo-seq profiling yielded a mean of 3,847 ± 412 detected genes per single-cell bin across all tissue sections, representing a 4.2-fold increase in transcriptomic depth compared to conventional array-based spatial platforms. Consensus spatial domain segmentation identified an average of seven transcriptionally and spatially distinct tissue domains per section, encompassing the tumor epithelial core, the invasive tumor margin, cancer-associated fibroblast stroma, vascular endothelial zones, two immunologically divergent T-cell and myeloid infiltration layers, and tertiary lymphoid structure niches. Ensemble SVG analysis identified 2,841 high-confidence spatially variable genes, which were organized into fourteen spatially coherent co-expression modules by spatial WGCNA. Of these, Module M7 (comprising 187 genes) was selectively activated at the invasive tumor margin and was significantly enriched for epithelial–mesenchymal transition regulators including CDH2, VIM, ZEB1, and SNAI2, as well as Wnt/β-catenin signaling components, implicating this domain as a transcriptionally primed invasive niche. Spatial cell–cell communication analysis revealed that TGF-β1 → TGFBR1/TGFBR2 and CXCL12 → CXCR4 signaling axes constituted the dominant spatially constrained ligand–receptor interactions between cancer-associated fibroblasts and tumor epithelial cells, with interaction intensity significantly elevated at the tumor–stroma interface relative to the tumor core (Wilcoxon rank-sum test, p < 0.001). Gene regulatory network reconstruction identified TWIST1, FOSL1, and NR2F2 as master transcriptional regulators of the spatially defined invasive margin gene program, with each exhibiting significantly higher regulon activity scores in margin-annotated cell bins compared to core tumor cells. 
Non-negative Tucker decomposition across eighteen patient tensors further resolved four recurrent cross-patient spatial gene expression programs, two of which were significantly associated with disease stage and patient survival outcome in independent validation cohorts.
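The interface-versus-core comparison reported above relies on a Wilcoxon rank-sum test. A minimal sketch of that test is given here, assuming a two-sided alternative and the normal approximation without a tie correction to the variance; the function name `rank_sum_test` is illustrative, and the implementation actually used in the study is not specified in the abstract.

```python
from math import erfc, sqrt


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test using the normal approximation.

    Returns (z, p). Ties receive average ranks; the variance is not
    tie-corrected, which is a simplifying assumption of this sketch.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[t] = average_rank
        i = j + 1

    n1, n2 = len(x), len(y)
    # Rank sum of the first group and its null mean and standard deviation.
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return z, p
```

In this setting `x` and `y` would hold per-bin or per-neighborhood interaction intensities for the two regions being compared; a positive z indicates that values in `x` tend to rank higher than those in `y`.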
Conclusions: This study presents a high-resolution, computationally comprehensive spatial transcriptomic atlas of the human colorectal carcinoma tumor microenvironment, generated at unprecedented subcellular resolution through Stereo-seq. The integrated bioinformatics framework — encompassing spatial domain detection, ensemble spatially variable gene identification, cell-type deconvolution, cell–cell communication inference, and spatially resolved gene regulatory network reconstruction — provides a rigorously validated and fully reproducible analytical pipeline for large-scale spatial omics studies. The spatially restricted transcriptional programs, ligand–receptor axes, and master regulatory factors identified herein represent biologically coherent and clinically actionable molecular features of colorectal cancer invasion and immune microenvironment organization, establishing a robust foundation for future spatially targeted therapeutic strategies and multi-tissue spatial omics atlas construction.
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.