Purpose: Cells are the building blocks of human physiology; consequently, understanding how cells communicate, co-locate, and interrelate is essential to furthering our understanding of how the body functions in both health and disease. Hematoxylin and eosin (H&E) is the standard stain used in histological analysis of tissues in both clinical and research settings. Although H&E is ubiquitous and reveals tissue microanatomy, the classification and mapping of cell subtypes often require specialized stains. The recent CoNIC Challenge focused on artificial-intelligence classification of six cell types on colon H&E but was unable to classify epithelial subtypes (progenitor, enteroendocrine, goblet), lymphocyte subtypes (B, helper T, cytotoxic T), and connective subtypes (fibroblasts). We propose to use inter-modality learning to label previously un-labelable cell types on H&E.
Approach: We took advantage of the cell classification information inherent in multiplexed immunofluorescence (MxIF) histology to create cell-level annotations for 14 subclasses. We then performed style transfer on the MxIF to synthesize realistic virtual H&E, assessed the efficacy of a supervised learning scheme using the virtual H&E and 14 subclass labels, and evaluated our model on both virtual and real H&E.
Results: On virtual H&E, we classified helper T cells and epithelial progenitors with positive predictive values of 0.34±0.15 (prevalence 0.03±0.01) and 0.47±0.1 (prevalence 0.07±0.02), respectively, when using ground-truth centroid information. On real H&E, we computed bounded metrics instead of direct metrics because our fine-grained virtual H&E predicted classes had to be matched to the closest available parent classes in the coarser labels of the real H&E dataset. For the real H&E, the bounded metrics for helper T cells and epithelial progenitors reached upper-bound positive predictive values of 0.43±0.03 (parent-class prevalence 0.21) and 0.94±0.02 (parent-class prevalence 0.49) when using ground-truth centroid information.
Conclusions: This is the first work to provide cell-type classification for helper T and epithelial progenitor nuclei on H&E.
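The bounded evaluation described above (matching fine-grained predictions against coarser parent-class labels) can be sketched as follows. The class hierarchy, helper function, and toy data below are our own illustrations, not the paper's code or taxonomy:

```python
# Hypothetical sketch: upper-bound positive predictive value (PPV) when
# fine-grained predictions can only be checked against coarser parent labels.
# The class hierarchy below is illustrative, not the paper's exact taxonomy.
PARENT = {
    "helper_T": "lymphocyte",
    "cytotoxic_T": "lymphocyte",
    "B_cell": "lymphocyte",
    "progenitor": "epithelial",
    "goblet": "epithelial",
}

def upper_bound_ppv(predictions, coarse_truth, fine_class):
    """PPV counting a prediction as correct whenever its parent class matches.

    predictions  -- list of predicted fine-grained class names, one per cell
    coarse_truth -- list of ground-truth parent class names, same order
    fine_class   -- the fine-grained class to evaluate
    """
    positives = [(p, t) for p, t in zip(predictions, coarse_truth)
                 if p == fine_class]
    if not positives:
        return 0.0
    # A fine prediction cannot be verified wrong if its parent agrees, so
    # this is an upper bound on the true fine-grained PPV.
    hits = sum(1 for p, t in positives if PARENT[p] == t)
    return hits / len(positives)

preds = ["helper_T", "helper_T", "progenitor", "goblet"]
truth = ["lymphocyte", "epithelial", "epithelial", "epithelial"]
print(upper_bound_ppv(preds, truth, "helper_T"))  # 0.5
```

Because the coarse label cannot distinguish siblings within a parent class, the true fine-grained PPV can only be lower, which is why the paper reports these values as upper bounds.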
Crohn’s disease (CD) is a chronic, relapsing inflammatory condition that affects segments of the gastrointestinal tract. CD activity is determined by histological findings, particularly the density of neutrophils observed on hematoxylin and eosin (H&E)-stained imaging. However, understanding the broader morphometry and local cell arrangement beyond cell counting and tissue morphology remains challenging. To address this, we characterize six distinct cell types from H&E images and develop a novel approach to describe the local spatial signature of each cell. Specifically, we create a 10-cell neighborhood matrix, representing the neighboring cell arrangement for each individual cell. Utilizing t-SNE for non-linear spatial projection in scatter-plot and kernel density estimation contour-plot formats, our study examines differences in the cellular environment by computing odds ratios of spatial patterns between active CD and control groups. This analysis is based on data collected at two research institutes. The findings reveal heterogeneous nearest-neighbor patterns, signifying distinct tendencies of cell clustering, with a particular focus on the rectum region. These variations underscore the impact of data heterogeneity on cell spatial arrangements in CD patients. Moreover, the spatial distribution disparities between the two research sites highlight the significance of collaborative efforts among healthcare organizations. All research analysis pipeline tools are available at https://github.com/MASILab/cellNN.
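The 10-cell neighborhood matrix above can be sketched as a nearest-neighbor type count per cell. This is a minimal illustration under our own assumptions (toy coordinates, three cell types, k=2 instead of 10); the released pipeline at the URL above is the authoritative implementation:

```python
import math
from collections import Counter

# Hypothetical sketch of a k-cell neighborhood matrix: for each cell, count
# the types of its k nearest neighbors. Coordinates and type names are toy.
CELL_TYPES = ["epithelial", "lymphocyte", "neutrophil"]

def neighborhood_matrix(cells, k=10):
    """cells: list of (x, y, cell_type). Returns one count vector per cell."""
    rows = []
    for i, (xi, yi, _) in enumerate(cells):
        # Sort the other cells by Euclidean distance and keep the k closest.
        others = sorted(
            (math.hypot(xi - xj, yi - yj), tj)
            for j, (xj, yj, tj) in enumerate(cells) if j != i
        )[:k]
        counts = Counter(t for _, t in others)
        rows.append([counts.get(t, 0) for t in CELL_TYPES])
    return rows

cells = [(0, 0, "epithelial"), (1, 0, "lymphocyte"),
         (0, 1, "lymphocyte"), (5, 5, "neutrophil")]
print(neighborhood_matrix(cells, k=2))
```

Each row of the resulting matrix is one cell's local spatial signature, which is what t-SNE then projects for visualization.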
Podocytes, specialized epithelial cells that envelop the glomerular capillaries, play a pivotal role in maintaining renal health. The current description and quantification of features on pathology slides are limited, prompting the need for innovative solutions to comprehensively assess diverse phenotypic attributes within Whole Slide Images (WSIs). In particular, understanding the morphological characteristics of podocytes, terminally differentiated glomerular epithelial cells, is crucial for studying glomerular injury. This paper introduces the Spatial Pathomics Toolkit (SPT) and applies it to podocyte pathomics. The SPT consists of three main components: (1) instance object segmentation, enabling precise identification of podocyte nuclei; (2) pathomics feature generation, extracting a comprehensive array of quantitative features from the identified nuclei; and (3) robust statistical analyses, facilitating a comprehensive exploration of spatial relationships between morphological and spatial transcriptomics features. The SPT successfully extracted and analyzed morphological and textural features from podocyte nuclei, revealing a multitude of podocyte morphomic features through statistical analysis. Additionally, we demonstrated the SPT’s ability to unravel spatial information inherent to podocyte distribution, shedding light on spatial patterns associated with glomerular injury. By disseminating the SPT, our goal is to provide the research community with a powerful and user-friendly resource that advances cellular spatial pathomics in renal pathology. The toolkit’s implementation and its complete source code are made openly accessible at the GitHub repository: https://github.com/hrlblab/spatial_pathomics.
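The pathomics feature-generation step above can be illustrated with a few simple morphological descriptors computed from a binary nucleus mask. The feature names and toy mask below are our own; the SPT extracts a far richer feature set:

```python
# Illustrative sketch of pathomics-style feature generation: basic
# morphological features from a binary nucleus mask (a list of 0/1 rows).
def nucleus_features(mask):
    pixels = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    area = len(pixels)
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return {
        "area": area,
        "centroid": (sum(rows) / area, sum(cols) / area),
        # Fraction of the bounding box covered by the nucleus ("extent").
        "extent": area / (height * width),
    }

mask = [[0, 1, 1, 0],
        [1, 1, 1, 1],
        [0, 1, 1, 0]]
print(nucleus_features(mask))
```

Features like these, computed per segmented podocyte nucleus, are what the toolkit's statistical stage correlates with spatial transcriptomics.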
Understanding the way cells communicate, co-locate, and interrelate is essential to understanding human physiology. Hematoxylin and eosin (H&E) staining is ubiquitously available both for clinical studies and research. The Colon Nucleus Identification and Classification (CoNIC) Challenge recently advanced robust artificial-intelligence labeling of six cell types on H&E stains of the colon. However, this is a small fraction of the potential cell classification types. Specifically, the CoNIC Challenge is unable to classify epithelial subtypes (progenitor, endocrine, goblet), lymphocyte subtypes (B, helper T, cytotoxic T), or connective subtypes (fibroblasts, stromal). In this paper, we propose to use inter-modality learning to label previously un-labelable cell types on virtual H&E. We leveraged multiplexed immunofluorescence (MxIF) histology imaging to identify 14 subclasses of cell types. We performed style transfer to synthesize virtual H&E from MxIF and transferred the higher-density labels from MxIF to these virtual H&E images. We then evaluated the efficacy of learning in this approach. We identified helper T and progenitor nuclei with positive predictive values of 0.34 ± 0.15 (prevalence 0.03 ± 0.01) and 0.47 ± 0.1 (prevalence 0.07 ± 0.02), respectively, on virtual H&E. This approach represents a promising step toward automating annotation in digital pathology.
Purpose: Diffusion-weighted magnetic resonance imaging (DW-MRI) is a critical imaging method for capturing and modeling tissue microarchitecture at a millimeter scale. A common practice is to model the measured DW-MRI signal via a fiber orientation distribution function (fODF), the essential first step for downstream tractography and connectivity analyses. With recent advances in data sharing, large-scale multisite DW-MRI datasets are being made available for multisite studies. However, measurement variabilities (e.g., inter- and intrasite variability, hardware performance, and sequence design) are inevitable during DW-MRI acquisition. Most existing model-based methods [e.g., constrained spherical deconvolution (CSD)] and learning-based methods (e.g., deep learning) do not explicitly consider such variabilities in fODF modeling, which consequently leads to inferior performance on multisite and/or longitudinal diffusion studies.
Approach: We propose a data-driven deep CSD method that explicitly constrains scan–rescan variabilities for a more reproducible and robust estimation of brain microstructure from repeated DW-MRI scans. Specifically, the proposed method introduces a three-dimensional volumetric scanner-invariant regularization scheme during fODF estimation. We study the Human Connectome Project (HCP) young adults test–retest group as well as the MASiVar dataset (with inter- and intrasite scan/rescan data). The Baltimore Longitudinal Study of Aging dataset is employed for external validation.
Results: The proposed data-driven framework outperforms the existing benchmarks in repeated fODF estimation. By introducing a contrastive loss on scan/rescan data, the proposed method achieved higher consistency while maintaining higher angular correlation coefficients with CSD modeling. We also assessed the proposed method on downstream connectivity analysis, where it showed increased performance in distinguishing subjects with different biomarkers.
Conclusion: We propose a deep CSD method that explicitly reduces scan–rescan variabilities, so as to model a more reproducible and robust brain microstructure from repeated DW-MRI scans. The plug-and-play design of the proposed approach is potentially applicable to a wider range of data harmonization problems in neuroimaging.
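The idea of a scan/rescan-constrained objective can be sketched as a data-fit term plus a consistency penalty tying repeated scans of the same subject together. This is a simplified stand-in for the paper's contrastive loss; the function name, weight `lam`, and toy coefficient vectors are all our assumptions:

```python
import numpy as np

# Hedged sketch of a scan/rescan-invariant training objective: a data-fit
# term plus a penalty on disagreement between fODF estimates of the same
# anatomy from two scans. Symbols and the weight `lam` are illustrative.
def regularized_loss(fodf_scan, fodf_rescan, target, lam=0.1):
    fit = np.mean((fodf_scan - target) ** 2)
    # Penalize disagreement between repeated scans of the same anatomy.
    consistency = np.mean((fodf_scan - fodf_rescan) ** 2)
    return fit + lam * consistency

scan = np.array([0.2, 0.5, 0.3])
rescan = np.array([0.25, 0.45, 0.3])
target = np.array([0.2, 0.5, 0.3])
print(regularized_loss(scan, rescan, target, lam=0.1))
```

Minimizing the second term is what drives the network toward scanner-invariant fODF estimates, while the first term keeps the estimates anchored to the CSD-derived target.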
The Tangram algorithm is a benchmark method for aligning single-cell data to various forms of spatial data collected from the same region. With this alignment, annotations of the single-cell data can be projected onto the spatial data. However, the cell compositions of the single-cell and spatial data might differ because of heterogeneous cell distributions. Whether the Tangram algorithm can be adapted when the two datasets have different cell-type ratios has not been discussed in previous works. In our practical application, which maps the cell-type classification results of single-cell data to multiplex immunofluorescence spatial data, the cell-type ratios were different. In this work, both simulation and empirical validation were conducted to quantitatively explore the impact of mismatched cell-type ratios on Tangram mapping in different situations. Results show that the cell-type ratio difference has a negative influence on annotation mapping accuracy.
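The cell-type ratio mismatch studied above can be quantified in several ways; one simple choice (ours, not Tangram's) is the total variation distance between the two type-frequency vectors:

```python
from collections import Counter

# Illustrative measure of cell-type ratio mismatch between a single-cell
# dataset and a spatial dataset: total variation distance between the two
# type-frequency vectors. The choice of measure and the toy data are ours.
def type_ratio_mismatch(sc_types, spatial_types):
    kinds = sorted(set(sc_types) | set(spatial_types))
    def freqs(labels):
        c = Counter(labels)
        return [c.get(k, 0) / len(labels) for k in kinds]
    p, q = freqs(sc_types), freqs(spatial_types)
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

sc = ["T"] * 50 + ["B"] * 50          # balanced single-cell reference
spatial = ["T"] * 80 + ["B"] * 20     # skewed spatial composition
print(type_ratio_mismatch(sc, spatial))
```

A value of 0 means identical compositions; larger values correspond to the mismatched-ratio regimes in which the paper reports degraded mapping accuracy.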
Multiplex immunofluorescence (MxIF) is an emerging imaging technology whose downstream molecular analytics rely heavily upon the effectiveness of cell segmentation. In practice, multiple membrane markers (e.g., NaKATPase, PanCK, and β-catenin) are employed to stain membranes for different cell types, so as to achieve a more comprehensive cell segmentation, since no single marker fits all cell types. However, prevalent watershed-based image processing might yield inferior capability for modeling complicated relationships between markers. For example, some markers can be misleading due to questionable stain quality. In this paper, we propose a deep learning based membrane segmentation method to aggregate complementary information that is uniquely provided by large-scale MxIF markers. We aim to segment tubular membrane structure in MxIF data using global (membrane-marker z-stack projection image) and local (separate individual markers) information to maximize topology preservation with deep learning. Specifically, we investigate the feasibility of four SOTA 2D deep networks and four volumetric-based loss functions. We conducted a comprehensive ablation study to assess the sensitivity of the proposed method with various combinations of input channels. Beyond using the adjusted Rand index (ARI) as an evaluation metric, we propose a novel volumetric metric specific to skeletal structure, inspired by clDice and denoted as clDiceSKEL. In total, 80 membrane MxIF images were manually traced for 5-fold cross-validation. Our model outperforms the baseline with a 20.2% and 41.3% increase in clDiceSKEL and ARI performance, respectively, which is significant (p<0.05, Wilcoxon signed-rank test). Our work explores a promising direction for advancing MxIF imaging cell segmentation with deep learning membrane segmentation. Tools are available at https://github.com/MASILab/MxIF_Membrane_Segmentation.
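The clDice family of metrics that inspired clDiceSKEL can be sketched as follows, with skeletons and masks represented as sets of pixel coordinates. This is the published clDice formulation, not the paper's clDiceSKEL variant, and the toy data are ours:

```python
# Hedged sketch of the clDice skeleton metric. Skeletons and masks are given
# here as sets of (row, col) pixels; in practice they would come from a
# skeletonization of the predicted and ground-truth membrane masks.
def cl_dice(pred_mask, true_mask, pred_skel, true_skel):
    # Topology precision: fraction of the predicted skeleton lying inside
    # the ground-truth mask; topology sensitivity is the symmetric quantity.
    tprec = len(pred_skel & true_mask) / len(pred_skel)
    tsens = len(true_skel & pred_mask) / len(true_skel)
    # Harmonic mean of the two, in analogy with the Dice score.
    return 2 * tprec * tsens / (tprec + tsens)

pred_mask = {(0, c) for c in range(10)} | {(1, c) for c in range(10)}
true_mask = {(0, c) for c in range(10)}
pred_skel = {(0, c) for c in range(10)} | {(1, 5)}
true_skel = {(0, c) for c in range(10)}
print(round(cl_dice(pred_mask, true_mask, pred_skel, true_skel), 3))
```

Because the metric is computed on skeletons rather than full masks, it rewards preserving the tubular topology of membranes even when boundary thickness is off, which is exactly what plain Dice or ARI underweights.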
Crohn’s disease (CD) is a debilitating inflammatory bowel disease with no known cure. Computational analysis of hematoxylin and eosin (H&E) stained colon biopsy whole slide images (WSIs) from CD patients provides the opportunity to discover unknown and complex relationships between tissue cellular features and disease severity. While prior works have used cell nuclei-derived features to predict slide-level traits, this has not been done on CD H&E WSIs to classify normal tissue from CD patients vs. active CD, nor to assess slide-label-predictive performance using both separate and combined information from pseudo-segmentation labels of nuclei from neutrophils, eosinophils, epithelial cells, lymphocytes, plasma cells, and connective cells. We used 413 WSIs of CD patient biopsies and calculated normalized histograms of nucleus density for the six cell classes for each WSI. We used a support vector machine to classify the truncated singular value decomposition representations of the normalized histograms as normal or active CD, with four-fold cross-validation in rounds where nucleus types were first compared individually, the best was selected, and further types were added each round. We found that neutrophils were the most predictive individual nucleus type, with an AUC of 0.92 ± 0.0003 on the withheld test set. Adding information improved cross-validation performance for the first two rounds, and on the withheld test set for the first three rounds, though performance metrics did not increase substantially beyond using neutrophils alone.
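The histogram-plus-truncated-SVD featurization described above can be sketched in a few lines. The dimensions (8 slides, 6 cell classes × 16 density bins) and random data are toy assumptions; the paper follows this step with an SVM classifier, omitted here:

```python
import numpy as np

# Minimal sketch of the slide-level feature pipeline: per-slide cell-density
# histograms, L1-normalized, then compressed with a truncated SVD.
rng = np.random.default_rng(0)
hists = rng.integers(1, 100, size=(8, 6 * 16)).astype(float)  # 8 slides

# Normalize each slide's histogram so counts become densities.
hists /= hists.sum(axis=1, keepdims=True)

# Truncated SVD: keep the top-k right singular vectors as a basis and
# project each slide's histogram onto it.
k = 4
U, S, Vt = np.linalg.svd(hists, full_matrices=False)
features = hists @ Vt[:k].T   # k-dimensional representation per slide
print(features.shape)
```

The truncation reduces the 96-dimensional histograms to a handful of components, which keeps the downstream SVM from overfitting the small number of slides relative to the feature count.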
KEYWORDS: Diffusion, Voxels, Deep learning, Education and training, Data modeling, Spherical harmonics, Tolerancing, White matter, Spherical lenses, Reconstruction algorithms
Diffusion-weighted magnetic resonance imaging (DW-MRI) captures tissue microarchitecture at a millimeter scale. With recent advances in data sharing, large-scale multi-site DW-MRI datasets are being made available for multi-site studies. However, DW-MRI suffers from measurement variability (e.g., inter- and intra-site variability, hardware performance, and sequence design), which consequently yields inferior performance on multi-site and/or longitudinal diffusion studies. In this study, we propose a novel, deep learning-based method to harmonize DW-MRI signals for a more reproducible and robust estimation of microstructure. Our method introduces a data-driven scanner-invariant regularization scheme for a more robust fiber orientation distribution function (FODF) estimation. We study the Human Connectome Project (HCP) young adults test-retest group as well as the MASiVar dataset (with inter- and intra-site scan/rescan data). The 8th-order spherical harmonic coefficients are employed as the data representation. The results show that the proposed harmonization approach maintains higher angular correlation coefficients (ACC) with the ground truth signals (0.954 versus 0.942), while achieving higher consistency of FODF signals for intra-scanner data (0.891 versus 0.826), compared with the baseline supervised deep learning scheme. Furthermore, the proposed data-driven framework is flexible and potentially applicable to a wider range of data harmonization problems in neuroimaging.
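The ACC reported above can be sketched as a normalized inner product between two spherical-harmonic coefficient vectors, excluding the order-0 (DC) term. Conventions for the ACC vary across implementations; the version and toy coefficients below are illustrative:

```python
import numpy as np

# Hedged sketch of an angular correlation coefficient (ACC) between two
# spherical-harmonic coefficient vectors: the cosine similarity over all
# coefficients above order 0 (one common formulation; conventions vary).
def acc(u, v):
    u, v = u[1:], v[1:]   # drop the l=0 (DC) coefficient
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

u = np.array([1.0, 0.4, 0.2, 0.1])   # toy SH coefficients, scan 1
v = np.array([1.0, 0.4, 0.2, 0.1])   # toy SH coefficients, scan 2
print(acc(u, v))
```

An ACC of 1 indicates identical angular structure; the paper's 0.954 vs. 0.942 comparison is this quantity averaged over voxels.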
Eosinophilic esophagitis (EoE) is an immune-mediated, clinicopathologic disease of the esophagus. EoE is histologically characterized by the accumulation of eosinophils in the esophageal epithelium. The current practice of manually identifying the small-scale histologic features of EoE relative to the size of esophageal biopsies can be burdensome and prone to interpreter error. Existing automatic, computer-assisted EoE identification approaches are typically designed in a train-from-scratch setting, which is prone to overfitting. In this study, we propose to use transfer deep learning via both the ImageNet pre-trained ResNet50 and the more recent Big Transfer (BiT) model to achieve automated EoE feature identification on whole slide images. As opposed to existing deep-learning-based approaches that typically focus on a single pathological phenotype, our study investigates five EoE-relevant histologic features simultaneously: basal zone hyperplasia, dilated intercellular spaces, eosinophils, lamina propria fibrosis, and normal lamina propria. The model achieved a promising testing balanced accuracy of 61.9%, better than that of its trained-from-scratch counterparts.
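The balanced accuracy reported above is the mean of per-class recalls, which corrects for class imbalance across the five histologic features. A minimal computation, with toy labels of our own:

```python
from collections import defaultdict

# Illustrative computation of balanced accuracy: the unweighted mean of
# per-class recalls, so rare classes count as much as common ones.
def balanced_accuracy(y_true, y_pred):
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += (t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)

y_true = ["eos", "eos", "eos", "bzh", "normal"]   # toy class labels
y_pred = ["eos", "eos", "bzh", "bzh", "normal"]
print(balanced_accuracy(y_true, y_pred))
```

With imbalanced histology classes, plain accuracy would be dominated by the majority class, which is why the paper reports this metric instead.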
Multi-modal learning (e.g., integrating pathological images with genomic features) tends to improve the accuracy of cancer diagnosis and prognosis compared to learning with a single modality. However, missing data is a common problem in clinical practice: not every patient has all modalities available. Most previous works directly discarded samples with missing modalities, which loses the information in those data and increases the likelihood of overfitting. In this work, we generalize multi-modal learning in cancer diagnosis with the capacity to deal with missing data, using histological images and genomic data. Our integrated model can utilize all available data from patients with both complete and partial modalities. Experiments on the public TCGA-GBM and TCGA-LGG datasets show that data with missing modalities can contribute to multi-modal learning, improving model performance in grade classification of glioma cancer.
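One simple way to use patients with partial modalities, sketched below, is masked fusion: average only the modality embeddings a patient actually has. This shows the masking idea only; the paper's integration scheme is more involved, and the embeddings here are toy vectors:

```python
import numpy as np

# Hedged sketch of missing-modality fusion: average only the modalities
# available for a given patient, so partial samples still contribute.
def fuse(embeddings, available):
    """embeddings: (n_modalities, dim) array; available: bool per modality."""
    mask = np.asarray(available, dtype=bool)
    return embeddings[mask].mean(axis=0)

histology = np.array([1.0, 2.0])   # toy histology embedding
genomics = np.array([3.0, 4.0])    # toy genomics embedding
emb = np.stack([histology, genomics])
print(fuse(emb, [True, True]))    # both modalities present
print(fuse(emb, [True, False]))   # genomic data missing
```

Because the fused vector has the same dimensionality either way, the downstream classifier can be trained on complete and partial samples together instead of discarding the latter.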
Myxofibrosarcoma is a rare, malignant myxoid soft tissue tumor. It can be challenging to distinguish from a benign myxoma in clinical practice, as imaging and histologic features overlap between the two entities. Some previous works used radiomics features of T1-weighted images to differentiate myxoid tumors, but few have used multimodality data. In this project, we collect a dataset containing 20 myxomas and 20 myxofibrosarcomas, each with a T1-weighted image, a T2-weighted image, and clinical features. Radiomics features from multi-modality images and clinical features are used to train multiple machine learning models. Our experimental results show that prediction accuracy using multi-modality features surpasses the results from a single modality. The radiomics features Gray Level Variance and Gray Level Non-uniformity Normalized, extracted from the gray level run length matrix (GLRLM) of the T2 images, and age are the top three features selected by the least absolute shrinkage and selection operator (LASSO) feature reduction model.
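The core mechanism behind LASSO feature selection is the soft-thresholding operator, which drives small coefficients exactly to zero. A full LASSO solver applies it coordinate-wise during optimization; the coefficients below are toy numbers:

```python
import numpy as np

# Illustrative core of LASSO feature selection: soft-thresholding shrinks
# coefficients toward zero and zeroes out the weak ones entirely.
def soft_threshold(x, lam):
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

coeffs = np.array([0.9, -0.05, 0.02, -1.3])   # toy regression coefficients
selected = soft_threshold(coeffs, lam=0.1)
print(selected)  # weak features are zeroed out
```

The features surviving with non-zero coefficients are the "selected" ones; this sparsity is what lets LASSO pick out the top predictors (here, the GLRLM texture features and age) from a larger radiomics panel.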
KEYWORDS: Data modeling, Performance modeling, Parallel computing, Image analysis, Instrument modeling, Process modeling, Pathology, Neural networks, Data processing, Skin cancer
Contrastive learning, a recent family of self-supervised learning methods, leverages pathological image analysis by learning from large-scale unannotated data. However, state-of-the-art contrastive learning methods (e.g., SimCLR, BYOL) typically require more expensive computational hardware (with large GPU memory) than traditional supervised learning approaches to achieve large training batch sizes. Fortunately, recent advances in the machine learning community provide multiple approaches to reduce GPU memory usage, such as (1) activation compressed training, (2) in-place activation, and (3) mixed precision training. Yet, such approaches are currently deployed independently, without systematic assessment for contrastive learning. In this work, we applied these memory-efficient approaches within a self-supervised framework. The contribution of this paper is three-fold: (1) we combined previously independent GPU memory-efficient methods with a self-supervised learning framework; (2) our experiments maximize memory efficiency under limited computational resources (a single GPU); and (3) the self-supervised learning framework with GPU memory-efficient methods allows a single GPU to triple the batch size that would typically require three GPUs. The experimental results show that, enabled by the memory-efficient methods, a contrastive learning model with a larger batch size achieves higher accuracy on a single GPU.
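A back-of-the-envelope sketch of why one of the listed techniques, mixed precision training, enlarges the feasible batch size: activation memory scales linearly with batch size and bytes per element, so halving the element width roughly doubles the batch that fits in a fixed budget. All numbers below are made up for illustration, not measured:

```python
# Hypothetical activation-memory calculator. Activation memory per sample
# and the budget are illustrative; real savings also depend on which
# tensors stay in fp32 and on the other two techniques in the paper.
def max_batch(budget_bytes, activations_per_sample, bytes_per_element):
    return budget_bytes // (activations_per_sample * bytes_per_element)

budget = 8 * 1024**3            # hypothetical 8 GB activation budget
acts = 50_000_000               # activations per sample (made up)
print(max_batch(budget, acts, 4))   # fp32 elements
print(max_batch(budget, acts, 2))   # fp16 elements: roughly double
```

Combining such halvings with activation compression and in-place activation is the intuition behind the paper's tripled single-GPU batch size.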
Deep brain stimulation (DBS) has been recently approved by the FDA to treat epilepsy patients with refractory seizures, i.e., patients for whom medications are not effective. It involves stimulating the anterior nucleus of the thalamus (ANT) with electric impulses using permanently placed electrodes. One main challenge with the procedure is to determine a trajectory to place the implant at the proper location while avoiding sensitive structures. In this work, we focus on one category of sensitive structures, i.e., brain vessels, and we propose a method to segment them in clinically acquired contrast-enhanced T1-weighted (T1CE) MRI images. We develop a deep-learning-based 3D U-Net model that we train/test on a set of images for which we have created the ground truth. We compare this approach to a traditional vesselness-based technique and we show that our method produces significantly better results (Dice score: 0.794), especially for small vessels.
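The Dice score used above to evaluate vessel segmentation can be computed directly on binary masks; here masks are represented as sets of voxel coordinates, and the data are toy values:

```python
# Illustrative Dice score on binary masks, represented as sets of voxel
# coordinates. This is the overlap metric reported for vessel segmentation.
def dice(pred, truth):
    if not pred and not truth:
        return 1.0   # both empty: perfect agreement by convention
    return 2 * len(pred & truth) / (len(pred) + len(truth))

pred = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}    # toy predicted vessel voxels
truth = {(0, 0, 0), (0, 0, 1), (1, 1, 1)}   # toy ground-truth voxels
print(dice(pred, truth))  # 0.6666666666666666
```

Dice is sensitive to small structures: missing a few voxels of a thin vessel costs proportionally more overlap than the same miss on a large structure, which is why the paper highlights performance on small vessels.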