Machine learning approaches to analyze histological images of tissues from radical prostatectomies

Article ID	Journal	Published Year	Pages	File Type
504011	Computerized Medical Imaging and Graphics	2015	12 Pages	PDF

Abstract

• Machine learning approaches were applied to separate stroma from epithelium in prostate tissue images.
• Epithelium was sub-stratified into normal/benign and cancer areas.
• Tissue content was predicted based on descriptors from individual pixels rather than from glands.
• Tissue prediction does not involve detection of glandular lumens, which is inaccurate, prone to errors, and has limitations.
• The proposed method has the potential to aid in clinical prostate studies.

Computerized evaluation of histological preparations of prostate tissues involves identification of tissue components such as stroma (ST), benign/normal epithelium (BN) and prostate cancer (PCa). Image classification approaches have been developed to identify and classify glandular regions in digital images of prostate tissues; however, their success has been limited by difficulties in cellular segmentation and tissue heterogeneity. We hypothesized that utilizing image pixels to generate intensity histograms of hematoxylin (H) and eosin (E) stains deconvoluted from H&E images numerically captures the architectural difference between glands and stroma. In addition, we postulated that joint histograms of local binary patterns and local variance (LBPxVAR) can be used as sensitive textural features to differentiate benign/normal tissue from cancer. Here we utilized a machine learning approach comprising a support vector machine (SVM) followed by a random forest (RF) classifier to digitally stratify prostate tissue into ST, BN and PCa areas. Two pathologists manually annotated 210 images of low- and high-grade tumors from slides that were selected from 20 radical prostatectomies and digitized at high resolution. The 210 images were split into training (n = 19) and test (n = 191) sets. Local intensity histograms of H and E were used to train an SVM classifier to separate ST from epithelium (BN + PCa). The performance of SVM prediction was evaluated by measuring the accuracy of delineating epithelial areas. Based on values averaged over the test set, the Jaccard (J = 59.5 ± 14.6) and Rand (Ri = 62.0 ± 7.5) indices indicated significantly better prediction than a reference method (Chen et al., Clinical Proteomics 2013, 10:18).
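As an illustration of the two agreement measures named above, the following minimal sketch computes a Jaccard index and a Rand index between a predicted and a ground-truth pixel labeling. The abstract does not give the exact definitions used in the study, so the textbook forms are assumed here; the function names and the flat-list mask representation are hypothetical, and the functions return fractions in [0, 1] rather than the percent-style values reported above.

```python
from collections import Counter
from math import comb


def jaccard_index(pred, truth):
    """Jaccard index |A and B| / |A or B| for two flat binary masks (1 = epithelium)."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Two empty masks agree perfectly by convention (an assumption of this sketch).
    return inter / union if union else 1.0


def rand_index(pred, truth):
    """Rand index: fraction of pixel pairs on which the two labelings agree."""
    pairs = comb(len(pred), 2)
    if pairs == 0:
        return 1.0
    # Pairs grouped together in both labelings, in pred only, and in truth only.
    same_both = sum(comb(c, 2) for c in Counter(zip(pred, truth)).values())
    same_pred = sum(comb(c, 2) for c in Counter(pred).values())
    same_truth = sum(comb(c, 2) for c in Counter(truth).values())
    # Agreements = pairs together in both + pairs separated in both.
    agree = pairs + 2 * same_both - same_pred - same_truth
    return agree / pairs


# Toy example: six pixels, 1 = epithelium, 0 = stroma.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
print(jaccard_index(pred, truth), rand_index(pred, truth))
```

The Rand index is computed from label counts rather than by enumerating pixel pairs, which keeps the cost linear in the number of pixels.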
To distinguish BN from PCa, we trained an RF classifier with LBPxVAR and local intensity histograms and obtained separate performance values for BN and PCa in the test set: J_BN = 35.2 ± 24.9, O_BN = 49.6 ± 32, J_PCa = 49.5 ± 18.5, O_PCa = 72.7 ± 14.8 and Ri = 60.6 ± 7.6. Our pixel-based classification does not rely on the detection of lumens, which is prone to errors and has limitations in high-grade cancers. It has the potential to aid in clinical studies in which the quantification of tumor content is necessary to prognosticate the course of the disease. The image data set with ground truth annotation is available for public use to stimulate further research in this area.
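The LBPxVAR texture descriptor mentioned above can be sketched as a joint histogram that counts, for each pixel, its local binary pattern code together with a binned local variance. This is only a minimal illustration under stated assumptions: the abstract does not specify the LBP variant, neighbourhood radius, or variance binning, so the basic 8-neighbour code at radius 1, the function name `lbp_var_histogram`, and the `var_edges` parameter are all hypothetical choices, not the authors' implementation.

```python
def lbp_var_histogram(img, var_edges):
    """Joint histogram of 8-neighbour LBP codes and binned local variance.

    img: 2-D list of grayscale values; var_edges: ascending variance bin edges.
    Returns a 256 x (len(var_edges) + 1) nested list of counts.
    """
    # Clockwise 8-neighbourhood at radius 1 (assumed sampling scheme).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    n_var_bins = len(var_edges) + 1
    hist = [[0] * n_var_bins for _ in range(256)]
    rows, cols = len(img), len(img[0])
    # Border pixels are skipped because they lack a full neighbourhood.
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = img[r][c]
            neigh = [img[r + dr][c + dc] for dr, dc in offsets]
            # LBP code: one bit per neighbour that is >= the centre pixel.
            code = sum(1 << i for i, v in enumerate(neigh) if v >= center)
            # VAR: variance of the neighbour intensities (local contrast).
            mean = sum(neigh) / 8
            var = sum((v - mean) ** 2 for v in neigh) / 8
            # Bin index = number of edges the variance reaches or exceeds.
            vbin = sum(1 for e in var_edges if var >= e)
            hist[code][vbin] += 1
    return hist


# Toy 3x3 patch: a single interior pixel with one brighter neighbour.
patch = [[10, 0, 0],
         [0, 5, 0],
         [0, 0, 0]]
hist = lbp_var_histogram(patch, var_edges=[1.0, 10.0])
print(hist[1])
```

In a pixel-wise classifier, a histogram like this would typically be computed over a window around each pixel, flattened, and passed as a feature vector to the classifier; the window size is another parameter the abstract leaves unspecified.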

Keywords

Image analysis; Prostate cancer; Tissue classification; Machine learning