Towards Large-Scale Histopathological Image Analysis: Hashing-Based Image Retrieval
Automatic analysis of histopathological images has been widely pursued using computational image-processing methods and modern machine learning techniques. Both computer-aided diagnosis (CAD) and content-based image-retrieval (CBIR) systems have been successfully developed for diagnosis, disease detection, and decision support in this area. Large-scale, data-driven methods have emerged that promise to bridge the semantic gap between images and diagnostic information. We develop a scalable image-retrieval technique to cope with massive collections of histopathological images. Specifically, we propose a supervised kernel hashing technique that uses a small amount of supervised information during learning to compress a 10,000-dimensional image feature vector into only tens of binary bits while preserving its informative signature. These binary codes are then indexed into a hash table that enables real-time retrieval of images from a large database. Critically, the supervised information is what bridges the semantic gap between low-level image features and high-level diagnostic information. We build a scalable image-retrieval framework based on this supervised hashing technique and validate its performance on several thousand histopathological images acquired from breast microscopic tissue.
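The encode-then-index pipeline described above can be illustrated with a minimal Python sketch. It is not the supervised method itself: the projection weights here are random placeholders, whereas the proposed technique learns them from labeled data. The class name `KernelHashIndex`, the RBF kernel, the anchor count, the code length, and the synthetic 64-dimensional features are all assumptions chosen to keep the example small.

```python
import numpy as np
from collections import defaultdict


def rbf_kernel(X, anchors, gamma):
    """RBF kernel values between each row of X and each anchor."""
    d2 = (np.sum(X ** 2, axis=1)[:, None]
          + np.sum(anchors ** 2, axis=1)[None, :]
          - 2.0 * X @ anchors.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


class KernelHashIndex:
    """Hypothetical index: kernelized binary codes stored in a hash table."""

    def __init__(self, anchors, weights, gamma):
        self.anchors = anchors    # (m, d) anchor samples
        self.weights = weights    # (m, r) projections; random here (assumption),
                                  # learned with supervision in the paper
        self.gamma = gamma
        self.mean = np.zeros(anchors.shape[0])
        self.table = defaultdict(list)

    def encode(self, X):
        # Centered kernel features, projected and thresholded to r bits.
        K = rbf_kernel(X, self.anchors, self.gamma) - self.mean
        return (K @ self.weights > 0).astype(np.uint8)

    def fit_index(self, X):
        # Estimate the kernel mean, then bucket every image by its code.
        self.mean = rbf_kernel(X, self.anchors, self.gamma).mean(axis=0)
        for i, bits in enumerate(self.encode(X)):
            self.table[bits.tobytes()].append(i)

    def query(self, x):
        # Constant-time lookup of images sharing the query's exact code.
        bits = self.encode(x[None, :])[0]
        return self.table.get(bits.tobytes(), [])


# Synthetic stand-in for image features (assumption: 200 images, 64 dims).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))
anchors = X[rng.choice(200, size=32, replace=False)]
weights = rng.normal(size=(32, 16))   # 16-bit codes
index = KernelHashIndex(anchors, weights, gamma=1.0 / 64)
index.fit_index(X)
candidates = index.query(X[0])        # database images in the same bucket
```

This sketch only returns images whose code matches the query exactly. A practical system would typically also probe buckets within a small Hamming distance of the query code and re-rank the candidates.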