Digital retinal images are widely used in the diagnosis and treatment of eye disorders such as glaucoma, diabetic retinopathy, and age-related macular degeneration. In most cases, this requires image registration, which is the process of aligning two or more images taken at different times, from different viewpoints, or by different sensors. For this purpose, feature correspondences are first established between the two images, and a geometric alignment is then computed from those correspondences. Accurate, real-time retinal image registration (RIR) remains a challenging problem in the presence of high-resolution images, small overlap between images, large morphological and intensity changes, low contrast, and retinal disease. Recently, several effective state-of-the-art approaches based on the well-known SIFT algorithm have been introduced for RIR. However, these algorithms fail to register retinal images when several of the aforementioned challenges occur together, and they also incur a high computational cost.
Motivated by the need for efficient RIR, the main contribution of this research is the development of automatic frameworks that tackle, in a general and principled way, the problems arising in the construction of multimodal, temporal, and different-field-of-view RIR systems. Specifically, the core of the proposed RIR frameworks is an approach for detecting feature locations that are suitable for establishing correspondences in high-resolution multimodal and unimodal retinal images. Inspired by the strong performance of the SIFT algorithm, this work extends much previous work on detecting feature locations for RIR. Robust integrated outlier-rejection methods are also presented to remove the large numbers of mismatches caused by repetitive patterns and the complex characteristics of retinal images. On the whole, the minor contributions at each stage of the proposed RIR frameworks are significant to the success of the registration process. Beyond point features, the human visual system attends to objects rather than points when finding matches in overlapping areas. Thus, to model human viewing behavior, the retinal scene is segmented into regions that can be used for feature matching. Compared with point-based approaches, this method is faster, more accurate, and simpler to implement.
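To make the outlier-rejection stage concrete, the sketch below shows a generic RANSAC scheme for pruning mismatched point correspondences under an affine motion model. This is only an illustration of the standard technique, not the specific integrated rejection methods proposed in this research; the function names, the affine model, and all parameter values (iteration count, inlier tolerance) are assumptions for the example.

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares affine fit: dst ~ A[:, :2] @ src + A[:, 2]."""
    n = src.shape[0]
    M = np.zeros((2 * n, 6))
    b = dst.reshape(-1)
    M[0::2, 0:2] = src   # x-row: a11, a12, tx
    M[0::2, 2] = 1.0
    M[1::2, 3:5] = src   # y-row: a21, a22, ty
    M[1::2, 5] = 1.0
    p, *_ = np.linalg.lstsq(M, b, rcond=None)
    return p.reshape(2, 3)

def ransac_affine(src, dst, n_iters=200, tol=2.0, seed=0):
    """Reject outlier matches: fit on minimal 3-point samples,
    keep the model with the most inliers, then refit on all inliers."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(src), size=3, replace=False)
        A = estimate_affine(src[idx], dst[idx])
        pred = src @ A[:, :2].T + A[:, 2]
        inliers = np.linalg.norm(pred - dst, axis=1) < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return estimate_affine(src[best_inliers], dst[best_inliers]), best_inliers
```

In a full pipeline, `src` and `dst` would be the matched keypoint coordinates from the two retinal images; the surviving inlier correspondences then drive the final geometric alignment.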
The datasets used in this research include high-resolution temporal and multimodal images with large content and intensity changes, as well as small-overlap image pairs with low contrast and little texture in the overlapping areas. Quantitative and qualitative experimental results on several retinal image datasets, covering different resolutions, a wide variety of diseases, and different image modalities, show that the approaches proposed in this research outperform similar methods in terms of efficiency, positional accuracy, and speed.