Romanian Journal of Information Technology and Automatic Control / Vol. 34, No. 2, 2024


Harnessing the power of vision transformers for enhanced OCT image classification

Elena-Anca PARASCHIV, Alina-Elena SULTANA

Abstract:

The rising prevalence of eye disorders underscores the need for faster detection of retinal diseases. Early and accurate classification of these conditions is crucial for timely diagnosis and effective treatment, particularly in critical cases. Recent advances in retinal imaging have improved the diagnosis and management of Choroidal Neovascularization (CNV), Diabetic Macular Edema (DME), and Drusen. Deep learning-based applications on Optical Coherence Tomography (OCT) images have further advanced the field by enabling automated, precise, and efficient disease classification, paving the way for earlier interventions and improved patient outcomes. This study investigates the use of Vision Transformers (ViTs) and Convolutional Neural Networks (CNNs) for automated retinal disease classification. Three models were implemented: ViT, DeepViT, and a hybrid model combining ResNet50 with ViT. All three were trained and evaluated on a publicly available OCT dataset. The hybrid model achieved the highest accuracy, 99.97%, owing to its ability to capture both local and global features. This study underscores the potential of ViTs in medical image analysis and of their integration with CNNs for developing accurate, robust, and scalable diagnostic tools that show great promise for clinical applications.
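
As a rough illustration of how a plain ViT and a ResNet50-ViT hybrid differ in what the transformer receives, the sketch below counts input tokens in each case. It is not taken from the paper: the 224-pixel input, the 16-pixel patch size, and the function names are assumptions chosen for illustration. The only architectural fact relied on is that a standard ResNet50 downsamples its input by a total factor of 32 before pooling.

```python
def vit_token_count(image_size: int, patch_size: int) -> int:
    """Number of patch tokens a plain ViT sees for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


def hybrid_token_count(image_size: int, backbone_stride: int = 32) -> int:
    """Number of tokens when each cell of a CNN feature map becomes one token.

    backbone_stride=32 is the total downsampling of a standard ResNet50
    before its global pooling layer.
    """
    if image_size % backbone_stride != 0:
        raise ValueError("image_size must be divisible by backbone_stride")
    per_side = image_size // backbone_stride
    return per_side * per_side


if __name__ == "__main__":
    size = 224  # assumed input resolution, not stated in the abstract
    print("plain ViT tokens:", vit_token_count(size, patch_size=16))
    print("hybrid tokens:", hybrid_token_count(size))
```

Under these assumptions, each hybrid token summarizes a larger image region that has already passed through convolutional layers, which is one way to read the abstract's remark about combining local and global features.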

Keywords:
Vision Transformers (ViTs), OCT, Image Classification, Convolutional Neural Networks (CNNs), retina.

CITE THIS PAPER AS:
Elena-Anca PARASCHIV, Alina-Elena SULTANA, "Harnessing the power of vision transformers for enhanced OCT image classification", Romanian Journal of Information Technology and Automatic Control, ISSN 1220-1758, vol. 34(2), pp. 97-111, 2024. https://doi.org/10.33436/v34i2y202408