IEViT: an enhanced vision transformer architecture for chest X-ray image classification

  • Gabriel Iluebe Okolo
  • Stamos Katsigiannis*
  • Naeem Ramzan
  * Corresponding author for this work

    Research output: Contribution to journal, Article, peer-review

    Abstract

    Background and Objective: Chest X-ray imaging is a relatively cheap and accessible diagnostic tool that can assist in the diagnosis of various conditions, including pneumonia, tuberculosis, COVID-19, and others. However, the requirement for expert radiologists to view and interpret chest X-ray images can be a bottleneck, especially in remote and deprived areas. Recent advances in machine learning have made possible the automated diagnosis of chest X-ray scans. In this work, we examine the use of a novel Transformer-based deep learning model for the task of chest X-ray image classification. 

    Methods: We first examine the performance of the state-of-the-art Vision Transformer (ViT) image classification model on the task of chest X-ray image classification. We then propose and evaluate the Input Enhanced Vision Transformer (IEViT), a novel enhanced Vision Transformer model that can achieve improved performance on chest X-ray images associated with various pathologies.

    Results: Experiments on four chest X-ray image data sets containing various pathologies (tuberculosis, pneumonia, COVID-19) demonstrated that the proposed IEViT model outperformed ViT for all the data sets and variants examined. Across the four data sets, IEViT achieved an F1-score between 96.39% and 100%, an improvement over ViT of up to +5.82%. IEViT's maximum sensitivity (recall) ranged between 93.50% and 100%, an improvement over ViT of up to +3%, whereas its maximum precision ranged between 97.96% and 100%, an improvement over ViT of up to +6.41%.
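
    The three metrics reported in the Results (precision, sensitivity/recall, and F1-score) follow from the counts of true positives, false positives, and false negatives. The snippet below is a minimal sketch of the standard definitions, not the authors' evaluation code; the function name and the example counts are hypothetical and chosen purely for illustration.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from confusion-matrix counts.

    tp: true positives, fp: false positives, fn: false negatives.
    Any metric whose denominator is zero is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # also called sensitivity
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for illustration only (not taken from the paper).
p, r, f1 = precision_recall_f1(tp=95, fp=2, fn=5)
print(f"precision={p:.2%} recall={r:.2%} F1={f1:.2%}")
```

    Because F1 is a harmonic mean, it is pulled towards the lower of the two values, which is why a model needs both high precision and high recall to score well on it.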

    Conclusions: The proposed IEViT model outperformed all ViT variants on all the examined chest X-ray image data sets, demonstrating its superiority and generalisation ability. Given the relatively low cost and widespread accessibility of chest X-ray imaging, the proposed IEViT model can potentially offer a powerful yet relatively cheap and accessible method for assisting diagnosis using chest X-ray images.

    Original language: English
    Article number: 107141
    Number of pages: 11
    Journal: Computer Methods and Programs in Biomedicine
    Volume: 226
    Early online date: 16 Sept 2022
    Publication status: Published - 30 Nov 2022

    UN SDGs

    This output contributes to the following UN Sustainable Development Goals (SDGs)

    1. SDG 3 - Good Health and Well-being

    Keywords

    • chest radiography
    • deep learning
    • image classification
    • vision transformer
    • X-rays
