Deep learning-based methods have achieved significant improvements in the accuracy of diagnosing lung diseases from chest X-rays. However, their black-box nature and lack of interpretability reduce physicians' confidence in the reliability of machine-generated decisions, which consistently limits their application in clinical practice. In this paper, we propose VProtoNet, a novel interpretable deep learning model that produces heatmaps highlighting diagnostically important image features of lung diseases and revealing how the model makes decisions based on them. VProtoNet generates its heatmaps by comparing the features extracted by a Vision Transformer with prototypes learned within the model, each of which represents a typical part of a chest X-ray image. Furthermore, we reduce each heatmap to a single similarity score that serves as the basis for the model's classification diagnosis. To verify the effectiveness of our model, we applied our method to the ChestX-ray14 dataset and achieved an accuracy of 72.35%. We also analyzed the feature maps generated by our model during classification, finding that they intuitively demonstrate the model's recognition and understanding of diseased areas, which enables physicians to better comprehend the model's decision-making process.
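The prototype-comparison step the abstract describes can be illustrated with a minimal sketch. This is not VProtoNet itself (the model's internals are not given here); it assumes a ProtoPNet-style scheme in which each spatial patch embedding from the Vision Transformer is scored against every learned prototype by cosine similarity, and each resulting heatmap is collapsed to a single score by global max pooling. All shapes and function names below are illustrative assumptions.

```python
import numpy as np

def prototype_heatmaps(patch_features, prototypes):
    """Cosine similarity between every spatial patch feature and each prototype.

    patch_features: (H, W, D) array of ViT patch embeddings (assumed layout).
    prototypes:     (P, D) array of learned prototype vectors.
    Returns:        (P, H, W) heatmaps with values in [-1, 1].
    """
    # L2-normalize both sides so the dot product is a cosine similarity.
    f = patch_features / (np.linalg.norm(patch_features, axis=-1, keepdims=True) + 1e-8)
    p = prototypes / (np.linalg.norm(prototypes, axis=-1, keepdims=True) + 1e-8)
    return np.einsum("pd,hwd->phw", p, f)

def similarity_scores(heatmaps):
    """Collapse each (H, W) heatmap to one scalar via global max pooling,
    mirroring the abstract's reduction of a heatmap to a single score."""
    return heatmaps.max(axis=(1, 2))
```

Under this sketch, the per-prototype scores would feed a final linear classification layer, while the heatmaps themselves remain inspectable by a physician.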
Most existing super-resolution (SR) methods are trained in a fully supervised manner with massive training samples and assume that the degradation mapping high-resolution images to their low-resolution counterparts is fixed. However, the degradation process of real-world images is often more complex, so these models often perform poorly on low-resolution (LR) images with unknown degradation. In addition, runtime is an important factor in deploying image super-resolution models, especially on devices with limited resources, yet the runtime of existing zero-shot blind super-resolution models is far from ideal. In this paper, we propose an efficient zero-shot network (EZSN) for blind super-resolution. Specifically, we propose a high-frequency feature extraction block (HFFEB), which speeds up network inference by stacking highly optimized convolution and activation layers and reducing the use of feature fusion. We also propose an enhanced residual block (ERB) that extracts more features to improve model performance. Experimental results indicate that, compared with previous ZSSR methods, our EZSN has a significant runtime advantage on benchmark datasets with unknown degradation.
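The enhanced residual block (ERB) is not specified in the abstract beyond its name, but it builds on the generic residual pattern (conv, activation, conv, identity skip) common to SR networks. The sketch below shows that pattern for a single-channel feature map using a hand-rolled "same"-padded 3x3 convolution; the function names, kernel shapes, and single-channel simplification are all assumptions made for illustration, not the paper's actual block.

```python
import numpy as np

def conv3x3(x, kernel):
    """'Same'-padded 3x3 cross-correlation on a 2-D array (single channel)."""
    H, W = x.shape
    xp = np.pad(x, 1)  # zero-pad one pixel on every side
    out = np.zeros_like(x, dtype=float)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * xp[i:i + H, j:j + W]
    return out

def residual_block(x, k1, k2):
    """Conv -> ReLU -> Conv, then add the input back (identity skip).
    A generic residual block; the paper's ERB presumably refines this
    pattern, but its exact structure is not given in the abstract."""
    h = np.maximum(conv3x3(x, k1), 0.0)  # ReLU activation
    return x + conv3x3(h, k2)
```

Because the skip connection is a plain addition, the block preserves the input resolution, which is one reason stacked residual blocks remain cheap enough for the runtime-sensitive zero-shot setting the abstract targets.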