Yaw Afriyie (PhD), a lecturer at the Faculty of Information and Communication Technology (FICT) of the SD Dombo University of Business and Integrated Development Studies, Wa, has contributed to improving computer-aided diagnosis of diseases using deep learning techniques.
His research findings were presented at the University of Energy and Natural Resources as part of his PhD defense, held on November 2, 2023, at the French Multimedia Lab.
In presenting his research findings, Yaw Afriyie stated that machine learning is a vast area of research with applications in various fields, including computer vision, bioinformatics, natural language processing, and many others. Convolutional neural networks (CNNs) are among the most effective deep learning methodologies. Despite their success, CNNs suffer from a few limitations, including the invariance introduced by pooling and the inability to understand spatial relationships between features. Capsule networks (CapsNets) can potentially reduce these issues by storing and routing pose information associated with extracted features through their architectures, seeking agreement between lower-level predictions at each layer and those at the higher level.
While CapsNets have fundamental advantages over CNNs, they also have drawbacks that prevent widespread adoption. The memory bottleneck caused by vector-valued activations, combined with the iterative nature of capsule routing algorithms, results in inefficient models. This is especially true when working with complex images such as those in CIFAR-10 or images with varying backgrounds, because CapsNets tend to extract features from all areas of the input image.
Additionally, CapsNets are susceptible to underfitting or overfitting if the number of routing iterations is not properly set, and the lack of training data compounds this problem. Deep learning models are still considered “black boxes,” requiring more explanation before practical adoption to earn the industry’s trust. Thus, CapsNet solutions may not be adopted in critical application areas such as health and agriculture unless users can interpret and explain the operational principles behind the models. To address these problems in CapsNets, this thesis compares the performance of Sabour’s model (i.e., the baseline model) and state-of-the-art models on complex datasets such as CIFAR-10.
Further, this thesis proposes a customized squash function that is sensitive to small changes within short intervals. This improves feature extraction by extracting only relevant textural features from input images, and it increases convergence speed and classification accuracy. Implementing max-pooling in the optimized architecture reduces the number of trainable parameters and helps the model generalize better to complex images such as blood cells. Consequently, shallow CapsNets can outperform deep multi-lane models on complex images. A denoising technique is implemented in the CapsNet layers to suppress noise in complex images and produce more relevant features. Additionally, the separability of the clusters formed at the class capsule layer, assessed through visualizations, is used to evaluate the quality of the routing process. This thesis contributes to demystifying the “black box” concept and provides some transparency into how the models operate. The results indicate that small models can outperform deep models in computer vision research and converge within a few epochs when the right feature extractors are used. This study presents significant algorithmic advancements in capsule network applications, particularly for computer-aided diagnosis of biomedical images. In the future, capsule networks will play an increasingly important role in deep learning applications, and the contributions made in this thesis provide a solid foundation upon which further advances can be made.
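For readers unfamiliar with the two mechanisms the abstract refers to, the squash function and iterative routing by agreement, the following is a minimal sketch in plain Python. It follows the general form used in the baseline CapsNet (Sabour's model) and is not the customized squash function or the optimized architecture proposed in the thesis, whose details are not given here. The names `squash`, `softmax`, `route`, `u_hat`, and `num_iterations` are illustrative assumptions.

```python
import math


def squash(vec, eps=1e-9):
    """Baseline-style squash: keep the direction, map the length into [0, 1)."""
    sq_norm = sum(x * x for x in vec)
    # eps avoids division by zero for an all-zero input vector.
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm + eps)
    return [scale * x for x in vec]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(u_hat, num_iterations=3):
    """Routing by agreement (sketch).

    u_hat[i][j] is the prediction vector that lower capsule i makes
    for upper capsule j. Returns the upper capsule outputs and the
    coupling coefficients used in the last iteration.
    """
    if num_iterations < 1:
        raise ValueError("num_iterations must be at least 1")
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits start at zero
    for _ in range(num_iterations):
        # Each lower capsule distributes its output over the upper capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            s = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                 for d in range(dim)]
            v.append(squash(s))
        # Agreement (dot product) between prediction and output raises the logit.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c


# Two lower capsules agree about upper capsule 0 and disagree about capsule 1.
u_hat = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, -1.0]],
]
outputs, couplings = route(u_hat, num_iterations=3)
lengths = [math.sqrt(sum(x * x for x in v)) for v in outputs]
```

In this toy input, the capsule on which the predictions agree ends up with the longer output vector and receives the larger coupling coefficients. The `num_iterations` argument is the routing-iteration setting that the abstract describes as needing careful tuning.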