CleanVisionMed: Trust-Based Data Quality Assessment for Dermatological Image Classification

Authors

  • Nurlan Aghazada, Azerbaijan State Oil and Industry University, Department of Mechatronics and Robotics, Master's student

Keywords:

skin lesion classification, data quality, trust score, label noise, image sharpness

Abstract

Accurate skin lesion classification relies not only on powerful neural networks but also on the quality of the images and the trustworthiness of their labels. Publicly available collections like DermaMNIST are invaluable, yet they inevitably contain mislabeled cases and blurred or low-contrast images that can misdirect a learning algorithm. In this work, we present CleanVisionMed, a lightweight and fully automated pipeline designed to assess and elevate data quality through a unified trust score, with the ultimate goal of boosting model robustness without resorting to manual curation or complicated training tricks.

Our framework fuses two simple but complementary measures. First, we leverage CleanLab’s label‐noise detection to flag examples that carry a high risk of misannotation. Second, we compute a Laplacian‐blur metric on each image to quantify visual sharpness, catching out-of-focus or noisy photographs. By normalizing and combining these signals into a single trust score, we rank all training samples from most to least reliable. Rather than discarding entire categories or hand‐picking images, CleanVisionMed filters out only those samples that fall below a chosen trust‐score threshold, thereby retaining as much useful data as possible.
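The fusion described above can be sketched in a few lines. This is a minimal, self-contained illustration, not the authors' exact implementation: the Laplacian is applied with a plain 3x3 kernel (in practice `cv2.Laplacian` would be used), and the label-quality signal is approximated by simple self-confidence, where CleanLab's `get_label_quality_scores` provides more refined variants. The equal 0.5/0.5 weighting of the two normalized signals is an assumption.

```python
import numpy as np

def sharpness_score(img):
    """Variance of the Laplacian response: low values indicate blur.
    Plain 3x3 Laplacian kernel; cv2.Laplacian(img, cv2.CV_64F).var()
    is the usual one-liner in practice."""
    k = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * k)
    return out.var()

def label_quality(pred_probs, labels):
    """Self-confidence proxy: model probability assigned to the given
    label. CleanLab's rank.get_label_quality_scores offers refined
    variants of this idea."""
    return pred_probs[np.arange(len(labels)), labels]

def trust_scores(pred_probs, labels, images):
    """Min-max normalize both signals to [0, 1] and average them
    (equal weighting is an assumption here)."""
    sharp = np.array([sharpness_score(im) for im in images])
    def norm(x):
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng > 0 else np.ones_like(x)
    return 0.5 * norm(label_quality(pred_probs, labels)) + 0.5 * norm(sharp)
```

Ranking the training set by this combined score then reduces filtering to a single threshold comparison.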

To test this approach, we worked exclusively with the DermaMNIST training set, which contains 7,007 images spanning seven skin condition classes. We defined thresholds between 0.00 (no filtering) and 0.80, removing progressively larger fractions of the dataset. At each level, a ResNet-18 model was trained for ten epochs, and performance was assessed on a fixed validation split. Key metrics—number of samples retained, validation accuracy, and final training loss—were recorded to reveal how trust‐based pruning shapes learning.
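The filtering step of this sweep is a one-line threshold comparison per sample; a minimal sketch follows. The threshold grid mirrors the one in the text, but the function names are illustrative, and the actual pipeline retrains a ResNet-18 on each retained subset rather than merely counting samples.

```python
import numpy as np

def filter_by_trust(trust, threshold):
    """Indices of samples whose trust score meets the threshold."""
    return np.flatnonzero(trust >= threshold)

def sweep(trust, thresholds=(0.0, 0.4, 0.6, 0.7, 0.8)):
    """Report how many samples each threshold retains. In the full
    pipeline, each retained subset is used to train a ResNet-18 for
    ten epochs and evaluate on a fixed validation split."""
    return {t: len(filter_by_trust(trust, t)) for t in thresholds}
```

Because the trust scores are fixed once computed, the sweep only re-selects indices; no per-threshold rescoring is needed.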

The results showed a nuanced balance between quantity and quality. Mild filtering at thresholds up to 0.40 had little impact, keeping accuracy in the low-30 percent range. At a threshold of 0.70, however, validation accuracy jumped to 42.23%, up from 31.58% with no filtering—a gain of more than ten percentage points. Remarkably, this boost came with only a 40% reduction in data, leaving 4,228 highly trustworthy images to guide training. Raising the bar to 0.80 retained just under 3,840 samples but yielded no further accuracy gain, indicating diminishing returns beyond 0.70. By contrast, the 0.60 threshold removed too much informative data and dropped accuracy to 16.23%, illustrating how easily a model can underfit when useful samples are discarded alongside noisy ones.

To ensure these gains were not confined to the validation set, we evaluated the model trained at the 0.70 threshold on the held‐out test split. Its performance—between 41% and 42%—closely matched validation results, confirming that CleanVisionMed’s benefits extend to unseen data. Across all thresholds, training loss tracked expected trends: moderate filtering reduced confusion during fitting, while extreme data removal led to erratic learning and higher final losses.

CleanVisionMed's chief strength is its simplicity. By focusing training on the most reliable images and labels, it lets practitioners achieve substantial performance gains with minimal extra effort. This is especially valuable in medical imaging, where annotating and verifying large datasets is time-consuming and costly. Rather than investing in more complex model architectures or large-scale data collection, researchers can deploy a trust-score filter as a first line of defense against noisy inputs.

In short, our results show that a simple, trust-based filtering scheme can yield double-digit improvements in dermatological image classification on DermaMNIST. CleanVisionMed offers a scalable, domain-agnostic recipe for improving data quality, making medical AI systems more reliable and useful without extensive manual effort.

Published

2025-06-02

How to Cite

Nurlan Aghazada. (2025). CleanVisionMed: Trust-Based Data Quality Assessment for Dermatological Image Classification. Modern Scientific Technology, (10). Retrieved from https://ojs.publisher.agency/index.php/MSC/article/view/6347

Issue

Section

Technical science