Title of Paper:Masked Diffusion Meets 3D: Cross-Modal Learning for Precise Anomaly Detection
Journal:IEEE International Conference on Multimedia & Expo
Key Words:Multi-modal Anomaly Detection, Diffusion Reconstruction, Cross-modal Learning
Abstract:In industrial anomaly detection, the combination of 3D point clouds with RGB images has become a research focus. Existing methods, often based on multimodal fusion or reconstruction, struggle to generalize in complex industrial scenarios and fail to reconstruct high-quality normal objects, limiting their applicability. In this paper, we propose a framework called MCAD. First, we use a condition-guided diffusion process with masked reconstruction to obtain high-quality RGB features. Second, we train a lightweight network to learn 3D point-cloud features from RGB data. Finally, anomalies are detected by fusing intra-modal reconstruction errors with cross-modal learning errors, achieving a trade-off among inference accuracy, computational efficiency, and memory footprint. To the best of our knowledge, our study is the first to evaluate a diffusion framework on multimodal anomaly detection datasets, achieving state-of-the-art performance with image-level AUROC scores of 96.4% and 91.2% on the MVTec 3D and Eyecandies datasets, respectively, marking a significant advance in diffusion-based anomaly detection.
Discipline:Engineering
Document Type:C
Number of Words:3000
Translation or Not:no
Included Journals:SCI
Correspondence Author:Aimin Feng