Title of Paper:Masked Diffusion Meets 3D: Cross-Modal Learning for Precise Anomaly Detection
Journal:IEEE International Conference on Multimedia & Expo
Key Words:Multi-modal Anomaly Detection, Diffusion Reconstruction, Cross-modal Learning
Abstract:In industrial anomaly detection, the combination of 3D point clouds with RGB images has become a research focus. Existing methods, often based on multimodal fusion or reconstruction, struggle to generalize in complex industrial scenarios and fail to reconstruct high-quality normal objects, limiting their applicability. In this paper, we propose a framework called MCAD. First, we use a condition-guided diffusion process with masked reconstruction to obtain high-quality RGB features. Second, we train a lightweight network to learn 3D point-cloud features from RGB data. Finally, anomalies are detected by fusing intra-modal reconstruction errors with cross-modal learning errors, achieving a trade-off among inference accuracy, computational efficiency, and memory footprint. To the best of our knowledge, our study is the first to evaluate a diffusion framework on multimodal anomaly detection datasets, achieving state-of-the-art performance with image-level AUROC scores of 96.4% and 91.2% on the MVTec 3D and Eyecandies datasets, respectively, marking a significant advance in diffusion-based anomaly detection.
Discipline:Engineering
Document Type:C
Number of Words:3000
Translation or Not:no
Included Journals:SCI
Correspondence Author:Aimin Feng