Personal information
王东生
Degree: Doctor of Science


Education: Nanjing University
Affiliation: School of Science

Deep memory network for cross-modal retrieval

Affiliation: School of Computer Science and Technology / School of Artificial Intelligence / School of Software
Journal: IEEE Transactions on Multimedia
ISSN: 1520-9210
Publication date: 2019-05-01
Co-authors: Song, Ge; Wang, Dong; 谭晓阳
Corresponding authors: Song, Ge; 王东生

Abstract: With the explosive growth of multimedia data on the Internet, cross-modal retrieval has attracted a great deal of attention in both the computer vision and multimedia communities. However, this task is challenging due to the heterogeneity gap between different modalities. Current approaches typically involve a common representation learning process that maps data from different modalities into a common space by linear or nonlinear embedding. Yet most of them handle only the dual-modal situation and generalize poorly to complex cases that involve multiple modalities. In addition, they often require expensive fine-grained alignment of training data among diverse modalities. In this paper, we address these issues with a novel cross-modal memory network (CMMN), in which memory contents across modalities are learned simultaneously, end to end, without the need for exact alignment. We further account for the diversity across multiple modalities using the strategy of adversarial learning. Extensive experimental results on several large-scale datasets demonstrate that the proposed CMMN approach achieves state-of-the-art performance in the task of cross-modal retrieval.