  • 袁家斌 (Professor)

    Personal homepage: http://faculty.nuaa.edu.cn/yjb1/zh_CN/index.htm

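  • Professor, Doctoral Supervisor
  • Admission disciplines:
    Computer Science and Technology -- [admits doctoral and master's students] -- College of Computer Science and Technology
    Software Engineering -- [admits doctoral and master's students] -- College of Computer Science and Technology
    Cyberspace Security -- [admits doctoral and master's students] -- College of Computer Science and Technology
    Electronic Information -- [admits master's students] -- College of Computer Science and Technology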
Publications
Attention-based bidirectional recurrent neural networks for description generation of videos

Affiliation: College of Computer Science and Technology / College of Artificial Intelligence / College of Software
Published in: Lect. Notes Comput. Sci.
Abstract: Describing videos in human language is vital to many applications, such as managing massive volumes of online video and providing descriptive video service (DVS) for blind people. To improve on existing video description frameworks, this paper presents an end-to-end deep learning model that combines Convolutional Neural Networks (CNNs) and Bidirectional Recurrent Neural Networks (BiRNNs) with a multimodal attention mechanism. First, the model produces richer video representations than similar work, including image, motion, and audio features. Second, the BiRNN encodes these features in both the forward and backward directions. Finally, an attention-based decoder translates the sequential outputs of the encoder into a sequence of words. The model is evaluated on the Microsoft Research Video Description Corpus (MSVD) dataset. The results demonstrate the necessity of combining BiRNNs with a multimodal attention mechanism and the superiority of this model over other state-of-the-art methods evaluated on this dataset. © Springer Nature Switzerland AG 2018.
ISSN: 0302-9743
Translated work: No
Publication date: 2018-01-01
Co-authors: Du, Xiaotong; 刘晖
Corresponding authors: Du, Xiaotong; 袁家斌
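
The pipeline summarized in the abstract (per-frame features, bidirectional encoding, attention over the encoder outputs) can be illustrated with a minimal pure-Python sketch. This is not the authors' model. The plain tanh RNN cell, the dot-product attention score, the random weights, the toy dimensions, and all function names are assumptions chosen for illustration. The abstract does not specify the cell type, the attention form, or how the image, motion, and audio features are fused, so the sketch simply treats each frame as one concatenated feature vector.

```python
import math
import random


def rnn_pass(inputs, w_in, w_rec):
    """Run a plain tanh RNN over a list of feature vectors; return every hidden state."""
    hidden_size = len(w_rec)
    h = [0.0] * hidden_size
    states = []
    for x in inputs:
        h = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * h[k] for k in range(hidden_size))
            )
            for j in range(hidden_size)
        ]
        states.append(h)
    return states


def bidirectional_encode(inputs, fwd, bwd):
    """Encode the sequence in both directions and concatenate the states per time step."""
    forward = rnn_pass(inputs, *fwd)
    # Run the backward RNN on the reversed sequence, then restore the time order.
    backward = rnn_pass(inputs[::-1], *bwd)[::-1]
    return [f + b for f, b in zip(forward, backward)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, encoder_states):
    """Dot-product soft attention: weights over time steps plus the weighted context vector."""
    scores = [sum(q * s for q, s in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[d] for w, state in zip(weights, encoder_states))
        for d in range(dim)
    ]
    return weights, context


def random_matrix(rows, cols, rng):
    """Small random weight matrix (stand-in for trained parameters)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
feature_dim, hidden = 6, 4  # toy sizes, not taken from the paper

# Hypothetical per-frame features: image, motion, and audio parts concatenated.
frames = [[rng.random() for _ in range(feature_dim)] for _ in range(5)]

fwd = (random_matrix(hidden, feature_dim, rng), random_matrix(hidden, hidden, rng))
bwd = (random_matrix(hidden, feature_dim, rng), random_matrix(hidden, hidden, rng))
encoded = bidirectional_encode(frames, fwd, bwd)

# A stand-in for the decoder's current hidden state, used as the attention query.
query = [rng.uniform(-1.0, 1.0) for _ in range(2 * hidden)]
weights, context = attend(query, encoded)

print("attention weights:", [round(w, 3) for w in weights])
```

In a full decoder, the context vector would be recomputed at every output step and combined with the previous word to predict the next one. That step is omitted here.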

 
