School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China
This paper proposes a Bidirectional Multimodal Fusion Network (BMFNet) for image captioning. BMFNet provides deep interaction and fusion of multiple features throughout both the encoding and decoding stages, which the authors report significantly enhances the model's cross-modal reasoning capability.