Journal of Engineering and Applied Sciences

Year: 2019
Volume: 14
Issue: 2
Page No. 610 - 614

A Novel Method for Image Captioning Based on Attributes and External Knowledge

Authors: Maram Adil Ali Alaziz and Suhaam Adnan Abdul Kareem

References

Antol, S., A. Agrawal, J. Lu, M. Mitchell and D. Batra et al., 2015. VQA: Visual question answering. Proceedings of the IEEE International Conference on Computer Vision (ICCV), December 7-13, 2015, IEEE, Santiago, Chile, ISBN:978-1-4673-8390-5, pp: 2425-2433.

Bahdanau, D., K. Cho and Y. Bengio, 2015. Neural machine translation by jointly learning to align and translate. Proceedings of the 2015 International Conference on Learning Representations, May 7-9, 2015, CBLS, San Diego, California, USA., pp: 1-9.

Chen, X. and C.L. Zitnick, 2015. Mind's eye: A recurrent visual representation for image caption generation. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 7-12, 2015, IEEE, Boston, Massachusetts, USA., ISBN:978-1-4673-6964-0, pp: 2422-2431.

Cho, K., B. Van Merrienboer, C. Gulcehre, D. Bahdanau and F. Bougares et al., 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. J. Comput. Lang., 1: 1-15.

Devlin, J., H. Cheng, H. Fang, S. Gupta and L. Deng et al., 2015. Language models for image captioning: The quirks and what works. J. Comput. Lang., 1: 1-6.

Donahue, J., L.A. Hendricks, S. Guadarrama, M. Rohrbach and S. Venugopalan et al., 2015. Long-term recurrent convolutional networks for visual recognition and description. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 2015), June 7-12, 2015, IEEE, Boston, Massachusetts, USA., pp: 2625-2634.

Gao, H., J. Mao, J. Zhou, Z. Huang and L. Wang et al., 2015. Are you Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering. In: Advances in Neural Information Processing Systems, Cortes, C., N.D. Lawrence, D.D. Lee, M. Sugiyama and R. Garnett (Eds.). Curran Associates, Inc., Red Hook, USA., pp: 2296-2304.

Karpathy, A., A. Joulin and L.F. Fei-Fei, 2014. Deep Fragment Embeddings for Bidirectional Image Sentence Mapping. In: Advances in Neural Information Processing Systems, Ghahramani, Z., M. Welling, C. Cortes, N.D. Lawrence and K.Q. Weinberger (Eds.). Curran Associates, New York, USA., pp: 1889-1897.

Krizhevsky, A., I. Sutskever and G.E. Hinton, 2012. ImageNet classification with deep convolutional neural networks. Proc. Neural Inf. Process. Syst., 1: 1097-1105.

LeCun, Y., L. Bottou, Y. Bengio and P. Haffner, 1998. Gradient-based learning applied to document recognition. Proc. IEEE, 86: 2278-2324.

Malinowski, M., M. Rohrbach and M. Fritz, 2015. Ask your neurons: A neural-based approach to answering questions about images. Proceedings of the IEEE International Conference on Computer Vision (ICCV), December 7-13, 2015, IEEE, Santiago, Chile, ISBN:978-1-4673-8390-5, pp: 1-9.

Mao, J., W. Xu, Y. Yang, J. Wang and Z. Huang et al., 2014. Deep captioning with multimodal recurrent neural networks (m-RNN). Proceedings of the International Conference on Learning Representations, June 11, 2015, ICLR, New Orleans, Louisiana, USA., pp: 1-17.

Simonyan, K. and A. Zisserman, 2014. Very deep convolutional networks for large-scale image recognition. J. Comput. Vision Pattern Recognit., 1: 1-14.

Sutskever, I., O. Vinyals and Q.V. Le, 2014. Sequence to Sequence Learning with Neural Networks. In: Advances in Neural Information Processing Systems, Ghahramani, Z., M. Welling, C. Cortes, N.D. Lawrence and K.Q. Weinberger (Eds.). Curran Associates, Inc., Red Hook, New York, USA., pp: 3104-3112.

Szegedy, C., W. Liu, Y. Jia, P. Sermanet and S. Reed et al., 2015. Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, June 7-12, 2015, Boston, MA, USA., pp: 1-9.

Vinyals, O., A. Toshev, S. Bengio and D. Erhan, 2015. Show and tell: A neural image caption generator. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 7-12, 2015, IEEE, Boston, Massachusetts, USA., ISBN:978-1-4673-6963-3, pp: 3156-3164.

Yao, L., A. Torabi, K. Cho, N. Ballas and C. Pal et al., 2015. Describing videos by exploiting temporal structure. Proceedings of the IEEE International Conference on Computer Vision (ICCV), December 7-13, 2015, IEEE, Santiago, Chile, ISBN:978-1-4673-8390-5, pp: 4507-4515.
