Keywords
Image captioning, Deep neural network, VGGNet, RNN-LSTM, CNN
Abstract
Image captioning is the task of automatically generating a caption for an image. As an emerging research field, it is attracting growing attention. To caption an image, its semantic content must be captured and expressed in natural language. The task is difficult because it bridges the research fields of computer vision and natural language processing. Different strategies have been proposed to address this problem. In this paper, we survey developments in image captioning research. We group image captioning approaches into categories based on the strategy they use, and for each category we outline representative approaches along with their advantages and disadvantages. We first review the retrieval-based and template-based strategies that were often employed in earlier research. Then, because they produce state-of-the-art results, we concentrate mostly on neural network-based approaches, which are further subdivided according to the particular framework they employ. Each subcategory of neural network-based approaches is discussed in detail. Following that, state-of-the-art approaches are compared on benchmark datasets. Finally, we discuss potential directions for further study.
Article Details
Unique Paper ID: 157260
Publication Volume & Issue: Volume 9, Issue 6
Page(s): 451 - 463