The primary objective of the image caption generator is to automatically produce a suitable English caption for a given picture. This study presents an image caption generator that identifies the contents of an input image and uses greedy search and beam search to produce an English sentence. A pretrained deep-learning CNN architecture, the Xception model, is used to learn image features, while an LSTM model learns the textual features; the outputs of both are then combined to produce a caption. The LSTM model generates the words, phrases, or captions for the given photos. The system pairs a Convolutional Neural Network with Long Short-Term Memory to build the caption generator: features are extracted from the picture using a pre-trained VGG16 network, and the LSTM acts as a decoder that produces descriptive text for the images. The model is trained to generate descriptive captions for an input picture, and its effectiveness is measured using BLEU scores assigned to the system. The Keras library, NumPy, and Jupyter notebooks are discussed as tools for developing this project. We also discuss the image classification task, how CNNs are employed, and the Flickr dataset.
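As a rough illustration of the feature-extraction step described above, the following sketch loads a pre-trained VGG16 network in Keras and keeps its penultimate fully connected layer as the image encoder; the function name, file-path handling, and preprocessing details are assumptions for illustration, not the paper's exact pipeline.

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from tensorflow.keras.models import Model

# Load VGG16 pre-trained on ImageNet and drop the final classification layer,
# keeping the 4096-dimensional fc2 activations as the image representation.
base = VGG16(weights="imagenet")
encoder = Model(inputs=base.inputs, outputs=base.layers[-2].output)

def extract_features(image_path):
    # VGG16 expects 224x224 RGB inputs.
    img = load_img(image_path, target_size=(224, 224))
    x = img_to_array(img)
    x = preprocess_input(np.expand_dims(x, axis=0))
    return encoder.predict(x, verbose=0)  # shape: (1, 4096)
```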
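A minimal sketch of a merge-style CNN+LSTM captioning model of the kind described in the abstract is shown below, assuming a 4096-dimensional image feature vector, a vocabulary of `vocab_size` words, and captions padded to `max_length` tokens; the layer sizes are illustrative rather than the paper's reported configuration.

```python
from tensorflow.keras.layers import Input, Dense, Dropout, Embedding, LSTM, add
from tensorflow.keras.models import Model

def build_caption_model(vocab_size, max_length, feature_dim=4096):
    # Image branch: project the CNN feature vector into a 256-d space.
    img_in = Input(shape=(feature_dim,))
    img_vec = Dense(256, activation="relu")(Dropout(0.5)(img_in))

    # Text branch: embed the partial caption and encode it with an LSTM.
    txt_in = Input(shape=(max_length,))
    txt_emb = Embedding(vocab_size, 256, mask_zero=True)(txt_in)
    txt_vec = LSTM(256)(Dropout(0.5)(txt_emb))

    # Merge both branches and predict the next word of the caption.
    merged = Dense(256, activation="relu")(add([img_vec, txt_vec]))
    out = Dense(vocab_size, activation="softmax")(merged)

    model = Model(inputs=[img_in, txt_in], outputs=out)
    model.compile(loss="categorical_crossentropy", optimizer="adam")
    return model
```

The merge design keeps the image and text pathways separate until a single addition layer, which is a common choice for Flickr-style captioning models trained with Keras.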
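The greedy decoding loop and the BLEU evaluation mentioned in the abstract can be sketched as follows; the tokenizer, the 'startseq'/'endseq' markers, and the helper names are assumptions for illustration. The beam-search variant would keep the k most probable partial captions at each step instead of only the single best next word.

```python
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences
from nltk.translate.bleu_score import corpus_bleu

def greedy_caption(model, tokenizer, photo_features, max_length):
    # Start from the start-of-sequence marker and append one word at a time.
    caption = "startseq"
    for _ in range(max_length):
        seq = tokenizer.texts_to_sequences([caption])[0]
        seq = pad_sequences([seq], maxlen=max_length)
        probs = model.predict([photo_features, seq], verbose=0)[0]
        word = tokenizer.index_word.get(int(np.argmax(probs)))
        if word is None or word == "endseq":
            break
        caption += " " + word
    return caption

def bleu_scores(references, hypotheses):
    # references: one list of tokenized ground-truth captions per image;
    # hypotheses: one tokenized generated caption per image.
    return {
        "BLEU-1": corpus_bleu(references, hypotheses, weights=(1.0, 0, 0, 0)),
        "BLEU-2": corpus_bleu(references, hypotheses, weights=(0.5, 0.5, 0, 0)),
    }
```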
Article Details
Unique Paper ID: 158887
Publication Volume & Issue: Volume 9, Issue 10
Page(s): 807 - 812