Machine-Generated Captions for Images using Deep Learning

  • Unique Paper ID: 158887
  • Volume: 9
  • Issue: 10
  • PageNo: 807-812
  • Abstract:
  • The picture caption generator automatically produces a suitable English caption for a given picture. This study presents an image caption generator that identifies the contents of an input picture and uses beam search and greedy search to produce an English phrase. A pretrained CNN architecture, the Xception model, is used to learn image features, while an LSTM model is used to learn textual features; the results of both are then integrated to produce a caption. The model was built by combining a Convolutional Neural Network with Long Short-Term Memory. Features are extracted from the picture using a pre-trained version of VGG16, and the LSTM acts as a decoder that generates the descriptive text. The model has been trained to produce descriptive captions for an input picture, and its effectiveness is measured by BLEU scores. The Keras library, NumPy, and Jupyter notebooks are discussed as tools for developing this project. We also discuss the image classification task, how CNNs are employed, and the Flickr dataset.
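
The abstract names greedy search and beam search as the two decoding strategies. The difference can be illustrated with a minimal sketch, assuming a hypothetical toy next-word table (`TOY_MODEL`) in place of the paper's LSTM decoder; in the actual system the probabilities would come from the trained network conditioned on the CNN image features. All names below are illustrative and are not taken from the paper.

```python
import math

# Hypothetical toy "decoder": maps the previous word to next-word probabilities.
# A real captioner would compute these with the LSTM, given the image features.
TOY_MODEL = {
    "<start>": {"a": 0.6, "the": 0.4},
    "a": {"dog": 0.55, "cat": 0.45},
    "the": {"dog": 0.9, "cat": 0.1},
    "dog": {"<end>": 1.0},
    "cat": {"<end>": 1.0},
}


def next_word_probs(sequence):
    """Return the next-word distribution given the words generated so far."""
    return TOY_MODEL[sequence[-1]]


def greedy_search(max_len=10):
    """At every step, keep only the single most probable next word."""
    seq = ["<start>"]
    while seq[-1] != "<end>" and len(seq) < max_len:
        probs = next_word_probs(seq)
        seq.append(max(probs, key=probs.get))
    return seq


def beam_search(beam_width=2, max_len=10):
    """Keep the beam_width best partial captions, scored by summed log-probability."""
    beams = [(0.0, ["<start>"])]
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq[-1] == "<end>":
                # Finished captions are carried over unchanged.
                candidates.append((score, seq))
                continue
            for word, p in next_word_probs(seq).items():
                candidates.append((score + math.log(p), seq + [word]))
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
        if all(seq[-1] == "<end>" for _, seq in beams):
            break
    return beams[0][1]


print("greedy:", " ".join(greedy_search()))
print("beam:  ", " ".join(beam_search(beam_width=2)))
```

In this toy table, greedy search commits to "a" because it is the likeliest first word, while a beam of width 2 also keeps "the" alive and ends with a caption of higher overall probability. With `beam_width=1` the two procedures coincide.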

Cite This Article

  • ISSN: 2349-6002
