Recipe Generation from Food Images Using CNN

  • Unique Paper ID: 180401
  • Volume: 12
  • Issue: 1
  • PageNo: 1413-1420
  • Abstract:

In this paper, we present an end-to-end deep learning system for inverse cooking: generating a complete recipe, including the dish title, a list of ingredients, and detailed cooking instructions, from a single input image of a food item. The core of the system is a Convolutional Neural Network (CNN) that extracts high-level visual features from the food image. These features feed two specialized decoders: the first performs multi-label ingredient prediction, while the second generates a coherent sequence of natural-language instructions describing the cooking process. The model is trained and evaluated on the Recipe1M dataset, a large-scale benchmark of over one million recipes paired with corresponding images. The CNN encoder is based on the ResNet-50 architecture, pre-trained on ImageNet and fine-tuned for food-specific visual understanding. The ingredient decoder outputs a set of probable ingredients using sigmoid-based classification, while the instruction decoder generates the procedural steps using a sequence-to-sequence language model with attention.

We have also developed a fully functional web application using the Flask framework, which lets users upload food images and receive predicted recipes in real time through a user-friendly browser interface. The system demonstrates strong performance, achieving an average F1 score of approximately 0.82 for ingredient prediction, with precision and recall of 0.85 and 0.80, respectively. For instruction generation, we report BLEU-4 scores that are competitive with, and in some cases exceed, those of existing state-of-the-art models. Visual outputs for sample dishes, including cheeseburgers and Rajma-Rice, are included in the paper to illustrate the system's effectiveness; these examples validate its ability to identify the core ingredients and generate accurate, contextually relevant cooking instructions.

The research highlights the potential of combining computer vision and natural language processing for real-world culinary applications, opening the door to intelligent food assistants and automated cooking guidance systems.
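The abstract describes the ingredient decoder as a sigmoid-based multi-label classifier evaluated with set-based precision, recall, and F1. A minimal sketch in plain Python of that decoding and scoring step (the ingredient vocabulary, decoder logits, and 0.5 threshold below are illustrative assumptions, not values from the paper; only the aggregate precision of 0.85 and recall of 0.80 come from the abstract):

```python
import math

def sigmoid(x):
    # Logistic function: maps a raw decoder logit to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def predict_ingredients(logits, vocab, threshold=0.5):
    # Sigmoid-based multi-label decoding: each ingredient whose probability
    # clears the threshold is kept, independently of all the others.
    return {ing for ing, z in zip(vocab, logits) if sigmoid(z) >= threshold}

def precision_recall_f1(predicted, actual):
    # Set-based ingredient metrics of the kind reported in the abstract.
    tp = len(predicted & actual)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if tp else 0.0
    return precision, recall, f1

# Hypothetical vocabulary and decoder logits for a cheeseburger image.
vocab = ["bun", "beef patty", "cheese", "lettuce", "rajma", "rice"]
logits = [3.1, 2.4, 1.7, 0.2, -2.5, -3.0]

predicted = predict_ingredients(logits, vocab)
p, r, f1 = precision_recall_f1(predicted, {"bun", "beef patty", "cheese", "onion"})

# Consistency check on the paper's aggregate numbers: F1 is the harmonic
# mean of precision and recall, so 0.85 and 0.80 yield roughly 0.82.
paper_f1 = 2 * 0.85 * 0.80 / (0.85 + 0.80)
```

Running the sketch, `predicted` contains the four ingredients whose sigmoid probability exceeds 0.5, and `paper_f1` evaluates to about 0.824, matching the "approximately 0.82" figure quoted for ingredient prediction.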

Cite This Article

  • ISSN: 2349-6002


