Hybrid Fusion-Based Machine Learning Framework for Enhanced Multimodal Sentiment Analysis

  • Unique Paper ID: 179628
  • Pages: 8716–8722
  • Abstract:
  • Computational sentiment analysis seeks to equip machines with the capacity to recognise and interpret human affect. Within the acoustic modality, algorithms disentangle prosody, pitch and rhythm to infer the speaker’s emotional state. In text, polarity detection and pragmatic cues are used to detect positivity, negativity or neutrality—even in the presence of sarcasm and metaphor. Vision-based approaches extend this capability to images, decoding facial micro-expressions, contextual scenes and object configurations. When these three streams are fused, cross-modal sentiment analysis emerges, allowing complementary cues to be modelled jointly. Such multimodal systems now underpin social media opinion mining, customer experience dashboards and adaptive content delivery platforms, offering richer and more reliable insight than any single modality alone.
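The fusion of text, acoustic and visual streams described above can be sketched as a simple late-fusion step. This is an illustrative sketch only, not the paper's hybrid framework: the modality weights, the three-class label set and the `fuse_sentiment` helper are all assumptions introduced here. Each modality model is assumed to emit a probability distribution over (negative, neutral, positive), and the fused prediction is a confidence-weighted average of those distributions.

```python
def fuse_sentiment(text_p, audio_p, visual_p, weights=(0.5, 0.25, 0.25)):
    """Weighted late fusion of per-modality class probabilities.

    Each argument is a 3-tuple of probabilities over
    (negative, neutral, positive); weights are per-modality.
    """
    fused = [
        sum(w * p[i] for w, p in zip(weights, (text_p, audio_p, visual_p)))
        for i in range(3)
    ]
    total = sum(fused)  # renormalise so the result is a distribution
    return [f / total for f in fused]

labels = ("negative", "neutral", "positive")
fused = fuse_sentiment(
    text_p=(0.10, 0.20, 0.70),    # text model: strongly positive
    audio_p=(0.30, 0.40, 0.30),   # prosody: ambiguous
    visual_p=(0.05, 0.15, 0.80),  # facial cues: positive
)
print(labels[max(range(3), key=fused.__getitem__)])  # → positive
```

In practice the weights would be learned or tuned on validation data, and hybrid schemes also combine intermediate features rather than only final scores; the sketch shows why a confident visual or textual cue can override an ambiguous acoustic one.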

Copyright & License

Copyright © 2026. The authors retain the copyright of this article. This is an open-access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{179628,
        author = {M. V. Jagannatha Reddy},
        title = {Hybrid Fusion-Based Machine Learning Framework for Enhanced Multimodal Sentiment Analysis},
        journal = {International Journal of Innovative Research in Technology},
        year = {2025},
        volume = {11},
        number = {12},
        pages = {8716--8722},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=179628},
        abstract = {Computational sentiment analysis seeks to equip machines with the capacity to recognise and interpret human affect. Within the acoustic modality, algorithms disentangle prosody, pitch and rhythm to infer the speaker’s emotional state. In text, polarity detection and pragmatic cues are used to detect positivity, negativity or neutrality—even in the presence of sarcasm and metaphor. Vision-based approaches extend this capability to images, decoding facial micro-expressions, contextual scenes and object configurations. When these three streams are fused, cross-modal sentiment analysis emerges, allowing complementary cues to be modelled jointly. Such multimodal systems now underpin social media opinion mining, customer experience dashboards and adaptive content delivery platforms, offering richer and more reliable insight than any single modality alone.},
        keywords = {Computational sentiment analysis, Acoustic emotion recognition, Polarity detection, Visual emotion recognition, Facial micro-expressions, Multimodal fusion, Social media opinion mining},
        month = {May}
}

Cite This Article

Reddy, M. V. J. (2025). Hybrid Fusion-Based Machine Learning Framework for Enhanced Multimodal Sentiment Analysis. International Journal of Innovative Research in Technology (IJIRT), 11(12), 8716–8722.
