Audio Classification of Cats and Dogs using Python

  • Unique Paper ID: 179134
  • PageNo: 5366-5371
  • Abstract: This paper presents a novel edge-optimized deep learning framework for real-time classification of cat and dog vocalizations, addressing key challenges in residential audio monitoring. Leveraging a MobileNetV2-inspired CNN architecture trained on MFCC features (Davis & Mermelstein, 1980), our solution achieves 94.32% accuracy (F1=0.93) while reducing model size by 43% compared to ResNet-18 baselines. The pipeline incorporates:
      • Robust preprocessing: Noise filtering + adaptive segmentation
      • Targeted augmentation: Time stretching (±20%) and pitch shifting (±2 semitones)
      • Edge deployment: <3s inference on Raspberry Pi (validated via stratified cross-validation)
    Outperforming SVM approaches by 12.7% (p<0.01), this work enables practical applications in smart pet care and veterinary acoustics. Future extensions will explore IoT integration and multi-species classification.
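
The abstract mentions stratified cross-validation but this page gives no details of how the folds were built. As a rough illustration only, the sketch below shows one common way to form stratified folds so that each held-out fold keeps the overall cat/dog ratio. The function name `stratified_folds`, the fold count, the seed, and the 60/40 toy label list are all assumptions made for this example and are not taken from the paper.

```python
from collections import defaultdict
import random


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds that each preserve the class ratio.

    Hypothetical helper for illustration; not the authors' code.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin so every fold gets a near-equal share per class.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Assumed toy dataset: 60 cat clips and 40 dog clips.
labels = ["cat"] * 60 + ["dog"] * 40
folds = stratified_folds(labels, k=5)

for fold_id, test_idx in enumerate(folds):
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    cats = sum(labels[i] == "cat" for i in test_idx)
    print(
        f"fold {fold_id}: train={len(train_idx)} "
        f"test={len(test_idx)} cat_share={cats / len(test_idx):.2f}"
    )
```

In practice a library routine such as scikit-learn's `StratifiedKFold` would usually replace this hand-rolled helper; the plain-Python version is shown only to make the idea visible without extra dependencies.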

Copyright & License

Copyright © 2026. Authors retain the copyright of this article. This article is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{179134,
        author = {Ridhi Rajput and Pallavi Goel and Navin Kumar Yadav and Pummy Kumari},
        title = {Audio Classification of Cats and Dogs using Python},
        journal = {International Journal of Innovative Research in Technology},
        year = {2025},
        volume = {11},
        number = {12},
        pages = {5366--5371},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=179134},
        abstract = {This paper presents a novel edge-optimized deep learning framework for real-time classification of cat and dog vocalizations, addressing key challenges in residential audio monitoring. Leveraging a MobileNetV2-inspired CNN architecture trained on MFCC features (Davis & Mermelstein, 1980), our solution achieves 94.32% accuracy (F1=0.93) while reducing model size by 43% compared to ResNet-18 baselines. The pipeline incorporates:
•	Robust preprocessing: Noise filtering + adaptive segmentation
•	Targeted augmentation: Time stretching (±20%) and pitch shifting (±2 semitones)
•	Edge deployment: <3s inference on Raspberry Pi (validated via stratified cross-validation)
Outperforming SVM approaches by 12.7% (p<0.01), this work enables practical applications in smart pet care and veterinary acoustics. Future extensions will explore IoT integration and multi-species classification.},
        keywords = {Audio Classification, Cat and Dog Sounds, Deep Learning, Spectrogram, Convolutional Neural Networks, Python, Librosa, TensorFlow},
        month = {May},
        }

Cite This Article

Rajput, R., Goel, P., Yadav, N. K., & Kumari, P. (2025). Audio Classification of Cats and Dogs using Python. International Journal of Innovative Research in Technology (IJIRT), 11(12), 5366–5371.
