Ensemble of Data Augmentation Techniques for Efficient 3 Augmentation in NLP

  • Unique Paper ID: 165489
  • Volume: 11
  • Issue: 1
  • PageNo: 2706-2734
  • Abstract:
  • In the last decade, NLP has advanced significantly through machine learning. In many machine learning scenarios, however, there is not enough data available to train a good classifier. Data augmentation (DA) can address this problem by applying transformations that artificially increase the amount of available training data. Because linguistic data is discrete, DA for text remains relatively underexplored despite its rapidly growing use. A major goal of DA techniques is to increase the diversity of the training data so that the model generalizes better to novel test data. This study uses the term "data augmentation" as a broad concept that encompasses techniques for transforming training data. While most text data augmentation research focuses on the long-term aim of developing end-to-end learning solutions, this study focuses on pragmatic, robust, scalable, and easy-to-implement data augmentation techniques comparable to those used in computer vision. Simple but successful data augmentation procedures have already been implemented in natural language processing; inspired by such efforts, we construct and compare ensemble data augmentation for NLP classification. We propose an ensemble of simple yet effective data augmentation techniques. Through experiments on various datasets from Kaggle, we show that ensembling augmentations can boost performance with any text embedding technique, particularly for small training sets. We conclude by carrying out experiments on classification datasets. Based on the results, we conclude that an effective DA approach based on ensembles of data augmentation can help practitioners choose a suitable augmentation technique in different settings.
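
The abstract does not say which individual augmentations make up the ensemble, so the following is only a minimal sketch of the general idea, assuming two common token-level operations (random swap and random deletion) as stand-ins. The function names `random_swap`, `random_deletion`, and `ensemble_augment`, the deletion probability `p`, and the sample sentence are all hypothetical and are not taken from the paper.

```python
import random


def random_swap(tokens, rng):
    """Return a copy of tokens with two randomly chosen positions swapped."""
    out = list(tokens)
    if len(out) < 2:
        return out
    i, j = rng.sample(range(len(out)), 2)
    out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, rng, p=0.2):
    """Drop each token with probability p (assumed value), keeping at least one."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def ensemble_augment(text, label, augmenters, rng):
    """Return the original example plus one label-preserving variant per augmenter."""
    tokens = text.split()
    examples = [(text, label)]
    for augment in augmenters:
        examples.append((" ".join(augment(tokens, rng)), label))
    return examples


# Hypothetical usage: one labelled sentence expanded by a two-member ensemble.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
text = "the movie was surprisingly good"
augmented = ensemble_augment(text, "positive", [random_swap, random_deletion], rng)
```

Each call returns the original example followed by one variant per augmenter, all carrying the original label, so a small training set grows by a factor of one plus the ensemble size. Whether these particular operations match the ones evaluated in the paper is an assumption.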

  • ISSN: 2349-6002
