Semi-Supervised Classification of Climate Change & Population Related Tweets based on Hash Tags and Accounts annotation

  • Unique Paper ID: 156901
  • Volume: 9
  • Issue: 5
  • PageNo: 294-302
  • Abstract: Many researchers analyze climate-change-related tweets to predict whether climate change is presented as real or as a hoax, using sentiment analysis on a labeled dataset. Others annotate such tweets using predefined climate change denier hashtags or denier Twitter accounts. This paper illustrates the combined use of denier hashtags and predefined Twitter accounts that are primarily climate change deniers. Denier accounts and hashtags collected from various papers and articles are used together with wildcard searches on seed phrases related to climate change and population growth to annotate an unlabeled dataset extracted from Twitter. The annotation is fully automatic. The annotated subset is then used to train baseline supervised classification models in combination with two types of frequency-based word vectorizers, and the performance of each model is analyzed across different n-gram feature combinations. According to the analysis, the Linear Support Vector Machine algorithm with a word Count Vectorizer using a combination of unigrams and bigrams achieves the best performance on the annotated dataset used.
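The automatic annotation step described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the hashtag list, the account name, the seed patterns, and the function names `is_on_topic` and `annotate` are all hypothetical placeholders, and the rule that a tweet must first match a seed phrase before it can be labeled is an assumed reading of the abstract.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical seed lists for illustration; the paper's actual lists are not given here.
DENIER_HASHTAGS = {"#climatehoax", "#climatescam"}
DENIER_ACCOUNTS = {"example_denier_account"}
# Wildcard patterns over seed phrases ("*" matches any run of characters).
SEED_PATTERNS = [
    "*climate change*",
    "*global warming*",
    "*population growth*",
    "*overpopulat*",
]

HASHTAG_RE = re.compile(r"#\w+")


def is_on_topic(text):
    """True if the lowercased tweet text matches any wildcard seed pattern."""
    lowered = text.lower()
    return any(fnmatchcase(lowered, pattern) for pattern in SEED_PATTERNS)


def annotate(author, text):
    """Return 'denier' when a rule fires, or None to leave the tweet unlabeled.

    Assumption: a tweet is labeled only if it is on topic AND either comes from
    a listed account or carries a listed hashtag.
    """
    if not is_on_topic(text):
        return None
    hashtags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
    if author.lower() in DENIER_ACCOUNTS or hashtags & DENIER_HASHTAGS:
        return "denier"
    return None
```

Tweets for which `annotate` returns a label would form the annotated subset used for training; those returning `None` stay unlabeled. In a setup like the one the abstract describes, that subset would then be vectorized with unigram and bigram counts and passed to a linear classifier.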
