Abstract:
“Sentiment Analysis” (classifying social media posts or customer reviews as positive or negative in polarity) is chosen as the main topic of this project because
- it is a text-based classification task in Natural Language Processing (NLP);
- it can be analysed with Machine Learning classifiers such as the Support Vector Machine (a linear classifier) and Naive Bayes (a probabilistic classifier);
- moreover, it can be explored with a pre-trained state-of-the-art NLP model (BERT) through Transfer Learning.
Our Sentiment Analysis results show that simple TF-IDF features with Naive Bayes (80%) outperform the state-of-the-art BERT (75%). This suggests that, depending on the specific NLP task, contextualized embeddings are not always required to capture simple polarity.
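
For illustration, the TF-IDF baseline described above can be sketched in a few lines. This is a minimal sketch rather than the project's actual code: it assumes scikit-learn, and the four training sentences, the label strings and the variable names are hypothetical placeholders, not the project's dataset or settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data; the real project would load social media posts or reviews.
train_texts = [
    "great product, love it",
    "excellent service and friendly staff",
    "terrible quality, hate it",
    "awful service and rude staff",
]
train_labels = ["positive", "positive", "negative", "negative"]

# TF-IDF features feeding a probabilistic classifier (Naive Bayes).
nb_model = make_pipeline(TfidfVectorizer(), MultinomialNB())
nb_model.fit(train_texts, train_labels)

# The same features feeding a linear classifier (Support Vector Machine).
svm_model = make_pipeline(TfidfVectorizer(), LinearSVC())
svm_model.fit(train_texts, train_labels)

test_texts = ["love the excellent product", "awful, terrible quality"]
nb_predictions = list(nb_model.predict(test_texts))
svm_predictions = list(svm_model.predict(test_texts))
print("Naive Bayes:", nb_predictions)
print("Linear SVM:", svm_predictions)
```

Both pipelines treat each text as a bag of weighted words and ignore word order and context, which is the kind of information that BERT's contextualized embeddings add.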