This paper presents a new term selection scheme for emotion recognition from text. The proposed framework makes use of moderately frequent terms during term selection. More specifically, all terms are evaluated by their relevance scores, on the premise that moderately frequent terms may also carry valuable discriminative information. In numerous cases, the proposed feature selection scheme performs better than the conventional filter-based feature selection measures Chi-Square and Gini-Text. Documents are represented with the bag-of-words approach, in which each selected term is assigned a weight of 1 if it occurs in the document and 0 otherwise. The proposed scheme includes terms that are not selected by Chi-Square or Gini-Text. Experiments conducted on a benchmark dataset show that moderately frequent terms boost the representation power of the term subsets, yielding noticeable improvements in accuracy.
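The binary bag-of-words representation described above can be illustrated with a minimal sketch. It assumes a naive whitespace tokenizer and a hand-picked list of selected terms. The function names, terms, and documents are hypothetical and are not taken from the paper's experiments.

```python
from typing import List, Sequence


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def binary_vector(document: str, selected_terms: Sequence[str]) -> List[int]:
    """Assign weight 1 to each selected term that occurs in the document, else 0."""
    present = set(tokenize(document))
    return [1 if term in present else 0 for term in selected_terms]


# Hypothetical selected terms and documents, for illustration only.
selected_terms = ["happy", "afraid", "angry", "tears"]
documents = ["I am so happy today", "He was angry and afraid"]

# One binary vector per document, with one position per selected term.
vectors = [binary_vector(doc, selected_terms) for doc in documents]
```

Each vector has one entry per selected term, so widening the selected set (for example, with moderately frequent terms) lengthens the vectors without altering the weighting rule.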
CC BY 4.0