MS-TR: A Morphologically Enriched Sentiment Treebank and Recursive Deep Models for Compositional Semantics in Turkish
Citation
ZEYBEK, Sultan, Ebubekir KOÇ & Aydın SEÇER. "MS-TR: A Morphologically Enriched Sentiment Treebank and Recursive Deep Models for Compositional Semantics in Turkish". Cogent Engineering, 8.1 (2021): 1-27.
Abstract
Recursive Deep Models have been used as powerful models to learn
compositional representations of text for many natural language processing tasks.
However, they require structured input (i.e., a sentiment treebank) that encodes sentences
as trees, so that the models can learn the latent semantics
of words using recursive composition functions. In this paper, we present our
contributions and efforts for the Turkish Sentiment Treebank construction. We
introduce MS-TR, a Morphologically Enriched Sentiment Treebank, which was
implemented for training Recursive Deep Models to address compositional sentiment
analysis for Turkish, which is one of the well-known Morphologically Rich
Languages (MRLs). We propose a semi-supervised automatic annotation method, as a distant-supervision
approach, using morphological features of words to infer the polarity of
the inner nodes of MS-TR as positive or negative. The proposed annotation model
has four different annotation levels: morph-level, stem-level, token-level, and
review-level. Each annotation level’s contribution was tested using three different
domain datasets, including product reviews, movie reviews, and the Turkish Natural
Corpus essays. Comparative results were obtained with the Recursive Neural Tensor Network (RNTN) model, operating over MS-TR, and with conventional machine learning methods. Experiments showed that RNTN outperformed the baseline methods with much higher accuracy; the baselines cannot accurately capture the aggregated sentiment information.