An Improved Clustering Algorithm for Text Mining: Multi-Cluster Spherical K-Means
Citation
TUNALI, Volkan, Turgay BİLGİN & Ali Yılmaz ÇAMURCU. "An Improved Clustering Algorithm for Text Mining: Multi-Cluster Spherical K-Means". The International Arab Journal of Information Technology, 13.1 (2016): 12-19.
Abstract
Thanks to advances in information and communication technologies, there has been a marked increase in the amount of
information produced, particularly in the form of text documents. To deal effectively with this “information explosion”
problem and utilize the vast number of text databases, efficient and scalable tools and techniques are indispensable. In this
study, we address text clustering, one of the most important text mining techniques, which aims to extract useful information
by processing data in textual form. An improved variant of the spherical K-Means (SKM) algorithm, named multi-cluster
SKM, is developed for clustering high-dimensional document collections with high performance and efficiency. Experiments
on several document data sets show that the new algorithm provides a significant increase in
clustering quality without a considerable difference in CPU time usage compared to the SKM algorithm.
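
For orientation, the baseline SKM algorithm that the abstract compares against can be sketched roughly as follows. This is a minimal illustration of standard spherical K-Means only (unit-length document vectors, cosine similarity for assignment, re-normalized centroids), not the multi-cluster variant, whose details the abstract does not give. The function names, parameters, random seeding strategy, and toy vectors are assumptions made for illustration.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(x * y for x, y in zip(a, b))


def spherical_kmeans(docs, k, iterations=20, seed=0):
    """Baseline spherical K-Means on dense term-weight vectors.

    docs: list of equal-length lists of floats (for example TF-IDF weights).
    Returns (labels, centroids); centroids are kept at unit length.
    """
    rng = random.Random(seed)
    unit_docs = [normalize(d) for d in docs]
    # Assumption: centroids are seeded from k distinct, randomly chosen documents.
    centroids = [list(unit_docs[i]) for i in rng.sample(range(len(unit_docs)), k)]
    labels = [-1] * len(unit_docs)

    for _ in range(iterations):
        # Assignment step: pick the centroid with the highest cosine similarity.
        new_labels = [
            max(range(k), key=lambda j: dot(d, centroids[j])) for d in unit_docs
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids will not change either
        labels = new_labels

        # Update step: sum the member vectors and project back onto the unit sphere.
        for j in range(k):
            members = [d for d, lab in zip(unit_docs, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = normalize([sum(col) for col in zip(*members)])

    return labels, centroids


# Toy term-weight vectors (hypothetical): documents 0-1 share one vocabulary,
# documents 2-3 share another.
docs = [
    [3.0, 1.0, 0.0, 0.0],
    [2.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 5.0],
    [0.0, 0.0, 1.0, 2.0],
]
labels, centroids = spherical_kmeans(docs, k=2)
print(labels)
```

Because every vector is normalized, document length does not influence the assignment; only the direction of the term-weight vector matters, which is the usual motivation for using the spherical variant on text.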