Journal of Engineering and Applied Sciences

Ontology Based Text Document Clustering for Sports
A. Sudha Ramkumar, B. Poorna and B. Saleena

Abstract: Text document clustering groups a set of documents based on the information they contain and supports the retrieval of results when a user browses the internet. Most text document clustering algorithms cluster documents based on terms and their frequency of occurrence and do not consider the meaning shared among terms; as a result, clustering performance decreases in terms of precision and recall. To overcome this problem, this research proposes ontology based text document clustering, in which sports-related documents are clustered semantically using a sports domain ontology. In this study, the sports domain ontology is used together with WordNet, a lexical database, to improve the quality of clustering. With the help of WordNet, the terms and their relevant terms are retrieved by a synonym retrieval algorithm. This study shows how applying these terms, along with their relevant terms, to the k-means clustering algorithm improves the performance of the clustering process. Experimental evidence shows that the ontology based clustering approach significantly improves clustering performance over the traditional k-means approach and k-means with a dimension reduction technique in terms of precision, recall and accuracy on the BBC dataset.
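
The pipeline outlined in the abstract (expand each document's terms with related terms, then cluster with k-means) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the SYNONYMS table is a hypothetical stand-in for a WordNet lookup, the function names expand, vectorize, kmeans and cluster are invented here, and the k-means uses plain term-frequency vectors with the first k documents as initial centroids.

```python
from collections import Counter

# Hypothetical stand-in for a WordNet synonym lookup. The abstract does not
# give the actual synonym retrieval algorithm, so this table is an assumption.
SYNONYMS = {
    "football": {"soccer"},
    "soccer": {"football"},
    "match": {"game"},
    "game": {"match"},
    "racket": {"racquet"},
    "racquet": {"racket"},
}


def expand(tokens):
    """Return the tokens followed by any synonyms found in SYNONYMS."""
    out = list(tokens)
    for t in tokens:
        out.extend(sorted(SYNONYMS.get(t, ())))
    return out


def vectorize(docs):
    """Turn token lists into term-frequency vectors over a shared vocabulary."""
    vocab = sorted({t for d in docs for t in d})
    vectors = []
    for d in docs:
        counts = Counter(d)
        vectors.append([float(counts[t]) for t in vocab])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=20):
    """Plain k-means; the first k vectors serve as initial centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster(raw_docs, k):
    """Tokenize on whitespace, expand with synonyms, then run k-means."""
    docs = [expand(d.lower().split()) for d in raw_docs]
    return kmeans(vectorize(docs), k)
```

In this sketch, a document mentioning "soccer" and one mentioning "football" end up sharing both terms after expansion, so their vectors move closer together before clustering. A real setup would replace the toy table with lookups against WordNet and a sports domain ontology.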

How to cite this article
A. Sudha Ramkumar, B. Poorna and B. Saleena, 2018. Ontology Based Text Document Clustering for Sports. Journal of Engineering and Applied Sciences, 13: 4073-4079.

© Medwell Journals. All Rights Reserved