Abstract: The aim of this study is to construct a system capable of reading a collection of standard text file documents, performing semantic analysis on them, and generating a similarity matrix that can be used by search engines, text corpus visualizations and a variety of other applications for filtering, sorting, clustering, retrieving and generally handling text. The methodologies used include statistical methods such as the Vector Space Model for document representation, latent semantic indexing with singular value decomposition for dimension reduction, and a fuzzy measure for computing the similarity between documents. The system could be implemented in Visual Basic integrated with MATLAB.
K. Vivekanandan and J. Suguna, 2008. Inferring Document Similarity using the Fuzzy Measure. Asian Journal of Information Technology, 7: 1-5.