Abstract: Many systematic approaches to document representation have been studied. One of them, the Phrase based Technique (PHT), represents a document by its phrases, aiming to capture the main phrases present in the document. A set of phrases is represented as an augmented transition network (ATN). One of the main problems in this approach is constructing an ATN that can capture all possible patterns. This study provides a framework for capturing most of the patterns used in the English language and proposes a way to represent documents automatically. Experiments performed on a small set of documents show that phrases are more effective than keywords as content indicators.
S. Srinivasan and P. Thambidurai, 2006. Phrase Based Approach for Document Representation. Asian Journal of Information Technology, 5: 61-64.