Asian Journal of Information Technology

Year: 2005
Volume: 4
Issue: 3
Page No. 34 - 40

An Effective Approach to the Evaluation and Construction of Training Corpus for Text Classification

Authors: Jihong Guan and Shuigeng Zhou

Abstract: Text classification is becoming increasingly important with the rapid growth of available online information. It has been observed that the quality of the training corpus affects the performance of the trained classifier. This paper proposes an approach to building high-quality training corpora for better classification performance: we first explore the properties of training corpora and then give an algorithm for constructing them semi-automatically. Preliminary experimental results validate our approach: classifiers trained on corpora constructed by our approach achieve good performance even though the training corpus's size is significantly reduced. Our approach can be used to build efficient and lightweight classification systems.

How to cite this article:

Jihong Guan and Shuigeng Zhou, 2005. An Effective Approach to the Evaluation and Construction of Training Corpus for Text Classification. Asian Journal of Information Technology, 4: 34-40.
