Asian Journal of Information Technology
Year: 2016
Volume: 15
Issue: 14
Page No.: 2355-2366
Experimental Investigation for Text Categorization Based on Hybrid Approach Using Feature Selection and Classification Techniques
Authors: K. Sridharan and M. Chitra
References
Ahlqvist, O., 2008. Extending post-classification change detection using semantic similarity metrics to overcome class heterogeneity: A study of 1992 and 2001 US national land cover database changes. Remote Sens. Environ., 112: 1226-1241.
Amari, S. and S. Wu, 1999. Improving support vector machine classifiers by modifying kernel functions. Neural Networks, 12: 783-792.
Belkin, M. and P. Niyogi, 2003. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput., 15: 1373-1396.
Berry, M.W., S.T. Dumais and G.W. O'Brien, 1995. Using linear algebra for intelligent information retrieval. SIAM. Rev., 37: 573-595.
Bizer, C., J. Lehmann, G. Kobilarov, S. Auer, C. Becker, R. Cyganiak and S. Hellmann, 2009. DBpedia-A crystallization point for the web of data. Web Semant.: Sci. Serv. Agents World Wide Web, 7: 154-165.
Byvatov, E., U. Fechner, J. Sadowski and G. Schneider, 2003. Comparison of support vector machine and artificial neural network systems for drug nondrug classification. J. Chem. Inf. Comput. Sci., 43: 1882-1889.
Caldas, C.H. and L. Soibelman, 2003. Automating hierarchical document classification for construction management information systems. Autom. Constr., 12: 395-406.
Cohen, A.M. and W.R. Hersh, 2005. A survey of current work in biomedical text mining. Briefings Bioinf., 6: 57-71.
Damljanovic, D., M. Agatonovic and H. Cunningham, 2010. Combining Syntactic Analysis and Ontology-Based Lookup Through the User Interaction. In: The Semantic Web: Research and Applications. Lora, A., G. Antoniou, E. Hyvonen, A.T. Teije and S. Heiner et al. (Eds.). Springer Berlin Heidelberg, Berlin, Germany, ISBN: 978-3-642-13485-2, pp: 106-120.
Dhillon, I.S. and D.S. Modha, 2001. Concept decompositions for large sparse text data using clustering. Mach. Learn., 42: 143-175.
Ensel, C. and A. Keller, 2012. An approach for managing service dependencies with XML and the resource description framework. J. Netw. Syst. Manag., 10: 147-170.
Fensel, D., I. Horrocks, V.F. Harmelen, D. McGuinness and P.P.F. Schneider, 2001. OIL: Ontology infrastructure to enable the Semantic Web. IEEE. Intell. Syst., 16: 38-45.
Frean, M., 1990. The upstart algorithm: A method for constructing and training feedforward neural networks. Neural Comput., 2: 198-209.
Gauch, S., J. Chaffee and A. Pretschner, 2003. Ontology-based personalized search and browsing. Web Intel. Agent Syst. Int. J., 1: 219-234.
Horrocks, I., P.F. Patel-Schneider and F.V. Harmelen, 2003. From SHIQ and RDF to OWL: The making of a web ontology language. J. Web Semantics, 1: 7-26.
Ishibuchi, H., K. Nozaki, N. Yamamoto and H. Tanaka, 1994. Construction of fuzzy classification systems with rectangular fuzzy rules using genetic algorithms. Fuzzy Sets Syst., 65: 237-253.
Jones, M.V., N. Coviello and Y.K. Tang, 2011. International entrepreneurship research (1989-2009): A domain ontology and thematic analysis. J. Bus. Venturing, 26: 632-659.
Kalousis, A., J. Prados and M. Hilario, 2007. Stability of feature selection algorithms: A study on high-dimensional spaces. Knowl. Inf. Syst., 12: 95-116.
Kambhatla, N. and T.K. Leen, 1997. Dimension reduction by local principal component analysis. Neural Comput., 9: 1493-1516.
Kohonen, T., S. Kaski, K. Lagus, J. Salojarvi and J. Honkela et al., 2000. Self organization of a massive document collection. IEEE. Trans. Neural Netw., 11: 574-585.
Krallinger, M., A. Valencia and L. Hirschman, 2008. Linking genes to literature: Text mining, information extraction and retrieval applications for biology. Genome Biol., 9: 1-14.
Liu, L., J. Kang, J. Yu and Z. Wang, 2005. A comparative study on unsupervised feature selection methods for text clustering. Proceedings of the 2005 International Conference on Natural Language Processing and Knowledge Engineering, October 30-November 1, 2005, IEEE, Beijing, China, ISBN: 0-7803-9361-9, pp: 597-601.
Liu, Y., H.T. Loh and A. Sun, 2009. Imbalanced text classification: A term weighting approach. Expert Syst. Appl., 36: 690-701.
Mahgoub, H., D. Rosner, N. Ismail and F. Torkey, 2008. A text mining technique using association rules extraction. Int. J. Comput. Intell., 4: 21-28.
Marinai, S., M. Gori and G. Soda, 2005. Artificial neural networks for document analysis and recognition. IEEE Trans. Pattern Anal. Mach. Intell., 27: 23-35.
Matsuo, Y. and M. Ishizuka, 2004. Keyword extraction from a single document using word co-occurrence statistical information. Int. J. Artif. Intell. Tools, 13: 157-169.
Mittermayer, M.A., 2004. Forecasting intraday stock price trends with text mining techniques. Proceedings of the 37th Annual Hawaii International Conference on System Sciences, January 5-8, 2004, IEEE, Bern, Switzerland, ISBN: 0-7695-2056-1, pp: 1-10.
Pal, S.K., V. Talwar and P. Mitra, 2002. Web mining in soft computing framework: Relevance, state of the art and future directions. IEEE. Trans. Neural Netw., 13: 1163-1177.
Peng, Y., G. Kou, Y. Shi and Z. Chen, 2008. A descriptive framework for the field of data mining and knowledge discovery. Int. J. Inf. Technol. Decis. Making, 7: 639-682.
Pop, I., 2006. An approach of the Naive Bayes classifier for the document classification. Gen. Math., 14: 135-138.
Pulido, J.R.G., M.A.G. Ruiz, R. Herrera, E. Cabello and S. Legrand et al., 2006. Ontology languages for the semantic web: A never completely updated review. Knowl. Based Syst., 19: 489-497.
Rahm, E. and H.H. Do, 2000. Data cleaning: Problems and current approaches. IEEE Data Eng. Bull., 23: 1-11.
Ren, J., S.D. Lee, X. Chen, B. Kao and R. Cheng et al., 2009. Naive Bayes classification of uncertain data. Proceedings of the 2009 9th IEEE International Conference on Data Mining, December 6-9, 2009, IEEE, Hong Kong, China, ISBN: 978-1-4244-5242-2, pp: 944-949.
Simon, J., D.M. Santos, J. Fielding and B. Smith, 2006. Formal ontology for natural language processing and the integration of biomedical databases. Int. J. Med. Inf., 75: 224-231.
Sokolova, M. and G. Lapalme, 2009. A systematic analysis of performance measures for classification tasks. Inform. Process. Manage., 45: 427-437.
Soon, W.M., H.T. Ng and D.C.Y. Lim, 2001. A machine learning approach to coreference resolution of noun phrases. Comput. Ling., 27: 521-544.
Tan, C.M., Y.F. Wang and C.D. Lee, 2002. The use of bigrams to enhance text categorization. Inf. Process. Manage., 38: 529-546.
Yeh, J.Y., H.R. Ke, W.P. Yang and I.H. Meng, 2005. Text summarization using a trainable summarizer and latent semantic analysis. Inform. Process. Manage., 41: 75-95.
Zhang, M.L., J.M. Pena and V. Robles, 2009. Feature selection for multi-label naive Bayes classification. Inform. Sci., 179: 3218-3229.
Zhang, W., T. Yoshida and X. Tang, 2011. A comparative study of TF*IDF, LSI and multi-words for text classification. Expert Syst. Applic., 38: 2758-2765.
Zhang, W., T. Yoshida and X. Tang, 2008. Text classification based on multi-word with support vector machine. Knowledge-Based Syst., 21: 879-886.
Zhao, Y., 2012. R and Data Mining: Examples and Case Studies. Academic Press, USA., ISBN: 978-0-123-96963-7, Pages: 233.