Journal of Engineering and Applied Sciences

Year: 2017
Volume: 12
Issue: 3
Page No. 468 - 474

Construction of Malay Abbreviation Corpus Based on Social Media Data

Authors: Nasiroh Omar, Ahmad Farhan Hamsani, Nur Atiqah Sia Abdullah and Siti Zaleha Zainal Abidin

References

Aho, A.V., 1980. Pattern Matching in Strings. In: Formal Language Theory: Perspectives and Open Problems. Academic Press Inc, New York, USA., pp: 325-347.

Aw, A., M. Zhang, J. Xiao and J. Su, 2006. A phrase-based statistical model for SMS text normalization. Proceedings of the International Conference on Poster Sessions COLING-ACL on Main, July 17-18, 2006, ACM, Stroudsburg, PA, USA., pp: 33-40.

Aw, A.T. and L.H. Lee, 2012. Personalized normalization for a multilingual chat system. Proceedings of the International Conference on ACL 2012 System Demonstrations, July 10, 2012, ACM, Stroudsburg, USA., pp: 31-36.

Bali, R.M., C.C. Chong and K.N. Pek, 2007. Identifying and classifying unknown words in Malay texts. Master Thesis, Universiti Sains Malaysia, George Town, Malaysia.

Basri, S.B., R. Alfred and C.K. On, 2012. Automatic spell checker for Malay blog. Proceedings of the 2012 IEEE International Conference on Control System, Computing and Engineering (ICCSCE), November 23-25, 2012, IEEE, Kota Kinabalu, Malaysia, ISBN:978-1-4673-3143-2, pp: 506-510.

Branckaute, F., 2010. Facebook Statistics: The Numbers Game Continues. Airbnb Inc., San Francisco, California.

Chen, T. and M.Y. Kan, 2013. Creating a live, public short message service corpus: The NUS SMS corpus. Lang. Resour. Eval., 47: 299-335.

Cook, P. and S. Stevenson, 2009. An unsupervised model for text message normalization. Proceedings of the Workshop on Computational Approaches to Linguistic Creativity, June 4, 2009, ACM, Stroudsburg, PA, USA., ISBN:978-1-932432-36-7, pp: 71-78.

Gadde, P., R. Goutam, R. Shah, H.S. Bayyarapu and L.V. Subramaniam, 2011. Experiments with artificially generated noise for cleansing noisy text. Proceedings of the 2011 Joint Workshop on Multilingual OCR and Analytics for Noisy Unstructured Text Data, September 17, 2011, ACM, New York, USA., ISBN:978-1-4503-0685-0, pp: 1-4.

Han, B., P. Cook and T. Baldwin, 2012. Automatically constructing a normalisation dictionary for microblogs. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, July 12-14, 2012, ACM, Stroudsburg, PA, USA., pp: 421-432.

Jehl, L.E., 2010. Machine translation for Twitter. MSc Thesis, University of Edinburgh, Edinburgh, Scotland.

Joseph, C., C. Muthusamy, A.S. Michael and D.S.D. Telajan, 2013. Strategies applied in SMS: An analysis on SMS column in The Star newspaper. Asian Soc. Sci., 9: 8-13.

Kayte, S.N. and M.A. Mundada, 2016. Corpus-driven Marathi text-to-speech system based on the concatenative synthesis approach. Int. J. Eng. Res. Gen. Sci., 4: 14-20.

Kobus, C., F. Yvon and G. Damnati, 2008. Normalizing SMS: Are two metaphors better than one? Proceedings of the 22nd International Conference on Computational Linguistics, August 18-22, 2008, ACM, Stroudsburg, PA, USA., pp: 441-448.

Musa, H., R.A. Kadir, A. Azman and M.T. Abdullah, 2011. Syllabification algorithm based on syllable rules matching for Malay language. Proceedings of the 10th WSEAS International Conference on Applied Computer and Applied Computational Science, March 8-10, 2011, ACM, Wisconsin, USA., ISBN:978-960-474-281-3, pp: 279-286.

MySQL, 2001. Developer zone. MySQL, Cupertino, California. http://dev.mysql.com/

Neubig, G., 2012. Unigram language models. Nara Institute of Science and Technology, Ikoma, Japan. http://www.phontron.com/slides/nlp-programming-en-01-unigramlm.pdf

Samsudin, N., M. Puteh, A.R. Hamdan and M.Z.A. Nazri, 2012. Normalization of common noisy terms in Malaysian online media. Proceedings of the Knowledge Management International Conference (KMICe), July 4-6, 2012, Universiti Utara Malaysia, Changlun, Malaysia, pp: 515-520.

Samsudin, N., M. Puteh, A.R. Hamdan and M.Z.A. Nazri, 2013. Mining opinion in online messages. Int. J. Adv. Comput. Sci. Appl., 4: 19-24.

Smith, T.M.F., 1976. The foundations of survey sampling: A review. J. Royal Statist. Soc. Ser. A, 139: 183-204.

Twitter, 2011. Twitter official blog. Twitter Inc, San Francisco, California. https://blog.twitter.com/2011/numbers

Wang, Y., Q. Min and S. Han, 2016. Understanding the effects of trust and risk on individual behavior toward social media platforms: A meta-analysis of the empirical evidence. Comput. Hum. Behav., 56: 34-44.

Zesch, T., C. Muller and I. Gurevych, 2008. Extracting Lexical Semantic Knowledge from Wikipedia and Wiktionary. Technische Universität Darmstadt, Darmstadt, Germany, pp: 1646-1652.
