Asian Journal of Information Technology

Year: 2019
Volume: 18
Issue: 2
Page No. 49 - 56

Large Vocabulary Arabic Continuous Speech Recognition using Tied States Acoustic Models

Authors: Mona A. Azim, A. Aziz A. Hamid, Nagwa L. Badr and M.F. Tolba

References

Abdo, M.S., A.H. Kandil and S.A. Fawzy, 2014. MFC peak based segmentation for continuous Arabic audio signal. Proceedings of the Middle East Conference on Biomedical Engineering (MECBME), February 17-20, 2014, IEEE, Giza, Egypt, ISBN:978-1-4799-4799-7, pp: 224-227.

Almosallam, I., A. AlKhalifa, A.M. Ghamdi, M.I. Alkanhal and A. Alkhairy, 2013. SASSC: A standard Arabic single speaker corpus. Proceedings of the ISCA Conference on SSW Synthesis Workshop, August 31-September 1, 2013, ISCA, Barcelona, Spain, pp: 249-253.

Alotaibi, Y.A., 2008. Comparative study of ANN and HMM to Arabic digits recognition systems. J. King Abdulaziz Univ. Eng. Sci., 19: 43-59.

Azim, M.A., A.A.A. Hamid, N.L. Badr and M.F. Tolba, 2016. Tree-Based HMM state tying for Arabic continuous speech recognition. Proceedings of the International Conference on Advanced Intelligent Systems and Informatics, October 18, 2016, Springer, Berlin, Germany, ISBN:978-3-319-48307-8, pp: 96-103.

Azim, M.A., N.L. Badr and M.F. Tolba, 2016. An enhanced Arabic phonemes classification approach. Proceedings of the 10th International Conference on Informatics and Systems, May 9-11, 2016, ACM, Giza, Egypt, ISBN:978-1-4503-4062-5, pp: 210-214.

Azmi, M.M. and H. Tolba, 2008. Syllable-based automatic Arabic speech recognition in different conditions of noise. Proceedings of the 9th International Conference on Signal Processing ICSP08, October 26-29, 2008, IEEE, Egypt, ISBN:978-1-4244-2178-7, pp: 601-604.

Beulen, K. and H. Ney, 1998. Automatic question generation for decision tree based state tying. Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, Vol. 2, May 15, 1998, IEEE, Germany, ISBN:0-7803-4428-6, pp: 805-808.

Choubassi, M.M.E., E.H.E. Khoury, C.J. Alagha, J.A. Skaf and A.M.A. Alaoui, 2003. Arabic speech recognition using recurrent neural networks. Proceedings of the 3rd IEEE International Symposium on Signal Processing and Information Technology ISSPIT03, December 17, 2003, IEEE, Beirut, Lebanon, ISBN:0-7803-8292-7, pp: 543-547.

Elhadj, Y.O.M., M. Alghamdi and M. Alkanhal, 2014. Phoneme-Based Recognizer to Assist Reading the Holy Quran. In: Recent Advances in Intelligent Informatics, Thampi, S.M., A. Abraham, S.K. Pal and J.M.C. Rodriguez (Eds.). Springer, Berlin, Germany, ISBN:978-3-319-01777-8, pp: 141-152.

Elshafei, M., 1991. Toward an Arabic text-to-speech system. Arabian J. Sci. Eng., 16: 565-583.

Elshafei, M., H. Almuhtasib and M. Alghamdi, 2002. Techniques for high quality text-to-speech. Inf. Sci., 140: 255-267.

Fahad, A.H. and A. Otaibi, 2001. Speaker-dependant continuous Arabic speech recognition. MSc Thesis, King Saud University, Riyadh, Saudi Arabia.

Farghaly, A. and K. Shaalan, 2009. Arabic natural language processing: challenges and solutions. ACM Trans. Asian Language Inform. Process. Assoc. Comput. Mach., 8: 1-22.

Forney, G.D., 1973. The Viterbi algorithm. Proc. IEEE, 61: 268-278.

Habash, N.Y., 2010. Introduction to Arabic natural language processing. Synth. Lectures Hum. Lang. Technol., 3: 1-18.

Hyassat, H. and R.A. Zitar, 2006. Arabic speech recognition using SPHINX engine. Intl. J. Speech Technol., 9: 133-150.

Imperl, B., Z. Kacic, B. Horvat and A. Zgank, 2003. Clustering of triphones using phoneme similarity estimation for the definition of a multilingual set of triphones. Speech Commun., 39: 353-366.

Jafri, A., I. Sobh and A. Alkhairy, 2015. Statistical formant speech synthesis for Arabic. Arabian J. Sci. Eng., 40: 3151-3159.

Lazarides, A., Y. Normandin and R. Kuhn, 1996. Improving decision trees for acoustic modeling. Proceedings of the 4th International Conference on Spoken Language ICSLP 96, Vol. 2, October 3-6, 1996, IEEE, Quebec, Canada, ISBN:0-7803-3555-4, pp: 1053-1056.

Mourtaga, E., A. Sharieh and M. Abdallah, 2007. Speaker independent Quranic recognizer based on maximum likelihood linear regression. Perform. Improv., 316: 61-67.

Nahar, K., A.H. Muhtaseb, A.W. Khatib, M. Elshafei and M. Alghamdi, 2015. Arabic phonemes transcription using data driven approach. Int. Arab J. Inf. Technol., 12: 237-245.

Nahar, K.M., A.W.G. Khatib, M. Elshafei, A.H. Muhtaseb and M.M. Alghamdi, 2013. Data-driven Arabic phoneme recognition using varying number of HMM states. Proceedings of the 2013 1st International Conference on Communications, Signal Processing and their Applications (ICCSPA), February 12-14, 2013, IEEE, Dhahran, Saudi Arabia, ISBN:978-1-4673-2820-3, pp: 1-6.

Nofal, M., A.E. Raheem, E.H. Henawy and N.A. Kader, 2004. Acoustic training system for speaker independent continuous Arabic speech recognition system. Proceedings of the Fourth IEEE International Symposium on Signal Processing and Information Technology, December 18-21, 2004, IEEE, Cairo, Egypt, ISBN:0-7803-8689-2, pp: 200-203.

Odell, J.J., 1995. The use of context in large vocabulary speech recognition. Ph.D Thesis, Cambridge University, Cambridge, England.

Odell, J.J., P.C. Woodland and S.J. Young, 1994. Tree-based state clustering for large vocabulary speech recognition. Proceedings 1994 International Symposium on Speech, Image Processing and Neural Networks ISSIPNN'94, April 13-16, 1994, IEEE, Cambridge, England, ISBN:0-7803-1865-X, pp: 690-693.

Reichl, W. and W. Chou, 1998. Decision tree state tying based on segmental clustering for acoustic modeling. Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, Vol. 2, May 15, 1998, IEEE, New Jersey, USA, ISBN:0-7803-4428-6, pp: 801-804.

Satori, H., M. Harti and N. Chenfour, 2007. Arabic speech recognition system based on CMUSphinx. Proceedings of the International Symposium on Computational Intelligence and Intelligent Informatics ISCIII'07, March 28-30, 2007, IEEE, Morocco, ISBN:1-4244-1157-2, pp: 31-35.

Viterbi, A.J., 2006. A personal history of the Viterbi algorithm. IEEE Signal Process. Mag., 23: 120-142.

Young, S., G. Evermann, M. Gales, T. Hain and D. Kershaw et al., 2006. The HTK Book (v3.4). Cambridge University, Cambridge, England.

Young, S., P. Woodland, G. Evermann and M. Gales, 2013. The HTK Toolkit 3.4.1. Cambridge University, Cambridge, England.

Young, S.J. and S. Young, 1993. The HTK hidden Markov model toolkit: Design and philosophy. Ph.D Thesis, Department of Engineering, University of Cambridge, Cambridge, England.

Young, S.J., 1992. The general use of tying in phoneme-based HMM speech recognizers. Proceedings of the 1992 IEEE International Conference on Acoustics, Speech and Signal Processing ICASSP-92, Vol. 1, March 23-26, 1992, IEEE, Cambridge, Massachusetts, ISBN:0-7803-0532-9, pp: 569-572.

Zgank, A., B. Horvat and Z. Kacic, 2005. Data-driven generation of phonetic broad classes, based on phoneme confusion matrix similarity. Speech Commun., 47: 379-393.
