Asian Journal of Information Technology

Year: 2016
Volume: 15
Issue: 19
Page No. 3770 - 3779

A Novel Entropy Based Algorithm to Remove Silence from Speech and Classifying the Residue as Voiced/unvoiced Regions

Authors: R. Johny Elton, P. Vasuki and J. Mohanalin

References

Arifianto, D., 2007. Dual parameters for voiced-unvoiced speech signal determination. Proceedings of the 2007 IEEE International Conference on Acoustics Speech and Signal Processing-ICASSP'07, April 15-20, 2007, IEEE, Honolulu, Hawaii, ISBN: 1-4244-0727-3, pp: 749-752.

Atal, B. and L. Rabiner, 1976. A pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition. IEEE. Trans. Acoust. Speech Signal Process., 24: 201-212.

Beenamol, M., S. Prabavathy and J. Mohanalin, 2012. Wavelet based seismic signal de-noising using Shannon and Tsallis entropy. Comput. Math. Applic., 64: 3580-3593.

Bosch, L.T., 2003. Emotions speech and the ASR framework. Speech Commun., 40: 213-225.

Childers, D.G. and C.K. Lee, 1991. Vocal quality factors: Analysis synthesis and perception. J. Acoustical Soc. Am., 90: 2394-2410.

Childers, D.G., M. Hahn and J.N. Larar, 1989. Silent and voiced-unvoiced-mixed excitation (four-way) classification of speech. IEEE. Trans. Acoust. Speech Signal Process., 37: 1771-1774.

D'Alessandro, C., V. Darsinos and B. Yegnanarayana, 1998. Effectiveness of a periodic and aperiodic decomposition method for analysis of voice sources. IEEE. Trans. Speech Audio Process., 6: 12-23.

De Krom, G., 1993. A cepstrum-based technique for determining a harmonics-to-noise ratio in speech signals. J. Speech Lang. Hearing Res., 36: 254-266.

Ercelebi, E., 2003. Second generation wavelet transform-based pitch period estimation and voiced-unvoiced decision for speech signals. Appl. Acoust., 64: 25-41.

Erkelens, J.S. and P.M. Broersen, 1998. LPC interpolation by approximation of the sample autocorrelation function. IEEE. Trans. Speech Audio Process., 6: 569-573.

Hariharan, M., C.Y. Fook, R. Sindhu, A.H. Adom and S. Yaacob, 2013. Objective evaluation of speech dysfluencies using wavelet packet transform with sample entropy. Digital Signal Process., 23: 952-959.

Hermes, D.J., 1991. Synthesis of breathy vowels: Some research methods. Speech Commun., 10: 497-502.

Holmberg, E.B., J.S. Perkell, R.E. Hillman and C. Gress, 1994. Individual variation in measures of voice. Phonetica, 51: 30-37.

Klatt, D.H. and L.C. Klatt, 1990. Analysis synthesis and perception of voice quality variations among female and male talkers. J. Acoust. Soc. Am., 87: 820-857.

Kosko, B., 1986. Fuzzy entropy and conditioning. Inf. Sci., 40: 165-174.

Krishnamoorthy, P. and S.M. Prasanna, 2011. Enhancement of noisy speech by temporal and spectral processing. Speech Commun., 53: 154-174.

Maier, A., T. Haderlein, U. Eysholdt, F. Rosanowski and A. Batliner et al., 2009. PEAKS-A system for the automatic evaluation of voice and speech disorders. Speech Commun., 51: 425-437.

Naylor, P.A., A. Kounoudes, J. Gudnason and M. Brookes, 2007. Estimation of glottal closure instants in voiced speech using the DYPSA algorithm. IEEE. Trans. Audio Speech Lang. Process., 15: 34-43.

Paulikas, S. and D. Navakauskas, 2005. Restoration of voiced speech signals preserving prosodic features. Speech Commun., 47: 457-468.

Pinto, N.B., D.G. Childers and A.L. Lalwani, 1989. Formant speech synthesis: Improving production quality. IEEE. Trans. Acoustics Speech Signal Process., 37: 1870-1887.

Qaimkhani, I.A. and E. Hossain, 2008. Efficient silence suppression and call admission control through contention-free medium access for VoIP in WiFi networks. IEEE. Commun. Mag., 46: 90-99.

Qi, Y. and B.R. Hunt, 1993. Voiced-unvoiced-silence classifications of speech using hybrid features and a network classifier. IEEE. Trans. Speech Audio Process., 1: 250-255.

Richman, J.S. and J.R. Moorman, 2000. Physiological time-series analysis using approximate entropy and sample entropy. Am. J. Physiol. Heart Circulatory Physiol., 278: H2039-H2049.

Rouat, J., Y.C. Liu and D. Morissette, 1997. A pitch determination and voiced-unvoiced decision algorithm for noisy speech. Speech Commun., 21: 191-207.

Shannon, C.E., 1948. A mathematical theory of communication. Bell Syst. Tech. J., 27: 379-423.

Siegel, L. and A. Bessey, 1982. Voiced-unvoiced-mixed excitation classification of speech. IEEE. Trans. Acoust. Speech Signal Process., 30: 451-460.

Strik, H. and C. Cucchiarini, 1999. Modeling pronunciation variation for ASR: A survey of the literature. Speech Commun., 29: 225-246.

Yegnanarayana, B. and K.S.R. Murty, 2009. Event-based instantaneous fundamental frequency estimation from speech signals. IEEE. Trans. Audio Speech Lang. Process., 17: 614-624.

Yegnanarayana, B., C. D'Alessandro and V. Darsinos, 1998. An iterative algorithm for decomposition of speech signals into periodic and aperiodic components. IEEE. Trans. Speech Audio Process., 6: 1-11.

Yin, B., E. Ambikairajah and F. Chen, 2009. Voiced-unvoiced pattern-based duration modeling for language identification. Proceedings of the 2009 IEEE International Conference on Acoustics Speech and Signal Processing, April 19-24, 2009, IEEE, Taipei, Taiwan, ISBN: 978-1-4244-2353-8, pp: 4341-4344.
