Journal of Engineering and Applied Sciences

Year: 2018
Volume: 13
Issue: 16
Page No. 6680 - 6685

Very Deep Convolutional Neural Network for Speech Recognition Based on Words

Authors: Javier O. Pinzon, Robinson Jimenez-Moreno, Oscar Aviles, Paola Nino and Diana Ovalle

References

Abdel-Hamid, O., A.R. Mohamed, H. Jiang and G. Penn, 2012. Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition. Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 25-30, 2012, IEEE, Kyoto, Japan, ISBN:978-1-4673-0045-2, pp: 4277-4280.

Abdel-Hamid, O., A.R. Mohamed, H. Jiang, L. Deng and G. Penn et al., 2014. Convolutional neural networks for speech recognition. IEEE. ACM. Trans. Audio Speech Lang. Process., 22: 1533-1545.

Deng, L. and X. Li, 2013. Machine learning paradigms for speech recognition: An overview. IEEE. Trans. Audio, Speech Lang. Process., 21: 1060-1089.

Deng, L., G. Hinton and B. Kingsbury, 2013. New types of deep neural network learning for speech recognition and related applications: An overview. Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 26-31, 2013, IEEE, Vancouver, British Columbia, Canada, ISBN:978-1-4799-0356-6, pp: 8599-8603.

Girshick, R., J. Donahue, T. Darrell and J. Malik, 2014. Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, June 23-28, 2014, ACM, Washington, DC, USA., ISBN:978-1-4799-5118-5, pp: 580-587.

Gupta, H.R. and R. Mehra, 2013. Power spectrum estimation using Welch method for various window techniques. Intl. J. Sci. Res. Eng. Technol. IJSRET., 2: 389-392.

Hinton, G., L. Deng, D. Yu, G.E. Dahl and A.R. Mohamed et al., 2012. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE. Signal Process. Mag., 29: 82-97.

Hsu, W.N., Y. Zhang, A. Lee and J.R. Glass, 2016. Exploiting depth and highway connections in convolutional recurrent deep neural networks for speech recognition. Proceedings of the International Conference on Interspeech, September 8-12, 2016, University of California, San Francisco, California, USA., pp: 395-399.

Krizhevsky, A., I. Sutskever and G.E. Hinton, 2012. Imagenet Classification with Deep Convolutional Neural Networks. In: Advances in Neural Information Processing Systems, Leen, T.K., G.D. Thomas and T. Volker (Eds.). MIT Press, Cambridge, Massachusetts, USA., ISBN:0-262-12241-3, pp: 1097-1105.

LeCun, Y., L. Bottou, Y. Bengio and P. Haffner, 1998. Gradient-based learning applied to document recognition. Proc. IEEE, 86: 2278-2324.

Mohamed, A.R., G.E. Dahl and G. Hinton, 2012. Acoustic modeling using deep belief networks. IEEE. Trans. Audio Speech Lang. Process., 20: 14-22.

Ondruska, P., J. Dequaire, D.Z. Wang and I. Posner, 2016. End-to-end tracking and semantic segmentation using recurrent neural networks. arXiv preprint, Cornell University Library, Ithaca, New York, USA.

Orozco, I., M.E. Buemi and J.J. Berlles, 2016. A study on pedestrian detection using a deep convolutional neural network. Proceedings of the International Conference on Pattern Recognition Systems (ICPRS-16), April 20-22, 2016, IET, Talca, Chile, ISBN:978-1-78561-283-1, pp: 1-15.

Qian, Y. and P.C. Woodland, 2016. Very deep convolutional neural networks for robust speech recognition. Proceedings of the 2016 IEEE International Workshop on Spoken Language Technology (SLT), December 13-16, 2016, IEEE, San Diego, California, USA., ISBN:978-1-5090-4903-5, pp: 481-488.

Sainath, T.N., B. Kingsbury, G. Saon, H. Soltau and A.R. Mohamed et al., 2015. Deep convolutional neural networks for large-scale speech tasks. Neural Networks, 64: 39-48.

Schmidhuber, J., 2015. Deep learning in neural networks: An overview. Neural Networks, 61: 85-117.

Seide, F., G. Li and D. Yu, 2011. Conversational speech transcription using context-dependent deep neural networks. Proceedings of the 12th Annual International Conference on International Speech Communication Association, August 28-31, 2011, ISCA, Florence, Italy, pp: 437-440.

Sercu, T., C. Puhrsch, B. Kingsbury and Y. LeCun, 2016. Very deep multilingual convolutional neural networks for LVCSR. Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 20-25, 2016, IEEE, Shanghai, China, ISBN:978-1-4799-9988-0, pp: 4955-4959.

Simonyan, K. and A. Zisserman, 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, Cornell University Library, Ithaca, New York, USA.

Waibel, A., T. Hanazawa, G. Hinton and K. Shikano, 1989. Phoneme recognition using time-delay neural networks. IEEE Trans. ASSP, 37: 328-339.

Yoshioka, T., K. Ohnishi, F. Fang and T. Nakatani, 2016. Noise robust speech recognition using recent developments in neural networks for computer vision. Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 20-25, 2016, IEEE, Shanghai, China, ISBN:978-1-4799-9988-0, pp: 5730-5734.
