Journal of Engineering and Applied Sciences

Year: 2018

Volume: 13

Issue: 13

Page No. 5096 - 5104

DOI: 10.36478/jeasci.2018.5096.5104

Deep Residual Network for Sound Source Localization in the Time Domain

Authors: Dmitry Suvorov, Ge Dong and Roman Zhukov

References

Aleinik, S., 2017. Acceleration of Zelinski post-filtering calculation. J. Signal Process. Syst., 88: 463-468.

Chung, J., C. Gulcehre, K. Cho and Y. Bengio, 2014. Empirical evaluation of gated recurrent neural networks on sequence modeling. Proceedings of the NIPS Workshop on Deep Learning, December 8-13, 2014, NIPS, Montreal, Quebec, Canada, pp: 1-9.

Eren, L., 2017. Bearing fault detection by one-dimensional convolutional neural networks. Math. Prob. Eng., 2017: 1-9.

Grondin, F. and F. Michaud, 2015. Time difference of arrival estimation based on binary frequency mask for sound source localization on mobile robots. Proceedings of the IEEE-RSJ International Conference on Intelligent Robots and Systems (IROS), September 28-October 2, 2015, IEEE, Hamburg, Germany, ISBN:978-1-4799-9994-1, pp: 6149-6154.

He, K., X. Zhang, S. Ren and J. Sun, 2016. Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, June 26-July 1, 2016, IEEE, Las Vegas, Nevada, USA., ISBN:9781509014385, pp: 770-778.

Ioffe, S. and C. Szegedy, 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. Proceedings of the 32nd International Conference on Machine Learning, July 07-09, 2015, Microtome Publishing, Lille, France, pp: 448-456.

Ishi, C.T., O. Chatot, H. Ishiguro and N. Hagita, 2009. Evaluation of a MUSIC-based real-time sound localization of multiple sound sources in real noisy environments. Proceedings of the IEEE-RSJ International Conference on Intelligent Robots and Systems IROS, October 10-15, 2009, IEEE, St. Louis, Missouri, USA., ISBN:978-1-4244-3803-7, pp: 2027-2032.

Kingma, D. and J. Ba, 2015. Adam: A method for stochastic optimization. Proceedings of the International Conference on Learning Representations ICLR, May 7-9, 2015, San Diego, California, USA., pp: 1-15.

Kumatani, K., J. McDonough and B. Raj, 2012. Microphone array processing for distant speech recognition: From close-talking microphones to far-field sensors. IEEE Signal Process. Mag., 29: 127-140.

Maas, A., A. Hannun and A. Ng, 2013. Rectifier nonlinearities improve neural network acoustic models. Proc. ICML., 30: 1-6.

Maaten, L.V.D. and G. Hinton, 2008. Visualizing data using t-SNE. J. Machine Learn. Res., 9: 2579-2605.

McGregor, S., 2007. Neural Network Processing for Multiset Data. In: Artificial Neural Networks, Sa, J.M.D., L.A. Alexandre, W. Duch and D. Mandic (Eds.). Springer, Berlin, Germany, ISBN:978-3-540-74689-8, pp: 460-470.

Ronzhin, A. and A. Karpov, 2008. [Comparison of methods for localization of multimodal system user by his speech (In Russian)]. J. Instrum. Eng., 51: 41-47.

Srivastava, N., G. Hinton, A. Krizhevsky, I. Sutskever and R. Salakhutdinov, 2014. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res., 15: 1929-1958.

Suvorov, D. and R. Zhukov, 2017. Device for synchronous data capturing from the array of MEMS microphones with PDM interface (In Russian). IFI CLAIMS Patent Services Company, Madison, Connecticut.

Tashev, I. and A. Acero, 2006. Microphone array post-processor using instantaneous direction of arrival. Proceedings of the International Workshop on Acoustic, Echo and Noise Control, September 12-14, 2006, IWAENC, Paris, France, pp: 1-4.

Tashev, I., 2009. Sound Capture and Processing: Practical Approaches. John Wiley & Sons, New York, USA., ISBN:9780470319833, Pages: 388.

Tzanetakis, G. and P. Cook, 2002. Musical genre classification of audio signals. IEEE Trans. Speech Audio Process., 10: 293-302.

Valin, J.M., F. Michaud and J. Rouat, 2007. Robust localization and tracking of simultaneous moving sound sources using beam forming and particle filtering. Rob. Auton. Syst., 55: 216-228.

Vary, P. and R. Martin, 2006. Digital Speech Transmission: Enhancement, Coding and Error Concealment. John Wiley & Sons, New York, USA., ISBN-13:978-0-471-56018-9, Pages: 607.

Woelfel, M. and J. McDonough, 2009. Distant Speech Recognition. John Wiley & Sons, New York, USA., ISBN:9780470517048, Pages: 594.

Yalta, N., K. Nakadai and T. Ogata, 2017. Sound source localization using deep learning models. J. Rob. Mech., 29: 37-48.
