10.1515/aoa-2015-0004
Phase Autocorrelation Bark Wavelet Transform (PACWT) Features for Robust Speech Recognition
References
Boll S. (1979), Suppression of acoustic noise in speech using spectral subtraction, IEEE Transactions on Acoustics, Speech and Signal Processing, 27, 2, 113-120.
Chang C. C., Lin C. J. (2011), LIBSVM: a library for support vector machines, ACM Transactions on Intelligent Systems and Technology (TIST), 2, 3, 27.
Davis S., Mermelstein P. (1980), Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences, IEEE Transactions on Acoustics, Speech and Signal Processing, 28, 4, 357-366.
Ikbal S., Misra H., Hermansky H., Magimai-Doss M. (2012), Phase AutoCorrelation (PAC) features for noise robust speech recognition, Speech Communication, 54, 7, 867-880.
Jie Y., Zhenli W. (2009), On the application of variable-step adaptive noise cancelling for improving the robustness of speech recognition, Computing, Communication, Control, and Management, 2009. CCCM 2009. ISECS International Colloquium on, IEEE.
Jolliffe I. (2005), Principal component analysis, Wiley Online Library.
Leonard R. (1984), A database for speaker-independent digit recognition, IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP'84, IEEE.
Liu F. H., Stern R. M., Huang X., Acero A. (1993), Efficient cepstral normalization for robust speech recognition, Proceedings of the workshop on Human Language Technology, Association for Computational Linguistics.
Majeed S., Husain H., Samad S., Hussain A. (2012), Hierarchical K-Means Algorithm Applied On Isolated Malay Digit Speech Recognition, International Proceedings of Computer Science & Information Technology, 34, 33-37.
Mansour D., Juang B. H. (1989), A family of distortion measures based upon projection operation for robust speech recognition, IEEE Transactions on Acoustics, Speech and Signal Processing, 37, 11, 1659-1671.
Nasersharif B., Akbari A. (2007), SNR-dependent compression of enhanced Mel sub-band energies for compensation of noise effects on MFCC features, Pattern Recognition Letters, 28, 11, 1320-1326.
Nehe N. S., Holambe R. S. (2009), Isolated Word Recognition Using Normalized Teager Energy Cepstral Features, International Conference on Advances in Computing, Control, & Telecommunication Technologies. ACT '09.
Paliwal K., Basu A. (1987), A speech enhancement method based on Kalman filtering, IEEE International Conference on Acoustics, Speech, and Signal Processing, IEEE ICASSP'87.
Rabiner L., Juang B. H. (1993), Fundamentals of speech recognition, PTR Prentice-Hall, Inc, Englewood Cliffs, New Jersey, USA.
Sambur M. (1978), Adaptive noise canceling for speech signals, IEEE Transactions on Acoustics, Speech and Signal Processing, 26, 5, 419-423.
Shannon B. J., Paliwal K. K. (2006), Feature extraction from higher-lag autocorrelation coefficients for robust speech recognition, Speech Communication, 48, 11, 1458-1485.
Traunmüller H. (1990), Analytical expressions for the tonotopic sensory scale, The Journal of the Acoustical Society of America, 88, 1, 97-100.
Tufekci Z., Gowdy J. (2000), Feature extraction using discrete wavelet transform for speech recognition, Proceedings of the IEEE, Southeastcon 2000.
Vaseghi S. V. (2008), Advanced digital signal processing and noise reduction, Wiley.
Yapanel U., Hansen J. H., Sarikaya R., Pellom B. (2001), Robust digit recognition in noise: an evaluation using the AURORA Corpus, Proc. Eurospeech.
Zhang X., Jiao Z., Zhao Z. (2005), The speech recognition based on the bark wavelet front-end processing, Fuzzy Systems and Knowledge Discovery, Springer, 302-305.
Zhang X., Bai J., Liang W. (2006), The speech recognition system based on bark wavelet MFCC, 8th International Conference on Signal Processing IEEE.
Zhu D., Paliwal K. K. (2004), Product of power spectrum and group delay function for speech recognition, Proceedings on IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'04).