Abstract
Speech recognition has become common in many application domains. Incorporating acoustic-phonetic knowledge into the design of Automatic Speech Recognition (ASR) systems has proven to be a viable way to raise ASR accuracy. Manner-of-articulation attributes such as vowel, stop, fricative, approximant, nasal, and silence are examples of such knowledge. Neural networks have already been used successfully as detectors for manner-of-articulation attributes, taking representations of speech signal frames as input. In this paper, a set of six detectors for the above-mentioned attributes is designed based on the E-αNet neural network model. This model was chosen for its ability to learn its hidden activation functions, which results in better generalization properties. The experimental set-up and results are presented; they show an average 3.5% improvement over a baseline neural network implementation.
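To make the idea of a learned hidden activation function concrete, here is a minimal sketch. It assumes, as the αNet literature describes, that each hidden unit's activation is a weighted sum of Hermite orthonormal functions whose coefficients are trained along with the network weights. The function names `hermite_functions` and `learned_activation` and the example coefficients are hypothetical; this illustrates the general technique and is not the authors' implementation.

```python
import math


def hermite_functions(x, order):
    """Return [h_0(x), ..., h_order(x)], the orthonormal Hermite functions.

    Uses the standard three-term recurrence for the normalized functions,
    which avoids computing large factorials explicitly.
    """
    h = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if order >= 1:
        h.append(math.sqrt(2.0) * x * h[0])
    for n in range(1, order):
        h.append(math.sqrt(2.0 / (n + 1)) * x * h[n]
                 - math.sqrt(n / (n + 1)) * h[n - 1])
    return h


def learned_activation(x, coeffs):
    """Hypothetical hidden activation: a weighted sum of Hermite functions.

    In a trainable network, `coeffs` would be updated by gradient descent
    together with the connection weights (an assumption of this sketch).
    """
    basis = hermite_functions(x, len(coeffs) - 1)
    return sum(c * b for c, b in zip(coeffs, basis))


if __name__ == "__main__":
    # Illustrative coefficients only; real values would come from training.
    coeffs = [0.5, 1.0, -0.25]
    for x in (-1.0, 0.0, 1.0):
        print(x, learned_activation(x, coeffs))
```

Because the basis is orthonormal, each coefficient contributes an independent component of the activation shape, which is what makes removing low-weight terms a natural way to simplify a trained unit.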
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Siniscalchi, S.M. et al. (2006). Application of EαNets to Feature Recognition of Articulation Manner in Knowledge-Based Automatic Speech Recognition. In: Apolloni, B., Marinaro, M., Nicosia, G., Tagliaferri, R. (eds) Neural Nets. WIRN NAIS 2005. Lecture Notes in Computer Science, vol 3931. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11731177_21
DOI: https://doi.org/10.1007/11731177_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33183-4
Online ISBN: 978-3-540-33184-1
eBook Packages: Computer Science, Computer Science (R0)