Abstract
Frequency Domain Linear Prediction (FDLP) is a technique for autoregressive modelling of the Hilbert envelopes of a signal. In this paper, we propose a speech coding technique that applies FDLP in Quadrature Mirror Filter (QMF) sub-bands of short (25 ms) segments of the speech signal. The Line Spectral Frequency parameters of the autoregressive models and the spectral components of the residual signals are transmitted. To simulate the effects of lossy transmission channels, bit-packets are dropped randomly. In both objective and subjective quality evaluations, the proposed FDLP speech codec is judged to be more resilient to bit-packet losses than the state-of-the-art Adaptive Multi-Rate Wide-Band (AMR-WB) codec at 12 kbps.
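The core idea of FDLP can be illustrated with a minimal sketch. It is not the codec described in the paper. It assumes a single full-band segment (no QMF sub-bands, no residual coding or quantisation), a DCT as the frequency-domain transform, and the autocorrelation method with the Levinson-Durbin recursion for the linear prediction step. The function names (`dct_ii`, `autocorr`, `levinson`, `fdlp_envelope`) and the default model order are hypothetical choices made for illustration only.

```python
import math

def dct_ii(x):
    """Orthonormal DCT-II, computed directly in O(N^2) for clarity."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def autocorr(c, max_lag):
    """Biased autocorrelation of the sequence c for lags 0..max_lag."""
    n = len(c)
    return [sum(c[i] * c[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion: returns (a, err) with a[0] == 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)                # prediction error update
    return a, err

def fdlp_envelope(x, order=20):
    """All-pole estimate of the temporal (squared Hilbert) envelope of x.

    Linear prediction is applied to the DCT coefficients of x, so the
    resulting AR "spectrum" is indexed by time instead of frequency.
    """
    n = len(x)
    a, err = levinson(autocorr(dct_ii(x), order), order)
    env = []
    for t in range(n):
        w = math.pi * (t + 0.5) / n         # angle 0..pi maps to time 0..n
        re = sum(a[k] * math.cos(w * k) for k in range(order + 1))
        im = sum(a[k] * math.sin(w * k) for k in range(order + 1))
        env.append(err / (re * re + im * im))
    return env
```

Calling `fdlp_envelope(segment, order=20)` on a list of samples returns one positive envelope value per input sample. In a codec along the lines of the abstract, the AR coefficients `a` would be converted to Line Spectral Frequencies for transmission instead of being evaluated directly as done here.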
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ganapathy, S., Motlicek, P., Hermansky, H. (2009). Error Resilient Speech Coding Using Sub-band Hilbert Envelopes. In: Matoušek, V., Mautner, P. (eds) Text, Speech and Dialogue. TSD 2009. Lecture Notes in Computer Science, vol 5729. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04208-9_49
DOI: https://doi.org/10.1007/978-3-642-04208-9_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04207-2
Online ISBN: 978-3-642-04208-9
eBook Packages: Computer Science, Computer Science (R0)