Abstract
In this paper, we propose an error mitigation scheme that combines two approaches: a replacement super vector technique, which provides replacements to reconstruct both the LPC coefficients and the excitation signal during bursts of lost packets, and a Forward Error Correction (FEC) code, which minimizes error propagation after the last lost frame. Moreover, the FEC code is embedded into the bitstream, which avoids any bitrate increase and keeps the codec operating in a compliant way on clean transmissions. The success of our recovery technique relies heavily on the quantization of the speech parameters (LPC coefficients and the excitation signal), especially for the excitation signal, where a modified version of the well-known Linde-Buzo-Gray (LBG) algorithm is applied. The performance of our proposal is evaluated on the AMR codec in terms of speech quality using the PESQ algorithm. Our proposal achieves a noticeable improvement over the legacy AMR codec under adverse channel conditions, without incurring high computational costs or delays during the decoding stage and without consuming any additional bitrate.
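For readers unfamiliar with the LBG algorithm mentioned above, the following is a minimal sketch of the classic binary-splitting version from Linde, Buzo and Gray, applied to generic training vectors. It is not the modified variant used in this paper, which the abstract does not describe. The function names, the additive splitting perturbation `epsilon`, and the stopping tolerance `tol` are assumptions made for illustration only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_index(vector, codebook):
    """Index of the codeword closest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def lbg_codebook(training, size, epsilon=0.01, tol=1e-6, max_iters=100):
    """Classic LBG codebook design (illustrative; `size` is a power of two)."""
    codebook = [centroid(training)]
    while len(codebook) < size:
        # Split every codeword into two slightly perturbed copies.
        codebook = [[c + s * epsilon for c in word]
                    for word in codebook for s in (1, -1)]
        previous = float("inf")
        for _ in range(max_iters):
            # Nearest-neighbour step: assign each vector to its closest codeword.
            cells = [[] for _ in codebook]
            distortion = 0.0
            for v in training:
                i = nearest_index(v, codebook)
                cells[i].append(v)
                distortion += squared_distance(v, codebook[i])
            distortion /= len(training)
            # Centroid step: move each codeword to the mean of its cell
            # (an empty cell keeps its previous codeword).
            codebook = [centroid(cell) if cell else word
                        for cell, word in zip(cells, codebook)]
            # Stop once the average distortion no longer improves noticeably.
            if previous - distortion <= tol * max(distortion, 1e-12):
                break
            previous = distortion
    return codebook
```

Calling `lbg_codebook(training, 2)` on a list of two-dimensional vectors returns two codewords. A vector is then quantized by transmitting `nearest_index(vector, codebook)` instead of the vector itself.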
J.L. Pérez-Córdoba—This work has been supported by an FPI grant from the Spanish Ministry of Education and by the MICINN TEC2013-46690-P project.
References
3GPP TS 26.090: Mandatory Speech Codec speech processing functions; Adaptive Multi-Rate (AMR) speech codec (1999)
Schroeder, M., Atal, B.: Code-excited linear prediction (CELP): high-quality speech at very low bit rates. IEEE ICASSP 10, 937–940 (1985)
Serizawa, M., Ito, H.: A packet loss recovery method using packet arrived behind the playout time for CELP decoding. IEEE ICASSP 1, 169–172 (2002)
Chibani, M., Lefebvre, R., Gournay, P.: Fast recovery for a CELP-like speech codec after a frame erasure. IEEE Trans. Audio Speech Lang. Process. 15(8), 2485–2495 (2007)
Carmona, J., Pérez-Córdoba, J., Peinado, A., Gomez, A., González, J.: A scalable coding scheme based on interframe dependency limitation. In: IEEE ICASSP, pp. 4805–4808 (2008)
Liao, W., Chen, J., Chen, M.: Adaptive recovery techniques for real-time audio streams. IEEE INFOCOM 2, 815–823 (2001)
Merazka, F.: Packet loss concealment by interpolation for speech over IP network services. In: CIWSP, pp. 1–4 (2013)
Lindblom, J., Hedelin, P.: Packet loss concealment based on sinusoidal extrapolation. IEEE ICASSP 1, 173–176 (2002)
Hodson, O., Perkins, C., Hardman, V.: A survey of packet loss recovery techniques for streaming audio. IEEE Netw. 12, 40–48 (1998)
Rodbro, C., Murthi, M., Andersen, S., Jensen, S.: Hidden Markov model-based packet loss concealment for voice over IP. IEEE Trans. Audio Speech Lang. Process. 14, 1609–1622 (2006)
López-Oller, D., Gomez, A., Pérez-Córdoba, J.: Residual VQ-quantization for speech frame loss concealment. In: IberSPEECH, November 2014
Zhang, G., Kleijn, W.: Autoregressive model-based speech packet-loss concealment. IEEE ICASSP 1, 4797–4800 (2008)
Ma, Z., Martin, R., Guo, J., Zhang, H.: Nonlinear estimation of missing LSF parameters by a mixture of Dirichlet distributions. In: IEEE ICASSP, pp. 6929–6933, May 2014
Boubakir, C., Berkani, D.: The estimation of line spectral frequencies trajectories based on unscented Kalman filtering. In: International Multi-Conference on Systems, Signals and Devices, pp. 1–6 (2009)
Chazan, D., Hoory, R., Cohen, G., Zibulski, M.: Speech reconstruction from MEL frequency cepstral coefficients and pitch frequency. IEEE ICASSP 3, 1299–1302 (2000)
Merazka, F.: Differential quantization of spectral parameters for CELP based coders in packet networks. In: IECON, pp. 1495–1498, October 2012
Linde, Y., Buzo, A., Gray, R.: An algorithm for vector quantizer design. IEEE Trans. Commun. 28(1), 84–95 (1980)
Gomez, A., Carmona, J., Peinado, A., Sánchez, V.: A multipulse-based forward error correction technique for robust CELP-coded speech transmission over erasure channels. IEEE Trans. Audio Speech Lang. Process. 18, 1258–1268 (2010)
Gomez, A., Carmona, J., González, J., Sánchez, V.: One-pulse FEC coding for robust CELP-coded speech transmission over erasure channels. IEEE Trans. Multimedia 13(5), 894–904 (2011)
Ehara, H., Yoshida, K.: Decoder initializing technique for improving frame-erasure resilience of a CELP speech codec. IEEE Trans. Multimedia 10, 549–553 (2008)
Itakura, F.: Line spectrum representation of linear predictive coefficients of speech signals. J. Acoust. Soc. Am. 57, S35 (1975)
Kondoz, A.: Digital Speech: Coding for Low Bit Rate Communications Systems. Wiley, Hoboken (1994)
Soong, F., Juang, B.: Line spectrum pair (LSP) and speech data compression. IEEE ICASSP 9, 37–40 (1984)
López-Oller, D., Gomez, A., Pérez-Córdoba, J.: Source-based error mitigation for speech transmissions over erasure channels. In: EUSIPCO, pp. 1242–1246, September 2014
Gómez, A., Peinado, A., Sánchez, V., Rubio, A.: A source model mitigation technique for distributed speech recognition over lossy packet channels. In: Proceedings of EUROSPEECH, pp. 2733–2736 (2003)
Geiser, B., Vary, P.: High rate data hiding in ACELP speech codecs. In: IEEE ICASSP, pp. 4005–4008, April 2008
López-Oller, D., Gomez, A.M., Córdoba, J.L.P., Geiser, B., Vary, P.: Steganographic pulse-based recovery for robust ACELP transmission over erasure channels. In: Torre Toledano, D., Ortega Giménez, A., Teixeira, A., González Rodríguez, J., Hernández Gómez, L., San Segundo Hernández, R., Ramos Castro, D. (eds.) IberSPEECH 2012. CCIS, vol. 328, pp. 257–266. Springer, Heidelberg (2012). doi:10.1007/978-3-642-35292-8_27
ITU-T Recommendation P.862: Perceptual evaluation of speech quality (PESQ) (2001)
ITU-R BS.1534-1: Method for the subjective assessment of intermediate quality level of coding systems (2001)
Garofolo, J., et al.: The Structure and Format of the DARPA TIMIT CD-ROM Prototype (1990)
© 2016 Springer International Publishing AG
About this paper
Cite this paper
López-Oller, D., Gomez, A.M., Pérez-Córdoba, J.L. (2016). A Novel Error Mitigation Scheme Based on Replacement Vectors and FEC Codes for Speech Recovery in Loss-Prone Channels. In: Abad, A., et al. Advances in Speech and Language Technologies for Iberian Languages. IberSPEECH 2016. Lecture Notes in Computer Science, vol 10077. Springer, Cham. https://doi.org/10.1007/978-3-319-49169-1_5
DOI: https://doi.org/10.1007/978-3-319-49169-1_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49168-4
Online ISBN: 978-3-319-49169-1
eBook Packages: Computer Science, Computer Science (R0)