Abstract
The performance of a perceptual hashing system, usually measured by discrimination and robustness, depends directly on the features the system extracts. This paper presents a new speech hashing scheme based on short-time stability. The scheme exploits a property of natural speech, namely that the principal components of the linear prediction coefficients of neighboring frames tend to be very similar, to generate the hash sequence. Experimental results demonstrate the effectiveness of the proposed scheme in terms of discrimination and robustness.
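The abstract gives no algorithmic detail, so the following is only a minimal sketch of how a hash of this kind might be built, not the authors' method. It assumes non-overlapping frames, linear prediction coefficients from the autocorrelation method (Levinson-Durbin), an uncentred principal direction per block of neighboring frames, and one bit per coefficient obtained by comparing adjacent blocks. All function names and the parameters `frame_len`, `order`, and `block` are hypothetical.

```python
import math
import random


def lpc(frame, order):
    """Linear prediction coefficients via autocorrelation + Levinson-Durbin."""
    n = len(frame)
    r = [sum(frame[i] * frame[i + k] for i in range(n - k)) for k in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    if err <= 0.0:  # silent frame: nothing to predict
        return a[1:]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
        if err <= 1e-12:  # numerically exhausted; stop early
            break
    return a[1:]


def principal_direction(rows, iters=100):
    """Dominant eigenvector of X^T X by power iteration (uncentred PCA)."""
    p = len(rows[0])
    c = [[sum(row[i] * row[j] for row in rows) for j in range(p)] for i in range(p)]
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(c[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * p
        v = [x / norm for x in w]
    # Resolve the sign ambiguity: make the largest-magnitude entry positive.
    pivot = max(v, key=abs)
    return [x if pivot > 0 else -x for x in v]


def speech_hash(signal, frame_len=200, order=8, block=4):
    """Assumed bit rule: 1 where a component grows from one block to the next."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    coeffs = [lpc(f, order) for f in frames]
    dirs = [principal_direction(coeffs[i:i + block])
            for i in range(0, len(coeffs) - block + 1, block)]
    bits = []
    for prev, cur in zip(dirs, dirs[1:]):
        bits.extend(1 if c > p else 0 for p, c in zip(prev, cur))
    return bits


def bit_error_rate(h1, h2):
    """Fraction of differing bits between two equal-length hashes."""
    return sum(b1 != b2 for b1, b2 in zip(h1, h2)) / len(h1)


def demo_signal(seed, n=4000):
    """Synthetic stand-in for speech: two sinusoids plus a little noise."""
    rng = random.Random(seed)
    return [math.sin(0.05 * t) + 0.5 * math.sin(0.21 * t) + 0.05 * rng.gauss(0, 1)
            for t in range(n)]


if __name__ == "__main__":
    h = speech_hash(demo_signal(1))
    print(len(h), bit_error_rate(h, h))
```

In an evaluation along these lines, robustness would correspond to a low bit error rate between the hash of a clip and the hash of a distorted copy of it, and discrimination to a bit error rate near 0.5 between hashes of unrelated clips. The synthetic signal here only exercises the code and says nothing about real speech.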
This work was supported in part by the National Natural Science Foundation of China (No. 60872115), in part by the Shanghai’s Key Discipline Development Program (No. J50104), and in part by the International Cooperation Foundation Program of Shanghai (No. 075107035).
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chen, N., Wan, WG. (2009). Speech Hashing Algorithm Based on Short-Time Stability. In: Alippi, C., Polycarpou, M., Panayiotou, C., Ellinas, G. (eds) Artificial Neural Networks – ICANN 2009. ICANN 2009. Lecture Notes in Computer Science, vol 5769. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04277-5_43
DOI: https://doi.org/10.1007/978-3-642-04277-5_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04276-8
Online ISBN: 978-3-642-04277-5
eBook Packages: Computer Science (R0)