


Link to original content: https://doi.org/10.1007/s11042-024-18113-2


Secure speech retrieval method using deep hashing and CKKS fully homomorphic encryption

Multimedia Tools and Applications

Abstract

Advances in deep learning have made speech retrieval and recognition more accurate and efficient. At the same time, privacy leakage from speech data has become an increasingly prominent problem, and fully homomorphic encryption (FHE) can help alleviate these privacy concerns. To protect the privacy of both the speech data and the deep binary hash codes, and to enable privacy-preserving similarity calculation, a secure speech retrieval method using deep hashing and CKKS (Cheon-Kim-Kim-Song) FHE is proposed. First, a speech CKKS FHE scheme is designed to encrypt the original speech data. Then, spectrogram image features are extracted from the original speech data and fed into a triplet convolutional neural network (Tri-CNN) to generate efficient and compact deep binary hash codes, which are encrypted and uploaded to the cloud together with the encrypted speech data. At retrieval time, the deep binary hash code of the query speech is extracted, encrypted, and sent to the cloud server as a search trapdoor, and its similarity to the index sequences in the secure index table is computed securely. Experimental results show that the mean average precision of the proposed method on the TIMIT and THCHS-30 data sets exceeds 93%, a loss of about 2% compared with the plaintext domain, in exchange for higher security.
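The similarity step described in the abstract can be illustrated with a small plaintext sketch. The preview does not show the authors' actual similarity measure or encryption parameters, so everything below is an assumption made for illustration: the function names `hamming_via_arithmetic` and `rank_by_distance`, the 8-bit toy codes, and the choice of Hamming distance. The point is only that the distance between two binary hash codes can be written with additions and multiplications alone, which is the kind of arithmetic a scheme such as CKKS evaluates on ciphertexts. No encryption library is used here.

```python
from typing import Dict, List, Sequence, Tuple


def hamming_via_arithmetic(a: Sequence[int], b: Sequence[int]) -> int:
    """Hamming distance of two {0,1} codes using only + and *.

    For bits x and y, x XOR y equals x + y - 2*x*y, so the distance is a
    degree-2 polynomial in the inputs (no comparisons or branches).
    """
    if len(a) != len(b):
        raise ValueError("hash codes must have equal length")
    return sum(x + y - 2 * x * y for x, y in zip(a, b))


def rank_by_distance(
    query: Sequence[int], index: Dict[str, Sequence[int]]
) -> List[Tuple[str, int]]:
    """Return (name, distance) pairs sorted from most to least similar."""
    scored = [(name, hamming_via_arithmetic(query, code)) for name, code in index.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical 8-bit hash codes standing in for the secure index table.
index = {
    "utt_a": [1, 0, 1, 1, 0, 0, 1, 0],
    "utt_b": [0, 1, 0, 0, 1, 1, 0, 1],
    "utt_c": [1, 0, 1, 0, 0, 0, 1, 0],
}
query = [1, 0, 1, 1, 0, 0, 1, 1]  # hypothetical query code (the "trapdoor" before encryption)

ranking = rank_by_distance(query, index)
print(ranking)  # [('utt_a', 1), ('utt_c', 2), ('utt_b', 7)]
```

In an encrypted setting the same polynomial would be evaluated on ciphertexts, and because CKKS works with approximate real numbers, the decrypted distances would typically need rounding before ranking.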


Data availability

Previously reported speech data (the THCHS-30 and TIMIT data sets) were used to support this study and are available at 10.48550/arXiv.1512.01882 and 10.1016/0167-6393(90)90010-7. These data sets are cited at the relevant places in the text as references [29, 30].

References   

  1. Li Y, Ma J, Miao Y et al (2022) Similarity search for encrypted images in secure cloud computing[J]. IEEE Transactions on Cloud Computing 10(2):1142–1155. https://doi.org/10.1109/TCC.2020.2989923

  2. Singh N, Kumar J, Singh AK et al (2022) Privacy-preserving multi-keyword hybrid search over encrypted data in cloud[J]. Journal of Ambient Intelligence and Humanized Computing 1–14. https://doi.org/10.1007/s12652-022-03889-8

  3. Rahulamathavan Y (2022) Privacy-preserving Similarity Calculation of Speaker Features Using Fully Homomorphic Encryption[J]. arXiv preprint arXiv:2202.07994. https://doi.org/10.48550/arXiv.2202.07994

  4. Shen M, Cheng G, Zhu L et al (2020) Content-based multi-source encrypted image retrieval in clouds with privacy preservation[J]. Futur Gener Comput Syst 109:621–632. https://doi.org/10.1016/j.future.2018.04.089

  5. Duan Y, Li Y, Lu L et al (2022) A faster outsourced medical image retrieval scheme with privacy preservation[J]. J Syst Archit 122:102356. https://doi.org/10.1016/j.sysarc.2021.102356

  6. Wang Q, Feng C, Xu Y et al (2020) A novel privacy-preserving speech recognition framework using bidirectional LSTM[J]. Journal of Cloud Computing 9(1):1–13. https://doi.org/10.1186/s13677-020-00186-7

  7. Shi C, Wang H, Hu Y et al (2021) A novel NMF-based authentication scheme for encrypted speech in cloud computing[J]. Multimedia Tools and Applications 80(17):25773–25798. https://doi.org/10.1007/s11042-021-10896-y

  8. Wang Y, Huang Y, Zhang R et al (2021) Multi-format speech BioHashing based on energy to zero ratio and improved LP-MMSE parameter fusion[J]. Multimedia Tools and Applications 80(7):10013–10036. https://doi.org/10.1007/s11042-020-09701-z

  9. Zhang SX, Gong Y, Yu D (2019) Encrypted Speech Recognition Using Deep Polynomial Networks[C]. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, Brighton, pp 5691–5695. https://doi.org/10.1109/ICASSP.2019.8683721

  10. Yu X, Xu C, Dou B et al (2021) Multi-user search on the encrypted multimedia database: lattice-based searchable encryption scheme with time-controlled proxy re-encryption[J]. Multimedia Tools and Applications 80(2):3193–3211. https://doi.org/10.1007/s11042-020-09753-1

  11. Cao R, Zhang Q, Zhu J et al (2020) Enhancing remote sensing image retrieval using a triplet deep metric learning network[J]. Int J Remote Sens 41(2):740–751. https://doi.org/10.1080/2150704X.2019.1647368

  12. Li M, An Z, Wei Q et al (2019) Triplet Deep Hashing with Joint Supervised Loss Based on Deep Neural Networks[J]. Comput Intell Neurosci 2019:1–17. https://doi.org/10.1155/2019/8490364

  13. Jia Y, Chen X, Yu J et al (2021) Speaker recognition based on characteristic spectrograms and an improved self-organizing feature map neural network[J]. Complex & Intelligent Systems 7(4):1749–1757. https://doi.org/10.1007/s40747-020-00172-1

  14. Purwins H, Li B, Virtanen T et al (2019) Deep Learning for Audio Signal Processing[J]. IEEE J Selected Top Signal Process 13(2):206–219. https://doi.org/10.1109/JSTSP.2019.2908700

  15. Cheon JH, Kim A, Kim M et al (2017) Homomorphic encryption for arithmetic of approximate numbers[C]//International Conference on the Theory and Application of Cryptology and Information Security. Springer, Cham, pp 409–437. https://doi.org/10.1007/978-3-319-70694-8_15

  16. Chen C, Jiang D, Peng J et al (2021) Scalable Identity-Oriented Speech Retrieval[J]. IEEE Trans Knowl Data Eng 14(8):1–6. https://doi.org/10.1109/TKDE.2021.3127520

  17. Zhang Q, Li Y, Hu Y (2021) A retrieval algorithm for encrypted speech based on convolutional neural network and deep hashing[J]. Multimedia Tools and Applications 80(1):1201–1221. https://doi.org/10.1007/s11042-020-09748-y

  18. Zhang H (2021) Voice keyword retrieval method using attention mechanism and multimodal information fusion[J]. Sci Program 2021(8):1–11. https://doi.org/10.1155/2021/6662841

  19. Yuan Y, Xie L, Leung CC et al (2020) Fast query-by-example speech search using attention-based deep binary embeddings[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing 28:1988–2000. https://doi.org/10.1109/TASLP.2020.2998277

  20. Zhang Q, Fu M, Huang Y et al (2022) Encrypted Speech Retrieval Scheme Based on Multiuser Searchable Encryption in Cloud Storage[J]. Security and Communication Networks 2022:9045259. https://doi.org/10.1155/2022/9045259

  21. Li W, Chen Y, Hu H et al (2020) Using granule to search privacy preserving voice in home IoT systems[J]. IEEE Access 8:31957–31969. https://doi.org/10.1109/ACCESS.2020.2972975

  22. Li W, Xiao Y, Tang C et al (2020) Multi-user searchable encryption voice in home IoT system[J]. Internet of Things 11:100180. https://doi.org/10.1016/j.iot.2020.100180

  23. Chen J, Chen Z, Zheng P et al (2018) Encrypted domain mel-frequency cepstral coefficient and fragile audio watermarking[C]. 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). IEEE, New York, pp 68–73. https://doi.org/10.1109/TrustCom/BigDataSE.2018.00021

  24. Thaine P, Penn G (2019) Extracting Mel-Frequency and Bark-Frequency Cepstral Coefficients from Encrypted Signals[C]. INTERSPEECH, Graz, pp 3715–3719. https://doi.org/10.21437/Interspeech.2019-1136

  25. Tang Y, Zhu B, Ma X et al (2019) Decoding homomorphically encrypted FLAC audio without decryption[C]//ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, pp 675–679. https://doi.org/10.1109/ICASSP.2019.8682780

  26. Meftah S, Tan BHM, Mun CF et al (2021) Doren: toward efficient deep convolutional neural networks with fully homomorphic encryption[J]. IEEE Trans Inf Forensics Secur 16:3740–3752. https://doi.org/10.1109/TIFS.2021.3090959

  27. Natarajan D, Dalskov A, Kales D et al (2021) PRIORIS: Enabling Secure Detection of Suicidal Ideation from Speech Using Homomorphic Encryption[M]//Protecting Privacy through Homomorphic Encryption. Springer, Cham, pp 133–146. https://doi.org/10.1007/978-3-030-77287-1_10

  28. Liu J, Wang C, Tu Z et al (2021) Secure KNN classification scheme based on homomorphic encryption for cyberspace[J]. Secur Commun Netw 2021:8759922. https://doi.org/10.1155/2021/8759922

  29. Wang D, Zhang X (2015) THCHS-30: A free Chinese speech corpus[J]. arXiv preprint arXiv:1512.01882. https://doi.org/10.48550/arXiv.1512.01882

  30. Zue V, Seneff S, Glass J (1990) Speech database development at MIT: TIMIT and beyond[J]. Speech Commun 9(4):351–356. https://doi.org/10.1016/0167-6393(90)90010-7

  31. Ullah B, Kamran M, Rui Y (2022) Predictive modeling of short-term rockburst for the stability of subsurface structures using machine learning approaches: T-SNE, K-Means clustering and XGBoost[J]. Mathematics 10(3):449. https://doi.org/10.3390/math10030449

  32. An L, Huang Y, Zhang Q (2022) Verifiable speech retrieval algorithm based on KNN secure hashing[J]. Multimedia Tools and Applications 1–22. https://doi.org/10.1007/s11042-022-13387-w

  33. Zhang Q, Zhao X, Zhang Q et al (2022) Content-based encrypted speech retrieval scheme with deep hashing[J]. Multimed Tools Appl 81(7):10221–10242. https://doi.org/10.1007/s11042-022-12123-8

  34. Huang Y, Wang Y, Li H et al (2022) Encrypted speech retrieval based on long sequence Biohashing[J]. Multimed Tools Appl 81(9):13065–13085. https://doi.org/10.1007/s11042-022-12371-8

  35. Khoirom MS, Laiphrakpam DS, Tuithung T (2021) Audio encryption using ameliorated ElGamal public key encryption over finite field[J]. Wireless Pers Commun 117(2):809–823. https://doi.org/10.1007/s11277-020-07897-9

  36. Shi C, Wang H, Hu Y et al (2019) A Speech Homomorphic Encryption Scheme with Less Data Expansion in Cloud Computing[J]. KSII Trans Internet Inf Syst (TIIS) 13(5):2588–2609. https://doi.org/10.3837/tiis.2019.05.020

  37. Zhang QY, Jia YG (2022) A Speech Fully Homomorphic Encryption Scheme for DGHV Based on Multithreading in Cloud Storage[J]. Int J Netw Secur 24(6):1042–1055. https://doi.org/10.6633/IJNS.20221124(6).09


Acknowledgements

This work is supported by the National Natural Science Foundation of China (No. 61862041). The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.

Author information

Corresponding author

Correspondence to Qiu-yu Zhang.

Ethics declarations

Conflict of interests

The authors declare that they have no conflict of interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, Qy., Wen, Yw., Huang, Yb. et al. Secure speech retrieval method using deep hashing and CKKS fully homomorphic encryption. Multimed Tools Appl 83, 67469–67500 (2024). https://doi.org/10.1007/s11042-024-18113-2

  • DOI: https://doi.org/10.1007/s11042-024-18113-2
