Abstract
Query error correction is important for improving user satisfaction and the quality of query results. Traditional query error correction methods mostly use a pipeline that corrects errors step by step; they rely heavily on manually annotated corpora and struggle to take global effects into account. In this paper, we present a character-based, end-to-end Sequence to Sequence (Seq2Seq) method with an attention mechanism, which also incorporates a neural network language model trained on unlabeled corpora to solve the query correction task. The method models the different error types in query error correction in a unified way and effectively overcomes these shortcomings of traditional methods. Experiments show that it captures long-distance knowledge to correct errors, and that with the Simple Recurrent Unit (SRU) it performs as well as with Long Short-Term Memory (LSTM) while significantly reducing processing time, which is very important in query error correction tasks.
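To illustrate why an SRU can be faster than an LSTM, the sketch below implements one SRU layer in plain Python. It assumes the commonly cited simplified SRU formulation (a forget gate, a reset gate, and a highway connection, with input and hidden size equal); the exact variant used in the paper is not stated in the abstract, and the function and parameter names here are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sru_layer(xs, W, Wf, bf, Wr, br):
    """Run one simplified SRU layer over a sequence xs of d-dimensional vectors.

    Assumed equations (not taken from the paper itself):
      c_t = f_t * c_{t-1} + (1 - f_t) * (W x_t)
      h_t = r_t * tanh(c_t) + (1 - r_t) * x_t
    """
    d = len(xs[0])
    # Step 1: every matrix product depends only on the inputs, never on the
    # previous state, so all time steps could be computed in parallel.
    proj = [matvec(W, x) for x in xs]
    fgate = [[sigmoid(a + b) for a, b in zip(matvec(Wf, x), bf)] for x in xs]
    rgate = [[sigmoid(a + b) for a, b in zip(matvec(Wr, x), br)] for x in xs]
    # Step 2: the only sequential part is a cheap element-wise recurrence.
    c = [0.0] * d
    hs = []
    for x, xt, f, r in zip(xs, proj, fgate, rgate):
        c = [fi * ci + (1.0 - fi) * xti for fi, ci, xti in zip(f, c, xt)]
        h = [ri * math.tanh(ci) + (1.0 - ri) * xi for ri, ci, xi in zip(r, c, x)]
        hs.append(h)
    return hs


if __name__ == "__main__":
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    # With all-zero weights and biases both gates equal 0.5 and the cell
    # state stays at 0, so each output is half of its input.
    print(sru_layer([[1.0, -2.0], [4.0, 0.0]], zeros, zeros, [0.0, 0.0], zeros, [0.0, 0.0]))
```

In an LSTM the gate computations at step t need the hidden state from step t-1, so the matrix products themselves are sequential; in this sketch only the element-wise loop is.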
This work was supported by the National Natural Science Foundation of China (61672040) and the North China University of Technology Startup Fund.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Duan, J., Ji, T., Wu, M., Wang, H. (2019). Query Error Correction Algorithm Based on Fusion Sequence to Sequence Model. In: Nguyen, N., Chbeir, R., Exposito, E., Aniorté, P., Trawiński, B. (eds) Computational Collective Intelligence. ICCCI 2019. Lecture Notes in Computer Science, vol 11684. Springer, Cham. https://doi.org/10.1007/978-3-030-28374-2_2
DOI: https://doi.org/10.1007/978-3-030-28374-2_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28373-5
Online ISBN: 978-3-030-28374-2
eBook Packages: Computer Science (R0)