Abstract
The proliferation of semantic big data has resulted in a large amount of content being published on the Linked Open Data (LOD) cloud. Semantic Web applications consume these data by issuing SPARQL queries. Because LOD is inherently distributed, querying the LOD cloud suffers from high search latency and a lack of tools to connect SPARQL endpoints. In this paper, we propose an Adaptive Cache Replacement (ACR) strategy that aims to accelerate overall query processing on the LOD cloud. ACR alleviates the burden on SPARQL endpoints by identifying likely subsequent queries, learned from clients' historical query patterns, and caching the results of these queries. For cache replacement, we propose an exponential smoothing forecasting method to replace the less valuable cache content. In the experimental study, we evaluate the performance of the proposed approach in terms of hit rate, query time, and overhead. The proposed approach outperforms existing state-of-the-art approaches, increasing hit rates by 5.46% and reducing query times by 6.34%.
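To make the replacement idea concrete, the following is a minimal sketch of a cache that evicts the entry with the lowest exponentially smoothed access frequency. The abstract does not give the forecasting formula, so simple exponential smoothing, s_t = alpha * x_t + (1 - alpha) * s_{t-1}, is assumed here; the class name `SmoothedCache`, the per-period update in `end_period`, and the initial forecast of 1.0 are all hypothetical choices for illustration, not the ACR implementation.

```python
class SmoothedCache:
    """Toy query-result cache; evicts the entry with the lowest smoothed forecast."""

    def __init__(self, capacity, alpha=0.5):
        self.capacity = capacity  # maximum number of cached query results
        self.alpha = alpha        # smoothing factor, 0 < alpha <= 1 (assumed value)
        self.results = {}         # query string -> cached result
        self.hits = {}            # query string -> hits in the current period
        self.forecast = {}        # query string -> smoothed access frequency

    def get(self, query):
        """Return the cached result and count the hit, or None on a miss."""
        if query in self.results:
            self.hits[query] += 1
            return self.results[query]
        return None

    def put(self, query, result):
        """Store a result, evicting the least valuable entry if the cache is full."""
        if query not in self.results:
            if len(self.results) >= self.capacity:
                victim = min(self.forecast, key=self.forecast.get)
                for table in (self.results, self.hits, self.forecast):
                    del table[victim]
            self.hits[query] = 0
            self.forecast[query] = 1.0  # assumed starting forecast for a new entry
        self.results[query] = result

    def end_period(self):
        """Fold this period's hit counts into each forecast, then reset the counts."""
        for query in self.results:
            self.forecast[query] = (self.alpha * self.hits[query]
                                    + (1 - self.alpha) * self.forecast[query])
            self.hits[query] = 0


cache = SmoothedCache(capacity=2, alpha=0.5)
cache.put("SELECT ?s WHERE { ?s a <A> }", "result-1")
cache.put("SELECT ?s WHERE { ?s a <B> }", "result-2")
for _ in range(3):
    cache.get("SELECT ?s WHERE { ?s a <A> }")
cache.end_period()
# The cache is full, so inserting a third query evicts the entry with the lowest forecast.
cache.put("SELECT ?s WHERE { ?s a <C> }", "result-3")
```

A larger `alpha` weights recent periods more heavily, so the cache adapts faster to shifts in the query workload at the cost of noisier forecasts.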
Acknowledgements
This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2017-0-01629) supervised by the IITP (Institute for Information & communications Technology Promotion). This work was supported by the Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2017-0-00655), NRF-2016K1A3A7A03951968 & NRF-2019R1A2C2090504.
About this article
Cite this article
Akhtar, U., Sant’Anna, A., Jihn, CH. et al. A cache-based method to improve query performance of linked Open Data cloud. Computing 102, 1743–1763 (2020). https://doi.org/10.1007/s00607-020-00814-9
DOI: https://doi.org/10.1007/s00607-020-00814-9