A cache-based method to improve query performance of linked Open Data cloud

Akhtar, Usman; Sant’Anna, Anita; Jihn, Chang-Ho; Razzaq, Muhammad Asif; Bang, Jaehun; Lee, Sungyoung

doi:10.1007/s00607-020-00814-9

Regular Paper
Published: 14 May 2020

Computing, volume 102, pages 1743–1763 (2020)

Usman Akhtar ORCID: orcid.org/0000-0003-4553-0550¹,
Anita Sant’Anna²,
Chang-Ho Jihn³,
Muhammad Asif Razzaq¹,
Jaehun Bang¹ &
…
Sungyoung Lee¹

371 Accesses
2 Citations

Abstract

The proliferation of semantic big data has resulted in a large amount of content published over the Linked Open Data (LOD) cloud. Semantic Web applications consume these data by issuing SPARQL queries. Because the LOD cloud is inherently distributed, two of the main challenges in querying it are high search latency and a lack of tools to connect SPARQL endpoints. In this paper, we propose an Adaptive Cache Replacement (ACR) strategy that aims to accelerate overall query processing on the LOD cloud. ACR alleviates the burden on SPARQL endpoints by identifying subsequent queries, learned from clients' historical query patterns, and caching the results of these queries. For cache replacement, we propose an exponential smoothing forecasting method to replace the less valuable cache content. In the experimental study, we evaluate the performance of the proposed approach in terms of hit rate, query time, and overhead. The proposed approach is found to outperform existing state-of-the-art approaches, increasing hit rates by 5.46% and reducing query times by 6.34%.
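The abstract does not give the forecasting formula, so the following is only a minimal sketch of how a cache could rank entries with simple exponential smoothing, s_t = α·x_t + (1 − α)·s_{t−1}, where x_t is 1 if the entry was requested at step t and 0 otherwise. The class name `SmoothedCache`, the parameter `alpha`, and the `fetch` callback are hypothetical names introduced for illustration; this is not the authors' ACR implementation.

```python
from typing import Any, Callable, Dict


class SmoothedCache:
    """Toy query-result cache that evicts the entry with the lowest
    exponentially smoothed access estimate (illustrative assumption)."""

    def __init__(self, capacity: int, alpha: float = 0.3) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.capacity = capacity
        self.alpha = alpha
        self.results: Dict[str, Any] = {}   # query text -> cached result
        self.scores: Dict[str, float] = {}  # query text -> smoothed estimate
        self.hits = 0
        self.misses = 0

    def _observe(self, accessed: str) -> None:
        # One smoothing step for every cached entry:
        # the accessed entry sees x_t = 1, all others see x_t = 0.
        for key, old in self.scores.items():
            x = 1.0 if key == accessed else 0.0
            self.scores[key] = self.alpha * x + (1.0 - self.alpha) * old

    def get(self, query: str, fetch: Callable[[str], Any]) -> Any:
        """Return the cached result, or call fetch(query) on a miss."""
        if query in self.results:
            self.hits += 1
            self._observe(query)
            return self.results[query]

        self.misses += 1
        result = fetch(query)  # stand-in for a call to a SPARQL endpoint
        if len(self.results) >= self.capacity:
            # Replace the entry forecast to be least valuable.
            victim = min(self.scores, key=self.scores.get)
            del self.results[victim]
            del self.scores[victim]
        self.results[query] = result
        self.scores[query] = 0.0
        self._observe(query)
        return result
```

In use, a client would call `cache.get(query_text, fetch)` with `fetch` wrapping whatever endpoint request it already makes. A larger `alpha` weights recent requests more heavily, while a smaller one retains more of the access history.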

Acknowledgements

This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2017-0-01629) supervised by the IITP (Institute for Information & communications Technology Promotion). This work was supported by the Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2017-0-00655), NRF-2016K1A3A7A03951968 & NRF-2019R1A2C2090504.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu, Yongin-si, Gyeonggi-do, 446-701, Republic of Korea
Usman Akhtar, Muhammad Asif Razzaq, Jaehun Bang & Sungyoung Lee
Viniam Consulting AB, Halmstad, Sweden
Anita Sant’Anna
Department of Industrial and Management System Engineering, Kyung Hee University, Yongin-si, South Korea
Chang-Ho Jihn

Corresponding authors

Correspondence to Jaehun Bang or Sungyoung Lee.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Akhtar, U., Sant’Anna, A., Jihn, CH. et al. A cache-based method to improve query performance of linked Open Data cloud. Computing 102, 1743–1763 (2020). https://doi.org/10.1007/s00607-020-00814-9

Received: 09 July 2019
Accepted: 25 April 2020
Published: 14 May 2020
Issue Date: July 2020
DOI: https://doi.org/10.1007/s00607-020-00814-9

Keywords

Mathematics Subject Classification

68P20
