SARP: Synopsis-Based Approximate Request Processing for Low Latency and Small Correctness Loss in Cloud Online Services

Han, Rui; Zhan, Jianfeng; Luis, Jose Vazquez-Poletti

doi:10.1007/s10766-016-0406-9

SARP: Synopsis-Based Approximate Request Processing for Low Latency and Small Correctness Loss in Cloud Online Services

Published: 22 March 2016

Volume 44, pages 1054–1077, (2016)
Cite this article

International Journal of Parallel Programming Aims and scope Submit manuscript

Rui Han¹,
Jianfeng Zhan¹ &
Jose Vazquez-Poletti Luis²

218 Accesses
1 Citation
Explore all metrics

Abstract

Despite the importance of providing quick responsiveness to user requests for online services, such request processing is very resource expensive when dealing with large-scale service datasets. These often exceed the service providers’ budget when services are deployed on a cloud, in which resources are charged in monetary terms. Providing approximate processing results in request processing is a feasible solution for such problem that trades off result correctness (e.g. prediction or query accuracy) for response time reduction. However, existing techniques in this area either use parts of datasets or skip expensive computations to produce approximate results, thus resulting in large losses in result correctness on a tight resource budget. In this paper, we propose Synopsis-based Approximate Request Processing (SARP), a SARP framework to produce approximate results with small correctness losses even using small amount of resources. To achieve this, SARP conducts computations over synopses, which aggregate the statistical information of the entire service dataset at different approximation levels, based on two key ideas: (1) offline synopsis management that generates and maintains a set of synopses that represent the aggregation information of the dataset at different approximation levels. (2) Online synopsis selection that considers both the current resource allocation and the workload status so as to select the synopsis with the maximal length that can be processed within the required response time. We demonstrate the effectiveness of our approach by testing the recommendation services in e-commerce sites using a large, real-world dataset. Using prediction accuracy as the result correctness metric, the results demonstrate: (i) SARP achieves significant response time reduction with very small correctness losses compared to the exact processing results; (ii) using the same processing time, SARP demonstrates a considerable reduction in correctness loss compared to existing approximation techniques.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Privacy-Preserving Data Analytics

Approximate Query Processing with Error Guarantees

References

Dean, J., Barroso, L.A.: The tail at scale. Commun. ACM 56(2), 74–80 (2013)
Article Google Scholar
Farber, D.: Google’s marissa mayer: speed wins. ZDNet Between the Lines. http://www.zdnet.com/article/googles-marissa-mayer-speed-wins/#! (2006)
Su, X., Khoshgoftaar, T.M.: A survey of collaborative filtering techniques. Adv. Artif. Intell. 2009, 4 (2009)
Article Google Scholar
Chippa, V.K., Chakradhar, S.T., Roy, K. and Raghunathan, A.: Analysis and characterization of inherent application resilience for approximate computing. In: DAC’13, pp. 113. ACM (2013)
Agarwal, S., Mozafari, B., Panda, A., Milner, H., Madden, S. and Stoica, I.: Blinkdb: queries with bounded errors and bounded response times on very large data. In: EuroSys’13, pp. 29–42. ACM (2013)
Kranen, P., Assent, I., Baldauf, C. and Seidl, T.: Self-adaptive anytime stream clustering. In: ICDM’09, pp. 249–258. IEEE, Piscataway, NJ (2009)
Wu, G., Chang, E., Chen, Y.K. and Hughes, C.: Incremental approximate matrix factorization for speeding up support vector machines. In: SIGKDD’06, pp. 760–766. ACM, New York, NY (2006)
Yang, Y., Webb, G., Korb, K., Ting, K.M.: Classifying under computational resource constraints: anytime classification using probabilistic estimators. Mach. Learn. 69(1), 35–53 (2007)
Article Google Scholar
Zilberstein, S.: Using anytime algorithms in intelligent systems. AI Mag. 17(3), 73 (1996)
Google Scholar
Kirsch, C.M. and Payer, H.: Incorrect systems: it’s not the problem, it’s the solution. In: DAC’12, pp. 913–917. ACM, New York, NY (2012)
Spark. http://spark.apache.org/
Apache mesos. http://mesos.apache.org/
Opennebula. http://opennebula.org/
Berry, M.W., Dumais, S.T., O’Brien, G.W.: Using linear algebra for intelligent information retrieval. SIAM Rev. 37(4), 573–595 (1995)
Article MathSciNet MATH Google Scholar
Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for recommender systems. Computer 42(8), 30–37 (2009)
Article Google Scholar
Gorrell, G.: Generalized hebbian algorithm for incremental singular value decomposition in natural language processing. In: EACL’06, Citeseer pp. 97–104. Association for Computational Linguistics (ACL), New Brunswick, NJ (2006)
Guttman, A.: R-trees: a dynamic index structure for spatial searching, In: SIGMOD ’84, pp. 47–57. ACM, New York, NY (1984)
Amazon movie reviews dataset. http://snap.stanford.edu/data/web-Movies.html
Pearson, K.: Notes on the history of correlation. Biometrika 13(1), 25–45 (1920)
Article Google Scholar
Apache storm. http://storm.apache.org/
Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., Neugebauer, R., Pratt, I., Warfield, A.: Xen and the art of virtualization. ACM SIGOPS Oper. Syst. Rev. 37(5), 164–177 (2003)
Article Google Scholar
Candillier, L., Meyer, F. and Boullé M.: Comparing state-of-the-art collaborative filtering systems. In: Petra Perner (ed.) Machine Learning and Data Mining in Pattern Recognition, pp. 548–562. Springer, Berlin Heidelberg (2007)
Bacigalupo, D.A., Van Hemert, J., Chen, X., Usmani, A., Chester, A.P., He, L., Dillenberger, D.N., Wills, G.B., Gilbert, L., Jarvis, S.A.: Managing dynamic enterprise and urgent workloads on clouds using layered queuing and historical performance models. Simul. Model. Pract. Theory 19(6), 1479–1495 (2011)
Article Google Scholar
Movielens 1 million dataset. http://grouplens.org/datasets/movielens/
Netflix dataset. http://www.netflixprize.com/
The million song dataset. http://labrosa.ee.columbia.edu/millionsong/
Han, R., Ghanem, M.M., Guo, L., Guo, Y., Osmond, M.: Enabling cost-aware and adaptive elasticity of multi-tier cloud applications. Future Gener. Comput. Syst. 32, 82–98 (2014)
Article Google Scholar
Han, R., Guo, L., Ghanem, M.M. and Guo, Y.: Lightweight resource scaling for cloud applications. In: CCGrid’12, pp. 644–651. IEEE (2012)
Liu, S., Ren, S., Quan, G., Zhao, M. and Ren, S.: Profit aware load balancing for distributed cloud data centers. In: IPDPS’13, pp. 611–622. IEEE (2013)
Schwiegelshohn, U. and Tchernykh, A.: Online scheduling for cloud computing and different service levels. In: IPDPSW’12, pp. 1067–1074. IEEE (2012)
Luis, J.V.-P., Moreno-Vozmediano, R., Llorente, I.M.: Admission Control in the Cloud: Algorithms for SLA-Based Service Model. In Handbook of Research on Architectural Trends in Service-Driven Computing. IGI Global, Hershey (2014)
Han, R., Ghanem, M. and Guo, Y.: Towards elastic algorithms as a new model of computation for the cloud. Int. J. Next Gener. Comput. 4(2), 82–98. (2013)
He, Y., Elnikety, S. and Sun, H.: Tians scheduling: using partial processing in best-effort applications. In: ICDCS’11, pp. 434–445. IEEE, Piscataway, NJ (2011)
Kranen, P., Seidl, T.: Harnessing the strengths of anytime algorithms for constant data streams. Data Mining Knowl. Discov. 19(2), 245–260 (2009)
Article MathSciNet Google Scholar
Chakrapani, L.N., Muntimadugu, K.K., Lingamneni, A., George, J. and Palem, K.V.: Highly energy and performance efficient embedded computing through approximately correct arithmetic: a mathematical foundation and preliminary experimental validation. In: CASES’08, pp. 187–196. ACM (2008)
Trancoso, P.: Getting ready for approximate computing: trading parallelism for accuracy for dss workloads. In: CF’14, p. 8. ACM, New York, NY (2014)
Gaber, M.M. and Yu P.S.: A framework for resource-aware knowledge discovery in data streams: a holistic approach with its application to clustering. In: SAC’06, pp. 649–656. ACM (2006)
Poladian, V., Sousa, J.P., Garlan, D. and Shaw, M.: Dynamic configuration of resource-aware services. In: ICSE’04, pp. 604–613. IEEE, Piscataway, NJ (2004)
Han, R., Zhan, S., Shao, C., Wang, J., Xu, J., John, L.K., Gang, L. and Wang, L.: Bigdatabench-mt: a benchmark tool for generating realistic mixed data center workloads. In: SoCC’15, vol. 9495, pp. 7–18. Springer, Berlin, Heidelberg, Germany (2015)
Reiss, C., Tumanov, A., Ganger, G.R., Katz, R.H. and Kozuch, M.A.: Heterogeneity and dynamicity of clouds at scale: google trace analysis. In: SoCC’12, p. 7. ACM, New York, NY (2012)
Han, R., Wang, J., Huang, S., Shao, C., Zhan, S., Zhan, J. and Vazquez-Poletti, J.L.: Pcs: Predictive component-level scheduling for reducing tail latency in cloud online services. In: ICPP’15, pp. 490–499. IEEE, Piscataway, NJ (2015)
Han, R., Wang, J., Ge, F., Vazquez-Poletti, J.L. and Zhan, J.: Sarp: producing approximate results with small correctness losses for cloud interactive services. In: CF’15, p. 22. ACM, New York, NY (2015)

Download references

Acknowledgments

We sincerely thank Professor Yike Guo, Professor Moustafa M. Ghanem and Dr. Li Guo for their useful comments, and the anonymous reviewers for their feedback on earlier versions of this manuscript. This paper is a significantly extended version of Ref. [42] published in Computing Frontiers 2015. This work is partly supported by National Natural Science Foundation of China under Grant No. 61502451, the Key Project of of Guangdong Province, China under Grant No. 2015B010108006, the Major Program of National Natural Science Foundation of China under Grant No. 61432006, and the Ministerio de Economia y Competitividad from Spain under Grant No. TIN2015-65469-P.

Author information

Authors and Affiliations

Institute Of Computing Technology, Chinese Academy of Sciences, Beijing, China
Rui Han & Jianfeng Zhan
Universidad Complutense de Madrid, Madrid, Spain
Jose Vazquez-Poletti Luis

Authors

Rui Han
View author publications
You can also search for this author in PubMed Google Scholar
Jianfeng Zhan
View author publications
You can also search for this author in PubMed Google Scholar
Jose Vazquez-Poletti Luis
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Rui Han.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Han, R., Zhan, J. & Luis, J.VP. SARP: Synopsis-Based Approximate Request Processing for Low Latency and Small Correctness Loss in Cloud Online Services. Int J Parallel Prog 44, 1054–1077 (2016). https://doi.org/10.1007/s10766-016-0406-9

Download citation

Received: 23 October 2015
Accepted: 11 March 2016
Published: 22 March 2016
Issue Date: October 2016
DOI: https://doi.org/10.1007/s10766-016-0406-9

Keywords

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

SARP: Synopsis-Based Approximate Request Processing for Low Latency and Small Correctness Loss in Cloud Online Services

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Privacy-Preserving Data Analytics

Privacy-Preserving Data Analytics

Approximate Query Processing with Error Guarantees

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Navigation

SARP: Synopsis-Based Approximate Request Processing for Low Latency and Small Correctness Loss in Cloud Online Services

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Privacy-Preserving Data Analytics

Privacy-Preserving Data Analytics

Approximate Query Processing with Error Guarantees

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation