Abstract
Despite the importance of providing quick responsiveness to user requests for online services, such request processing is very resource expensive when dealing with large-scale service datasets. These often exceed the service providers’ budget when services are deployed on a cloud, in which resources are charged in monetary terms. Providing approximate processing results in request processing is a feasible solution for such problem that trades off result correctness (e.g. prediction or query accuracy) for response time reduction. However, existing techniques in this area either use parts of datasets or skip expensive computations to produce approximate results, thus resulting in large losses in result correctness on a tight resource budget. In this paper, we propose Synopsis-based Approximate Request Processing (SARP), a SARP framework to produce approximate results with small correctness losses even using small amount of resources. To achieve this, SARP conducts computations over synopses, which aggregate the statistical information of the entire service dataset at different approximation levels, based on two key ideas: (1) offline synopsis management that generates and maintains a set of synopses that represent the aggregation information of the dataset at different approximation levels. (2) Online synopsis selection that considers both the current resource allocation and the workload status so as to select the synopsis with the maximal length that can be processed within the required response time. We demonstrate the effectiveness of our approach by testing the recommendation services in e-commerce sites using a large, real-world dataset. Using prediction accuracy as the result correctness metric, the results demonstrate: (i) SARP achieves significant response time reduction with very small correctness losses compared to the exact processing results; (ii) using the same processing time, SARP demonstrates a considerable reduction in correctness loss compared to existing approximation techniques.
Similar content being viewed by others
References
Dean, J., Barroso, L.A.: The tail at scale. Commun. ACM 56(2), 74–80 (2013)
Farber, D.: Google’s marissa mayer: speed wins. ZDNet Between the Lines. http://www.zdnet.com/article/googles-marissa-mayer-speed-wins/#! (2006)
Su, X., Khoshgoftaar, T.M.: A survey of collaborative filtering techniques. Adv. Artif. Intell. 2009, 4 (2009)
Chippa, V.K., Chakradhar, S.T., Roy, K. and Raghunathan, A.: Analysis and characterization of inherent application resilience for approximate computing. In: DAC’13, pp. 113. ACM (2013)
Agarwal, S., Mozafari, B., Panda, A., Milner, H., Madden, S. and Stoica, I.: Blinkdb: queries with bounded errors and bounded response times on very large data. In: EuroSys’13, pp. 29–42. ACM (2013)
Kranen, P., Assent, I., Baldauf, C. and Seidl, T.: Self-adaptive anytime stream clustering. In: ICDM’09, pp. 249–258. IEEE, Piscataway, NJ (2009)
Wu, G., Chang, E., Chen, Y.K. and Hughes, C.: Incremental approximate matrix factorization for speeding up support vector machines. In: SIGKDD’06, pp. 760–766. ACM, New York, NY (2006)
Yang, Y., Webb, G., Korb, K., Ting, K.M.: Classifying under computational resource constraints: anytime classification using probabilistic estimators. Mach. Learn. 69(1), 35–53 (2007)
Zilberstein, S.: Using anytime algorithms in intelligent systems. AI Mag. 17(3), 73 (1996)
Kirsch, C.M. and Payer, H.: Incorrect systems: it’s not the problem, it’s the solution. In: DAC’12, pp. 913–917. ACM, New York, NY (2012)
Spark. http://spark.apache.org/
Apache mesos. http://mesos.apache.org/
Opennebula. http://opennebula.org/
Berry, M.W., Dumais, S.T., O’Brien, G.W.: Using linear algebra for intelligent information retrieval. SIAM Rev. 37(4), 573–595 (1995)
Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for recommender systems. Computer 42(8), 30–37 (2009)
Gorrell, G.: Generalized hebbian algorithm for incremental singular value decomposition in natural language processing. In: EACL’06, Citeseer pp. 97–104. Association for Computational Linguistics (ACL), New Brunswick, NJ (2006)
Guttman, A.: R-trees: a dynamic index structure for spatial searching, In: SIGMOD ’84, pp. 47–57. ACM, New York, NY (1984)
Amazon movie reviews dataset. http://snap.stanford.edu/data/web-Movies.html
Pearson, K.: Notes on the history of correlation. Biometrika 13(1), 25–45 (1920)
Apache storm. http://storm.apache.org/
Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., Neugebauer, R., Pratt, I., Warfield, A.: Xen and the art of virtualization. ACM SIGOPS Oper. Syst. Rev. 37(5), 164–177 (2003)
Candillier, L., Meyer, F. and Boullé M.: Comparing state-of-the-art collaborative filtering systems. In: Petra Perner (ed.) Machine Learning and Data Mining in Pattern Recognition, pp. 548–562. Springer, Berlin Heidelberg (2007)
Bacigalupo, D.A., Van Hemert, J., Chen, X., Usmani, A., Chester, A.P., He, L., Dillenberger, D.N., Wills, G.B., Gilbert, L., Jarvis, S.A.: Managing dynamic enterprise and urgent workloads on clouds using layered queuing and historical performance models. Simul. Model. Pract. Theory 19(6), 1479–1495 (2011)
Movielens 1 million dataset. http://grouplens.org/datasets/movielens/
Netflix dataset. http://www.netflixprize.com/
The million song dataset. http://labrosa.ee.columbia.edu/millionsong/
Han, R., Ghanem, M.M., Guo, L., Guo, Y., Osmond, M.: Enabling cost-aware and adaptive elasticity of multi-tier cloud applications. Future Gener. Comput. Syst. 32, 82–98 (2014)
Han, R., Guo, L., Ghanem, M.M. and Guo, Y.: Lightweight resource scaling for cloud applications. In: CCGrid’12, pp. 644–651. IEEE (2012)
Liu, S., Ren, S., Quan, G., Zhao, M. and Ren, S.: Profit aware load balancing for distributed cloud data centers. In: IPDPS’13, pp. 611–622. IEEE (2013)
Schwiegelshohn, U. and Tchernykh, A.: Online scheduling for cloud computing and different service levels. In: IPDPSW’12, pp. 1067–1074. IEEE (2012)
Luis, J.V.-P., Moreno-Vozmediano, R., Llorente, I.M.: Admission Control in the Cloud: Algorithms for SLA-Based Service Model. In Handbook of Research on Architectural Trends in Service-Driven Computing. IGI Global, Hershey (2014)
Han, R., Ghanem, M. and Guo, Y.: Towards elastic algorithms as a new model of computation for the cloud. Int. J. Next Gener. Comput. 4(2), 82–98. (2013)
He, Y., Elnikety, S. and Sun, H.: Tians scheduling: using partial processing in best-effort applications. In: ICDCS’11, pp. 434–445. IEEE, Piscataway, NJ (2011)
Kranen, P., Seidl, T.: Harnessing the strengths of anytime algorithms for constant data streams. Data Mining Knowl. Discov. 19(2), 245–260 (2009)
Chakrapani, L.N., Muntimadugu, K.K., Lingamneni, A., George, J. and Palem, K.V.: Highly energy and performance efficient embedded computing through approximately correct arithmetic: a mathematical foundation and preliminary experimental validation. In: CASES’08, pp. 187–196. ACM (2008)
Trancoso, P.: Getting ready for approximate computing: trading parallelism for accuracy for dss workloads. In: CF’14, p. 8. ACM, New York, NY (2014)
Gaber, M.M. and Yu P.S.: A framework for resource-aware knowledge discovery in data streams: a holistic approach with its application to clustering. In: SAC’06, pp. 649–656. ACM (2006)
Poladian, V., Sousa, J.P., Garlan, D. and Shaw, M.: Dynamic configuration of resource-aware services. In: ICSE’04, pp. 604–613. IEEE, Piscataway, NJ (2004)
Han, R., Zhan, S., Shao, C., Wang, J., Xu, J., John, L.K., Gang, L. and Wang, L.: Bigdatabench-mt: a benchmark tool for generating realistic mixed data center workloads. In: SoCC’15, vol. 9495, pp. 7–18. Springer, Berlin, Heidelberg, Germany (2015)
Reiss, C., Tumanov, A., Ganger, G.R., Katz, R.H. and Kozuch, M.A.: Heterogeneity and dynamicity of clouds at scale: google trace analysis. In: SoCC’12, p. 7. ACM, New York, NY (2012)
Han, R., Wang, J., Huang, S., Shao, C., Zhan, S., Zhan, J. and Vazquez-Poletti, J.L.: Pcs: Predictive component-level scheduling for reducing tail latency in cloud online services. In: ICPP’15, pp. 490–499. IEEE, Piscataway, NJ (2015)
Han, R., Wang, J., Ge, F., Vazquez-Poletti, J.L. and Zhan, J.: Sarp: producing approximate results with small correctness losses for cloud interactive services. In: CF’15, p. 22. ACM, New York, NY (2015)
Acknowledgments
We sincerely thank Professor Yike Guo, Professor Moustafa M. Ghanem and Dr. Li Guo for their useful comments, and the anonymous reviewers for their feedback on earlier versions of this manuscript. This paper is a significantly extended version of Ref. [42] published in Computing Frontiers 2015. This work is partly supported by National Natural Science Foundation of China under Grant No. 61502451, the Key Project of of Guangdong Province, China under Grant No. 2015B010108006, the Major Program of National Natural Science Foundation of China under Grant No. 61432006, and the Ministerio de Economia y Competitividad from Spain under Grant No. TIN2015-65469-P.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Han, R., Zhan, J. & Luis, J.VP. SARP: Synopsis-Based Approximate Request Processing for Low Latency and Small Correctness Loss in Cloud Online Services. Int J Parallel Prog 44, 1054–1077 (2016). https://doi.org/10.1007/s10766-016-0406-9
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10766-016-0406-9