Abstract
Query optimizers in object-relational database management systems require users to provide execution cost models for user-defined functions (UDFs). Despite this need, little work has been done on providing such models. Furthermore, none of the existing approaches is self-tuning, so none can adapt to changing UDF execution patterns. This paper addresses the problem by introducing a self-tuning cost modeling approach based on the quadtree. The quadtree inherently (1) supports fast retrievals, (2) allows fast incremental updates (without storing individual data points), and (3) stores information at different resolutions. We take advantage of these properties and add two abilities that make the quadtree useful for UDF cost modeling: (1) adapting to changing UDF execution patterns and (2) operating within limited memory. To this end, we have developed a novel technique we call the memory-limited quadtree (MLQ). In MLQ, each instance of UDF execution is mapped to a query point in a multi-dimensional space. A prediction is made at the query point, and the actual value at the point is then inserted as a new data point. The quadtree stores summary information of the data points at different resolutions based on the distribution of the data points. This information is used to make predictions, guide the insertion of new data points, and guide the compression of the quadtree when the memory limit is reached. We have conducted extensive performance evaluations comparing MLQ with the existing (static) approach.
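The predict-then-insert cycle described in the abstract can be illustrated with a minimal sketch. It is not the authors' MLQ: it assumes a two-dimensional space, keeps only a count and a running sum of observed costs per node, predicts with the average stored in the deepest node that covers the query point, and omits compression under a memory limit altogether. The names `SummaryQuadtree`, `Node`, `max_depth`, and `min_count` are hypothetical and chosen only for this illustration.

```python
class Node:
    """Summary of all data points that fell inside one quadtree cell."""
    __slots__ = ("count", "total", "children")

    def __init__(self):
        self.count = 0        # number of data points summarized here
        self.total = 0.0      # sum of their observed costs
        self.children = {}    # quadrant index (0..3) -> Node


class SummaryQuadtree:
    """Illustrative 2-D quadtree storing cost summaries at several resolutions."""

    def __init__(self, lo, hi, max_depth=4, min_count=1):
        self.lo, self.hi = lo, hi      # (x, y) corners of the modeled space
        self.max_depth = max_depth     # assumed fixed depth bound
        self.min_count = min_count     # points a node needs to be trusted
        self.root = Node()

    def _path(self, point):
        """Yield the quadrant index chosen at each level for `point`."""
        (x0, y0), (x1, y1) = self.lo, self.hi
        x, y = point
        for _ in range(self.max_depth):
            mx, my = (x0 + x1) / 2, (y0 + y1) / 2
            right, upper = x >= mx, y >= my
            # Shrink the bounding box to the chosen quadrant.
            if right:
                x0 = mx
            else:
                x1 = mx
            if upper:
                y0 = my
            else:
                y1 = my
            yield (1 if right else 0) | (2 if upper else 0)

    def predict(self, point):
        """Average cost in the deepest trusted node covering `point`."""
        best = node = self.root
        for q in self._path(point):
            node = node.children.get(q)
            if node is None or node.count < self.min_count:
                break
            best = node
        return best.total / best.count if best.count else None

    def insert(self, point, cost):
        """Fold one observation into every node on the path to its cell."""
        node = self.root
        node.count += 1
        node.total += cost
        for q in self._path(point):
            node = node.children.setdefault(q, Node())
            node.count += 1
            node.total += cost


if __name__ == "__main__":
    tree = SummaryQuadtree((0.0, 0.0), (1.0, 1.0), max_depth=3)
    # Each (point, cost) pair stands in for one UDF execution.
    for point, actual_cost in [((0.1, 0.1), 10.0), ((0.9, 0.9), 50.0)]:
        estimate = tree.predict(point)   # prediction made before execution
        tree.insert(point, actual_cost)  # observed cost fed back afterwards
        print(point, estimate, actual_cost)
```

Because each node holds only aggregates, an update touches one node per level and no individual data points are kept. A query in a region without observations falls back to a coarser ancestor, which is one simple way to exploit the multiple resolutions.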
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
He, Z., Lee, B.S., Snapp, R.R. (2004). Self-tuning UDF Cost Modeling Using the Memory-Limited Quadtree. In: Bertino, E., et al. Advances in Database Technology - EDBT 2004. EDBT 2004. Lecture Notes in Computer Science, vol 2992. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24741-8_30
DOI: https://doi.org/10.1007/978-3-540-24741-8_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21200-3
Online ISBN: 978-3-540-24741-8
eBook Packages: Springer Book Archive