Abstract
This paper proposes an approach to representing and querying semistructured Web data. The proposed approach is based on nested tables, which may have internal nested structural variations to accommodate semistructured data. Our motivation is to reduce the complexity found in typical query languages for semistructured data and to provide users with an alternative for quickly querying data obtained from multiple-record Web pages. We show the feasibility of our proposal by developing a prototype for a graphical query interface called QSByE (Querying Semistructured data By Example). For QSByE, we define a particular variation of nested tables and propose a set of QBE-like operations that extends typical nested-relational-algebra operations to handle semistructured data. We show examples of how users can pose interesting queries using QSByE.
This work was partially supported by Project SIAM (MCT/CNPq/PRONEX grant number 76.97.1016.00) and by CNPq (grant number 467775/00-1). The first and second authors are supported by scholarships from CAPES. The fourth author is supported by NSF (grant number IIS-0083127).
On leave from the University of Amazonas, Brazil.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
da Silva, A.S., Evangelista Filha, I.M.R., Laender, A.H.F., Embley, D.W. (2002). Representing and Querying Semistructured Web Data Using Nested Tables with Structural Variants. In: Spaccapietra, S., March, S.T., Kambayashi, Y. (eds) Conceptual Modeling — ER 2002. ER 2002. Lecture Notes in Computer Science, vol 2503. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45816-6_20
DOI: https://doi.org/10.1007/3-540-45816-6_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44277-6
Online ISBN: 978-3-540-45816-6
eBook Packages: Springer Book Archive