Abstract

We revisit the problem of finding contained rewritings (CRs) for XPath queries using XPath views. Contained rewritings were proposed for data integration scenarios, where views are unlikely to be complete because the data sources have limited coverage, so equivalent rewritings often cannot be found. In such cases we usually need a maximal contained rewriting (MCR) of the query to obtain the best possible answers. An MCR is a set of CRs and may contain redundant CRs, and evaluating redundant CRs on materialized views is unnecessary. In this paper, we first address how to find the irredundant maximal contained rewriting (IMCR), i.e. the set of all irredundant CRs. We show that the existing approach overlooks one type of situation and is therefore insufficient. As a result, the only safe solution is a brute-force pairwise containment check over all the CRs, and we propose heuristics to speed up these comparisons. Given a materialized view, we then show how to evaluate the IMCR on it. To our knowledge, this is the first work to optimize the evaluation of a set of CRs on a materialized view by exploiting the inherent structural characteristics of the CRs. Our experiments show the effectiveness and efficiency of our algorithms.
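
As a rough illustration of the brute-force pairwise containment check mentioned above, the following Python sketch removes every CR that is contained in another CR. It is not the paper's algorithm and does not include its heuristics. The function name `irredundant` and the predicate `contained_in` are hypothetical; a real `contained_in` would have to decide XPath query containment, which is replaced here by a toy subset test on answer sets.

```python
from typing import Callable, List, TypeVar

Q = TypeVar("Q")


def irredundant(crs: List[Q], contained_in: Callable[[Q, Q], bool]) -> List[Q]:
    """Keep only CRs that are not contained in another CR (O(n^2) checks).

    `contained_in(a, b)` is an assumed, caller-supplied predicate that returns
    True when rewriting `a` is contained in rewriting `b`. It must be
    transitive, as query containment is.
    """
    kept: List[Q] = []
    for cand in crs:
        # cand is redundant if a CR we already keep subsumes it
        # (this also drops equivalent duplicates).
        if any(contained_in(cand, k) for k in kept):
            continue
        # cand survives, so evict previously kept CRs that it subsumes.
        kept = [k for k in kept if not contained_in(k, cand)]
        kept.append(cand)
    return kept


# Toy stand-in (assumption): each CR is modelled by a set of answers, and
# containment is plain subset inclusion. Real CRs are XPath tree patterns.
toy_crs = [frozenset({1, 2}), frozenset({1}), frozenset({3}), frozenset({1, 2})]
result = irredundant(toy_crs, lambda a, b: a <= b)
```

In the toy run, `{1}` and the duplicate `{1, 2}` are dropped because each is contained in a CR that is kept, leaving two irredundant entries.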
