6 Integration with a Query Optimizer
The usability criteria and query translation algorithms of the
previous section permit the selection of views that can cor-
rectly be used to answer a query. However, they do not ad-
dress the issue of optimizing queries over a set of views. For
SPJ queries and views, this issue has been addressed [8, 37]
by extending the common dynamic programming style op-
timizer [34] to consider plans using views. Two primary
extensions are necessary. First, in addition to access methods
on base relations, the optimizer must consider the use
of views to build a query plan. These views may in turn
describe index structures. Second, it must be possible to
quickly determine what portion of a query a view answers
and whether a plan built from views and other access meth-
ods is a partial or complete answer to the query.
A conventional optimizer begins by finding execution
plans for single relations, then iteratively finds execution
plans for successively larger portions of a query. In the pres-
ence of materialized views or indices described by views, the
initial set of access plans is extended to include these views.
Chaudhuri et al. [8] show how, for SPJ views, a simple data
structure can be used to record the portion of the query an-
swered by the view. This is possible because an SPJ view
will replace (or answer) a set of tables and predicates in the
query and possibly add a new set of predicates (Conds’ from
the translation algorithm). Using only these sets (tables and
predicates answered by the view and new predicates added
as a result of the view), the optimizer can determine whether
a plan using a set of views is partial or complete. For par-
tial plans, the set of tables and predicates that remain to be
addressed can be computed.
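
The bookkeeping described above can be illustrated with a minimal sketch. It is not the data structure of [8]; the class name Coverage, the helper functions, and the example query, view, and predicate strings are all hypothetical. The sketch assumes that each plan records the query tables it answers, the predicates it answers or applies, and any new predicates a view introduces (Conds'), and that an introduced predicate counts as outstanding until some plan applies it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Coverage:
    """What one plan contributes to a query (hypothetical structure)."""
    tables: frozenset               # query tables answered by the plan
    preds: frozenset                # predicates the plan answers or applies
    added: frozenset = frozenset()  # new predicates a view introduces (Conds')

def combine(a, b):
    """Coverage of a plan built from two sub-plans."""
    return Coverage(a.tables | b.tables, a.preds | b.preds, a.added | b.added)

def remaining(query_tables, query_preds, cov):
    """Tables and predicates that a partial plan has yet to address."""
    rest_tables = query_tables - cov.tables
    # Assumption: predicates added by a view must still be applied by some plan.
    rest_preds = (query_preds | cov.added) - cov.preds
    return rest_tables, rest_preds

def is_complete(query_tables, query_preds, cov):
    """A plan is complete when nothing remains to be addressed."""
    rest_tables, rest_preds = remaining(query_tables, query_preds, cov)
    return not rest_tables and not rest_preds

# Hypothetical query over R, S, T and a view V that answers the R-S join.
Q_TABLES = frozenset({"R", "S", "T"})
Q_PREDS = frozenset({"R.a = S.a", "S.b = T.b", "R.c > 5"})
view_v = Coverage(frozenset({"R", "S"}),
                  frozenset({"R.a = S.a", "R.c > 5"}),
                  frozenset({"V.c > 5"}))
scan_t = Coverage(frozenset({"T"}), frozenset())
# A join-and-filter step that applies the two outstanding predicates.
finish = Coverage(frozenset(), frozenset({"S.b = T.b", "V.c > 5"}))

partial = combine(view_v, scan_t)   # all tables covered, predicates pending
full = combine(partial, finish)     # nothing left to address
```

In this example the plan built from V and a scan of T is still partial, because the join predicate on T and the predicate introduced by V have not been applied; adding the final step makes the plan complete. Only set unions and differences are needed, which is why the test is cheap enough to run for every candidate plan.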
We have shown that for dynamic SPJ views, the portion
of a query answered by a view can also be modeled in the
same way. That is, the query translation process involves
determining a set of tables and predicates from the query
that can be replaced by the view. A query optimizer can
therefore treat dynamic views as primitive access plans,
regardless of whether the view is implemented by an external
data source or by an internal index. The higher-order properties
of the view must be analyzed to determine whether the view
is usable, but the optimizer itself requires no additional analysis
of this information. In particular, only the index or
external source needs to be able to execute the (possibly
higher-order) plan.
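
As a rough illustration of treating views as primitive access plans, the following sketch seeds a bottom-up enumerator with both base-relation scans and views, then builds plans for successively larger sets of tables. It is a simplification under stated assumptions, not the optimizer of [8] or [34]: the function best_plans, the plan names, the costs, and the flat per-join cost are hypothetical, and predicates are ignored so that only table coverage drives the search.

```python
from itertools import combinations

def best_plans(query_tables, access_plans, join_cost=1.0):
    """Cheapest plan per covered table set.

    access_plans: iterable of (name, tables_covered, cost). A view is just
    another entry here, possibly covering several tables at once.
    """
    best = {}
    # Initial access plans: base scans and usable views alike.
    for name, tables, cost in access_plans:
        key = frozenset(tables)
        if key <= query_tables and (key not in best or cost < best[key][1]):
            best[key] = (name, cost)
    # Build plans for successively larger portions of the query.
    for size in range(2, len(query_tables) + 1):
        for ta, tb in combinations(list(best), 2):
            if ta & tb or len(ta | tb) != size:
                continue  # overlapping inputs, or not the size being built
            key = ta | tb
            cost = best[ta][1] + best[tb][1] + join_cost  # hypothetical cost model
            if key not in best or cost < best[key][1]:
                best[key] = ((best[ta][0], best[tb][0]), cost)
    return best

# Hypothetical inputs: three base scans and one view answering the R-S join.
QUERY = frozenset({"R", "S", "T"})
PLANS = [("scan_R", {"R"}, 10.0), ("scan_S", {"S"}, 10.0),
         ("scan_T", {"T"}, 10.0), ("V_RS", {"R", "S"}, 5.0)]
plan, cost = best_plans(QUERY, PLANS)[QUERY]
```

With these made-up costs the cheapest complete plan joins the view with a scan of T. The enumerator never inspects how V_RS is implemented or whether it is higher order; it only needs the set of tables the view answers and a cost estimate.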
Note that this form of integration with a query optimizer
permits correct use of dynamic views while requiring mini-
mal extensions to the optimizer.
7 Conclusions
The existence of schematic heterogeneity in legacy systems
is well documented in the research literature. Many of the
more than thirty representations for a single data fact
enumerated by Kent result from some type of schematic
heterogeneity [19]. Despite its prevalence, and despite the plethora
of work on enumerating and categorizing types of schematic
heterogeneity, no systematic study of how these schematically
heterogeneous structures can be used and queried in practice
has been undertaken. We have provided a first step
in such a study. Specifically, we have analyzed the prop-
erties of restructuring transformations required to resolve
schematic discrepancies. Using this analysis, we have shown
how higher-order views can be used in answering queries on
schematically heterogeneous structures. We have shown
that our solution is powerful enough to meet the needs of
numerous applications drawn from the realms of data ware-
housing, decision support, database publishing and physical
data independence. The solutions allow access methods for
semi-structured and unstructured data to be incorporated
into the framework of structured query evaluation and op-
timization.
References
[1] S. Abiteboul, H. Garcia-Molina, Y. Papakonstantinou,
and R. Yerneni. Fusion Queries over Internet
Databases. Technical Report unpublished manuscript,
Stanford University, 1997.
[2] R. Ahmed, P. DeSmedt, W. Du, W. Kent, M. A.
Ketabchi, W. A. Litwin, A. Rafii, and M. C. Shan. The
Pegasus Heterogeneous Multidatabase System. IEEE
Computer, 24(12):19-27, December 1991.
[3] Y. Arens, C. Y. Chee, C. N. Hsu, and C. A. Knoblock.
Retrieving and Integrating Data from Multiple Information
Sources. Intl. J. of Intelligent and Cooperative
Info. Systems, 2(2):127-158, 1993.
[4] T. Barsalou and D. Gangopadhyay. M(DM): An Open
Framework for Interoperation of Multimodel Multidatabase
Systems. In Proc. of the Int’l Conf. on Data
Eng., pages 218-227, Tempe, AZ, February 1992.
[5] C. Batini, M. Lenzerini, and S. B. Navathe. A Comparative
Analysis of Methodologies for Database Schema
Integration. ACM Computing Surveys, 18(4):323-364,
December 1986.
[6] M. L. Brodie and M. Stonebraker. Migrating Legacy
Systems: Gateways, Interfaces, and the Incremental
Approach. Morgan Kaufmann Series in Data Mngmt.
Sys., Jim Gray, Ed. Morgan Kaufmann, 1995.
[7] M. J. Carey, L. M. Haas, P. M. Schwarz, M. Arya,
W. F. Cody, R. Fagin, M. Flickner, A. W. Luniewski,
W. Niblack, D. Petkovic, J. Thomas, J. H. Williams,
and E. L. Wimmers. Towards Heterogeneous Multimedia
Information Systems: The Garlic Approach. In
Proc. of the Fifth Int’l IEEE Wksp. on Research Issues
in Data Eng. (RIDE-95): Distributed Object Mngmt.,
Taipei, Taiwan, March 1995.
[8] S. Chaudhuri, R. Krishnamurthy, S. Potamianos, and
K. Shim. Optimizing Queries with Materialized Views.
In Proc. of the Int’l Conf. on Data Eng., pages 190-200.
IEEE, 1995.
[9] S. Chaudhuri and M. Y. Vardi. Optimization of Real
Conjunctive Queries. In Proc. of the ACM Symp. on
Principles of Database Systems (PODS), 1993.
[10] S. Chawathe, H. Garcia-Molina, J. Hammer, K. Ireland,
Y. Papakonstantinou, J. Ullman, and J. Widom.
The TSIMMIS Project: Integration of Heterogeneous
Information Sources. In Proc. of the 100th Anniversary
Meeting of the Information Processing Society of
Japan (IPSJ), pages 7-18, Tokyo, Japan, October 1994.
[11] W. Chen, M. Kifer, and D. S. Warren. HiLog as a
Platform for Database Languages. In Int’l Workshop
on Database Programming Languages, pages 315-329,
Gleneden Beach, OR, June 1989.