Designing and querying XML views based on the ORA-SS data model


DESIGNING AND QUERYING XML VIEWS BASED ON THE ORA-SS DATA MODEL

CHEN YA BING
(Master of Engineering, Tianjin University, China)

A DISSERTATION SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
DEPARTMENT OF COMPUTER SCIENCE
NATIONAL UNIVERSITY OF SINGAPORE
2005

Acknowledgements

The research presented in this thesis was carried out at the Department of Computer Science, National University of Singapore. Many people have helped me not to get lost during the development of this thesis.

Prof. Ling Tok Wang, my main supervisor, has provided a motivating, enthusiastic and critical atmosphere during the many discussions we had. He has patiently guided and advised me throughout the various phases of the research. He has also impressed upon me the importance of critical thinking as a researcher. It was a great pleasure for me to conduct the research under his supervision.

Dr. Lee Mong Li, my second supervisor, has provided constructive and inspiring discussions which have many times clarified my ideas. She has also improved both my technical writing and presentation skills.

I am very grateful to both of them for their encouragement and support. I also wish to express my gratitude to Ms. Cheng Qiong for many valuable discussions on the research in the thesis. Finally, I would like to thank Mr. He Qi and Mr. Fa Yuan for their useful comments during the course of my work.

Summary

XML is emerging as the standard format for data exchange over the Internet. As the amount of XML data increases dramatically, XML views are generally presented on top of source data to enable data exchange. In this thesis, we develop a systematic approach to design valid XML views, and devise two methods to automatically generate query expressions for XML views. These techniques are introduced below:

• Design valid XML views: Existing systems for XML views support only the select operation in views and do not guarantee that the designed views are semantically valid. We propose a novel method to design valid yet flexible XML views based on the semantically rich Object-Relationship-Attribute model designed for SemiStructured data (ORA-SS), which can express semantics that cannot be expressed in other data models such as XML DTD or XML Schema. We identify four main view operators for creating XML views, namely, the select, drop, join and swap operators. For each operator, we develop a set of rules to guide the design of valid XML views. These rules guarantee that the designed view remains valid once a view operator is applied.

• Generate XQuery view definitions: After designing valid XML views based on the ORA-SS data model with our view operators, we need to generate query expressions for the valid XML views. If the XML data are stored in a native XML database or as XML documents, we develop an algorithm to automatically generate XQuery expressions for the views so that XQuery can be directly executed against XML documents. Further, in cases where a view only involves the select operator and does not change the structure of the source schema, the algorithm generates the XQuery expression for the view in a more efficient way.

• Generate SQLX view definitions: XML source data are not only stored in native form, but are also increasingly being stored in object-relational databases. Thus, we also develop a method to automatically generate SQLX query expressions for the views. SQLX is the standard extension to SQL for retrieving XML data from traditional databases. By executing SQLX view definitions against the databases, we can directly produce XML view results. The algorithm can efficiently generate the SQLX view definition for an arbitrary ORA-SS view designed with our view operators.

Based on the proposed approach, we develop a CASE tool for users to design valid XML views, generate query expressions for the views and execute the query expressions to produce the view documents.
To the best of our knowledge, our work is the first to employ a semantic data model for the design and query of XML views. In summary, using a conceptual model for designing and querying XML views not only validates XML views, but also provides a fast and user-friendly approach to retrieve XML data.

Table of Contents

Acknowledgements
Summary
Table of Contents
Table of Figures
1. Introduction
1.1. Background
1.1.1. eXtensible Markup Language (XML)
1.1.2. XML Technologies
1.1.3. XML Data Management
1.2. Problem Statement & Motivation
1.3. Research Contributions
1.4. Thesis Overview
2. Data Models for XML Data
2.1. XML DTD
2.2. XML Schema
2.2.1. Simple types in XML Schema
2.2.2. Complex types in XML Schema
2.3. OEM Data Model
2.4. ORA-SS Data Model
2.5. Summary
3. Designing Valid XML Views
3.1. Motivation
3.2. Pre-Processing Steps
3.2.1. Extract ORA-SS Source Schema from XML Documents
3.2.2. Enrich ORA-SS Source Schema with Semantics
3.3. View Design Rules
3.3.1. Select Operator
3.3.2. Drop Operator
3.3.3. Join Operator
3.3.4. Swap Operator
3.3.5. Aggregate and Order by Operators
3.3.6. Design Rules for Participation Constraints in Relationship
3.3.7. Design Rules for IDentifier Dependency Relationship
3.4. View Validation Algorithm
3.5. Summary
4. Generating XQuery View Definitions
4.1. XQuery Syntax
4.2. Motivating Example
4.3. Rules for Generating XQuery View Definitions
4.3.1. Main Idea
4.3.2. Analyzing Vpath
4.3.3. Rules for Generating Condition Constraints of an Object Class
4.3.4. Rules for Generating Attributes Attached to an Object Class
4.4. Improvements
4.4.1. Reducing redundant condition constraints
4.4.2. Views involving only selection operators
4.5. Illustrating Example
4.6. XQuery View Definitions Generation Algorithm
4.7. Algorithm Analysis
4.8. Summary
5. Generating SQLX View Definitions
5.1. The O-R Database Storage for XML based on ORA-SS
5.2. SQLX Syntax
5.3. Motivating Example
5.4. Rules for Generating SQLX View Definitions
5.4.1. Main Idea
5.4.2. DRTs in ORA-SS Views
5.4.3. Generation Rules
5.5. Illustrating Example
5.6. SQLX View Definitions Generation Algorithm
5.7. Algorithm Analysis
5.8. Summary
6. CASE Tool
6.1. Function: Designing valid XML views
6.1.1. Load ORA-SS source schema
6.1.2. Design views based on source schema
6.2. Function: Generating SQLX View Definitions
6.3. Function: Producing an XML View Document
7. Related Work
7.1. Emergence of XML Data Management
7.2. View Mechanism in RDB & OODB
7.3. XML Views on Relational Data
7.4. XML Views on XML Data
7.5. XML Views on Integration Systems
7.6. Summary
8. Conclusions
8.1. Summary of Thesis Work
8.2. Future Research Directions
Bibliography

Table of Figures

Figure 1.1 An XML document on courses and students
Figure 1.2 Architecture of designing and querying XML views based on ORA-SS
Figure 2.1 An XML document on students and courses
Figure 2.2 The XML DTD for the XML document in Figure 2.1
Figure 2.3 The simple type definition for age with restriction
Figure 2.4 The complex type definition for employee
Figure 2.5 An XML schema for the XML document in Figure 2.1
Figure 2.6(a) The OEM model for the XML document in Figure 2.1
Figure 2.6(b) The Dataguide for the XML document in Figure 2.1
Figure 2.7 The ORA-SS schema for the XML document in Figure 2.1
Table 2.1 Comparison of XML DTD, XML Schema, OEM/Dataguide & ORA-SS
Figure 3.1 An XML document on project, supplier and part
Figure 3.2 The ORA-SS source schema of the XML document in Figure 3.1
Figure 3.3 The XML DTD of the XML document in Figure 3.1
Figure 3.4 Invalid XML view
Figure 3.5 Valid XML view
Figure 3.6 The XML view applied with a selection operator on Figure 3.2
Figure 3.7 The XML view dropping supplier in Figure 3.2
Figure 3.8 An ORA-SS source schema
Figure 3.9 The invalid view schema by dropping supplier
Figure 3.10 The valid view schema by dropping supplier
Figure 3.11 An ORA-SS source schema
Figure 3.12 The invalid view schema
Figure 3.13 The valid view schema
Figure 3.14 An ORA-SS schema diagram
Figure 3.15 The ORA-SS view schema by joining supplier’ and supplier
Figure 3.16 An ORA-SS source schema
Figure 3.17 The invalid view schema by joining supplier’ and supplier
Figure 3.18 The valid view schema by joining supplier’ and supplier
Figure 3.19 An ORA-SS source schema
Figure 3.20 The ORA-SS view schema swapping supplier and part in Figure 3.19
Figure 3.21 Rel_Set_1(Oi, Oj, S) in an ORA-SS source schema S
Figure 3.22 Rel_Set_2(Oi, Oj, S) & Rel_Set_4(Oi, Oj, S) in an ORA-SS source schema S
Figure 3.23 The ORA-SS source schema for swapping Oi and Oj
Figure 3.24 The ORA-SS view schema for swapping Oi and Oj
Figure 3.25 An ORA-SS source schema for illustrating reversible issue
Figure 3.26 The invalid ORA-SS view schema swapping course and student in Figure 3.27
Figure 3.27 The valid ORA-SS view schema swapping course and student in Figure 3.27
Figure 3.28 The valid ORA-SS view schema swapping course and student again in Figure 3.29
Figure 3.29 The ORA-SS view schema by applying aggregate operator
Figure 3.30 The ORA-SS view schema by applying order by operator
Figure 3.31 Change of participation constraint due to a swap operator
Figure 3.32 Functional Dependency Diagram
Figure 3.33 Change of participation constraint due to a projection operation
Figure 3.34 An ORA-SS source schema of an IDD relationship type
Figure 3.35 An ORA-SS view schema of swapping employee and child
Figure 3.36 An ORA-SS view schema of dropping employee
Figure 4.1 A sample XML document named book.xml
Figure 4.2 An XQuery issued on the document book.xml
Figure 4.3 The result of the XQuery in Figure 4.2
Figure 4.4 A source XML file
Figure 4.5 The ORA-SS source schema
Figure 4.6 The ORA-SS view schema
Figure 4.7 The instance diagram for the source in Figure 4.2
Figure 4.8 The instance diagram for the view in Figure 4.3
Figure 4.9 The view definition in XQuery expression for view in Figure 4.6
Figure 4.10 The XML instance for the view in Figure 4.6
Figure 4.11(a) Two simplified ORA-SS source schema
Figure 4.11(b) One simplified ORA-SS view schema
Figure 4.12(a) The case for Rule Type I_A
Figure 4.12(b) Condition constraints generated in Rule Type I_A
Figure 4.13(a) The case for Rule Type I_B
Figure 4.13(b) Condition constraints generated in Rule Type I_B
Figure 4.14(a) The case for Rule Type II_A
Figure 4.14(b) The case for Rule Type II_A
Figure 4.15(a) The case for Rule Type II_B
Figure 4.15(b) Condition constraints generated in Rule Type II_B
Figure 4.16 The case for Rule Type III_A
Figure 4.17(a) The case for Rule Type III_B
Figure 4.17(b) Where condition generated in Rule Type III_B
Figure 4.18(a) The case for Rule Type III_C
Figure 4.18(b) Where condition generated in Rule Type III_C
Figure 4.19(a) The case for Rule Type III_D
Figure 4.19(b) Where condition generated in Rule Type III_D
Figure 4.20 The generated clause for Rule Attribute_1
Figure 4.21 The generated clause for Rule Attribute_2
Figure 4.22 The generated clause for Rule Attribute_3
Figure 4.23 The generated clause for Rule Attribute_4
Figure 4.24 The generated clause for Rule Attribute_5
Figure 4.25 The generated clause for Rule Attribute_6
Figure 4.26 An ORA-SS view schema diagram
Figure 4.27 An ORA-SS view schema diagram applying a selection operator in Figure 4.26
Figure 4.28 The XQuery expression for the view in Figure 4.27
Figure 4.29 An ORA-SS source schema
Figure 4.30 The ORA-SS view schema based on Figure 4.28

Chapter 7. Related Work

7.2. View Mechanism in RDB & OODB

The notion of views is essential in relational databases [81] [82] [83] and has been extensively explored in the context of relational database systems [81] [82] [45]. Views increase the flexibility of a database system by allowing users or applications to see data from different viewpoints [8] [26] [66]. The view mechanism in relational databases can be implemented by using a straightforward modification technique [76]. Similarly, the topic of views has also been examined in the object-oriented database context [4] [13] [36] [68] [71] [74] [80]. A view mechanism in the OODB context is more complicated than its analogue in relational databases, as the view mechanism in the OODB not only restructures data but also integrates operations on data [70] [69]. As semi-structured data emerges on the Web, a view mechanism for semi-structured databases has also been proposed in [7], which introduces many new problems because of the nature of semi-structured data.
Furthermore, the issues related to views for semi-structured data, such as materialized view maintenance, have also been discussed in [77] [86].

As more XML data has appeared on the Web and the development of XML data management systems has grown, the view mechanism for XML data has also been examined. The view mechanism for XML data management is even more crucial than its analogue in relational databases, as it can be used to integrate heterogeneous sources and add a structured interface on top of some otherwise semi-structured data [2]. There has been a lot of work on XML views of relational data or XML data. We discuss the work on XML views of relational data in section 7.3. In section 7.4, we then discuss XML views on top of XML data.

7.3. XML Views on Relational Data

As XML becomes the standard for data exchange on the web, existing relational data are published as XML views to exploit the potential of XML. There has been a lot of work on this scenario, and [89] describes most of the recent work in detail. SilkRoute [30] [31] adopts two declarative languages, RXL and XML-QL, to define and query views over relational data respectively. XPERANTO [18] [19] [72] [73] uses a canonical mapping to create a default XML view from relational data, and other views can be defined on top of the default view. XQuery views [40] are also supported over the XML views in XPERANTO. Instead of adopting XQuery, ROLEX [14] [47] composes XSLT stylesheets with defined XML views to produce a new XML view definition. On the other hand, [43] presents an algorithm to translate XSLT scripts over XML views into efficient SQL queries. Given all this work on translating XML queries into SQL queries, [90] focuses on the efficiency of the SQL queries generated by the translation process. It concludes that the quality of the resulting SQL should be a concern of the translation algorithm itself, rather than being left in the hands of a traditional relational optimizer. However, it only supports path queries. Further, [98] introduces a new operator to support relation-valued variables in relational engines so that they can be enhanced for efficient XML publishing. Next, in [97], an efficient XQuery compiler is proposed in the purely relational context, which translates not only path queries, but also a core set of XQuery queries into SQL.

All the research work above has considered the issue of query translation for XML views. On the other hand, there has also been some research work focusing only on the mapping process of relational data to XML views. [88] provides a language for defining XML views that are guaranteed to be DTD-conformant, as well as a middleware for evaluating these views. As part of the MIX project [12] [63], [11] proposes a method to translate a relational schema to a view graph guided by a user. [25] directly translates a relational schema to an XML tree structure with the help of the semantically rich ORA-SS data model [24].

In addition, major commercial database systems have provided the ability to export relational data to materialized XML views. In Oracle XML DB [60], XML views are defined by using the forthcoming SQL/XML standard, which is an extension to SQL [75]. Oracle XML DB can only support XPath queries on XML views, which will be translated into an equivalent SQL query. Microsoft SQL Server 2000 [58] defines an XML view with an annotated XSD XML schema and supports XPath queries over the annotated XML Schema. IBM DB2 XML Extender [42] uses a Document Access Definition (DAD) file to define an XML view. However, it does not support any XML query languages over the XML view. In addition, IBM XML for Tables [27], which is based on the XPERANTO project [18] [19] [72] [73], provides an XML view of relational tables and lets users query those views as if they were XML documents.

Unlike the related work in this scenario, which publishes relational data as XML views, our work in this thesis focuses on presenting XML views on top of XML data.

7.4. XML Views on XML Data

In this scenario, XML views are presented on top of XML data. In particular, XML data can be stored in two main ways. One is to store XML data in traditional databases. The other is to store XML data as text files or in native XML databases. Our work in this thesis considers both cases.

Firstly, for the case where XML data are stored in traditional databases, most of the related work assumes XML data is stored in relational databases. STORED [91] stores XML data in relational databases by using data mining techniques. It also proposes an algorithm to translate an input STORED query into SQL. In [92], an edge approach and an attribute approach are proposed for storing XML data in relational databases. This technology also translates basic operations in a path expression to SQL. XRel [93] uses a path-based approach to store XML data, and a core part of XPath is identified for translating into SQL. In [94], ordered XML data are supported on top of the unordered relational data model. This work proposes algorithms for translating ordered XPath expressions into SQL. [95] stores all XML data in a single table containing a tuple for each element, attribute and text node. The approach in [95] can also support XQuery with arbitrarily nested FLWR expressions. Finally, [46] presents a generic algorithm to translate path expression queries into SQL in the presence of recursion in the schema and queries.

Unlike the related work above, our corresponding work in this thesis (chapter 5) stores XML data in an object-relational database based on an ORA-SS data model. This storage technique keeps the semantics implied in the XML data and removes unnecessary redundancies, which may exist in the other technologies. Further, we propose an algorithm to generate SQLX view definitions from ORA-SS views, which can then be executed to produce materialized XML views. This relieves users of manually writing complicated SQLX view definitions.

Secondly, there have also been several new technologies proposed for the case where XML data are stored as text files or in native XML storage. Xyleme [21] [51] defines an XML view by connecting one abstract DTD to a large collection of concrete DTDs with an extension of OQL as the query language. ActiveView [3] [6] defines views with active features, such as method calls and triggers, on Ardent Software’s XML repository using a view specification language.

Other technologies in this case discuss several sub-issues of XML views on top of XML data. The issue of DTD inference for views of XML data is examined in [64]. It extends the descriptive ability of DTD and shows that the extended DTD can always be inferred for a selection view. [44] proposes another view inference approach to automatically derive an integrated XML view on heterogeneous XML DTDs. Instead of using a query language to define views, [52] defines views through source schema and view schema mappings. In [67], the focus is on view definition of XML data at the conceptual level and the semantics required in accommodating such view mechanisms at this higher level of abstraction.

Unlike the related work in this case, our corresponding work in this thesis (chapter 4) automatically generates XQuery view definitions from ORA-SS views. The generated XQuery view definitions can then be evaluated on native XML databases or XQuery engines to produce materialized XML views. This relieves users of manually writing complicated XQuery view definitions.
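
To make the idea of generating a view definition from a view schema more concrete, the following is a minimal sketch in Python. It is not the algorithm of chapter 4, whose rules are not reproduced in this chapter; it only walks a toy tree-shaped view description and emits a nested XQuery for/where/return expression. The `ViewNode` class, the `generate_xquery` function, the file name `source.xml`, and the element names `project`, `part`, `jname`, `pname` and `budget` are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ViewNode:
    """One object class in a toy, tree-shaped view description (hypothetical)."""
    name: str                                              # element name in the source document
    attributes: List[str] = field(default_factory=list)    # child elements copied into the view
    condition: Optional[str] = None                        # selection predicate relative to the bound variable
    children: List["ViewNode"] = field(default_factory=list)


def generate_xquery(node: ViewNode, parent_var: Optional[str] = None,
                    doc: str = "source.xml", depth: int = 0) -> str:
    """Emit a nested for/where/return expression for the given view tree.

    Assumption: the view keeps the nesting order of the source, so each
    child is reached by a simple child step from its parent's variable.
    Sibling or nested nodes with the same name would clash on variable
    names; the sketch does not handle that.
    """
    pad = "  " * depth
    var = f"${node.name}"
    if parent_var is None:
        path = f'doc("{doc}")//{node.name}'
    else:
        path = f"{parent_var}/{node.name}"

    lines = [f"{pad}for {var} in {path}"]
    if node.condition:
        # A selection condition becomes a where clause on the bound variable.
        lines.append(f"{pad}where {var}/{node.condition}")
    lines.append(f"{pad}return <{node.name}>{{")

    # Content of the constructed element: copied attributes, then nested children.
    parts = [f"{pad}  {var}/{attr}" for attr in node.attributes]
    parts += [generate_xquery(child, var, doc, depth + 1) for child in node.children]
    if not parts:
        parts = [f"{pad}  ()"]  # empty sequence keeps the enclosed expression well-formed
    lines.append(",\n".join(parts))
    lines.append(f"{pad}}}</{node.name}>")
    return "\n".join(lines)


if __name__ == "__main__":
    view = ViewNode("project", ["jname"], "budget > 1000",
                    [ViewNode("part", ["pname"])])
    print(generate_xquery(view))
```

Even in this reduced form, the generator shows why automatic generation is attractive: the nesting of the query mirrors the nesting of the view schema, so the user only has to describe the view and never writes the query text by hand.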

Further, unlike the related work in the two cases above, our work does not consider the issue of translating XML queries on XML views into SQLX queries on the object-relational database or XQuery on the native XML database. However, the view operators proposed in our work can be treated as query operators to issue queries on ORA-SS views, which can then be translated into new ORA-SS view schemas. Thus, we can generate the SQLX or XQuery view definitions from the new view schema by employing the algorithm in our work. In this way, our work can be extended to support translating query operators on ORA-SS views into SQLX queries on the object-relational database.

7.5. XML Views on Integration Systems

Since XML views can be presented on top of relational data and XML data, it is natural for XML views to be presented as a middleware in integration systems. Thus, there has also been work on this scenario. The MIX system [11] [12] [63] adopts a DTD as a mediator to assist both users in query formulation and query processors; its query language is a subset of XML-QL. The XML version of YAT [20] proposes a generic algebra for XML query evaluation. It also discusses optimization techniques for XML-based integration systems. Agora [53] [54] uses a LAV (local-as-view) approach and provides an algorithm for translating XQuery FLWR expressions into SQL in the context of heterogeneous data sources. It first translates the XML query into a SQL query on a generic, virtual relational schema, and then rewrites this SQL query into a SQL query over the real relational schema. The MARS system [23] [96] supports both GAV (global-as-view) and LAV views. It exploits integrity constraints on both the relational and XML data and compiles the queries, views and constraints from XML into the relational framework.

7.6. Summary

All the related work on XML views in this chapter exploits the potential of XML by exporting data into XML views. As mentioned before, our work in this thesis belongs to the scenario in which XML views are presented on top of XML data. Unlike the related work in this scenario, our work considers semantic information when designing XML views (chapter 3). In particular, we design XML views based on a semantically rich data model (ORA-SS). By developing a set of design rules, our work guarantees the validity of XML views, while the related work cannot. In addition, most related work uses query languages to define XML views, which can make the views complicated to express. By contrast, our work proposes several simple view operators to design XML views, which are easy to use and can still be used to design flexible yet valid XML views. Finally, our work automatically generates XQuery/SQLX query expressions for the designed views, which relieves users of manually writing complicated query expressions for the views.

Chapter 8. Conclusions

8.1. Summary of Thesis Work

XML views exploit the potential of XML as the standard for exchanging data on the Internet. As views, they also secure underlying source data and provide an application-specific view. In this thesis, we examined how to design valid XML views and generate query expressions of XML views based on source XML data.

Before we presented the main work, we introduced a novel data model for semi-structured/XML data, i.e. ORA-SS. The ORA-SS data model is a semantically rich data model. It not only reflects the nested structure of XML data, but also distinguishes between object classes, relationship types and attributes. It is also possible to specify the degree of n-ary relationships and indicate if an attribute is an attribute of a relationship or an attribute of an object class.
These semantics are lacking in other existing semi-structured/XML data models including OEM, XML DTD and XML Schema. In designing XML views, these semantics are critical in ensuring that the designed views are valid, that is, consistent with the source schema in terms of semantics. We also use the semantics expressed in an ORA-SS data model to generate query expressions of XML views.

Based on the ORA-SS data model, we first presented a method of designing valid XML views. In this method, an ORA-SS source schema is first extracted from the XML source data. Then the source schema is enriched with necessary semantics with the help of user inputs. Finally, based on the enriched ORA-SS source schema, we are able to design valid XML views. We adopted four main view operators to design the views. They are the selection, drop, join and swap operators. For each operator, we presented a set of rules to guide the design of valid XML views with the operator. All the rules guarantee that the designed views are valid in terms of the semantics in the source schema.

Given the designed ORA-SS views, we provided a way to generate query expressions for the views. Since the two main storage structures for XML data are native storage (XML documents) and object-relational storage, we proposed two algorithms to generate different query expressions for XML views based on the two different storage structures.

In the first case, XML data are stored in XML documents. We generate XQuery view definitions for the XML views. XQuery is the standard XML query language from W3C. However, it is difficult for users to manually write the XQuery view definition for XML views. Thus, we proposed an algorithm to automatically generate XQuery view definitions for the designed XML views, which removes the need for users to manually write the view definitions. Furthermore, we developed an improved version of the algorithm by utilizing the semantics of ORA-SS views, such as relationship types in the views. The improved version also handles views involving only selection operators separately from the rest of the views.

In the second case, XML data are stored in an object-relational database, which removes a lot of the redundancies existing in the first case. We developed an algorithm to automatically generate SQLX view definitions from the XML views. SQLX is the standard extension of SQL to process XML. By executing SQLX queries against an object-relational database, we can directly produce XML results from the object-relational database. This algorithm also utilizes the semantics expressed in the ORA-SS views to generate the view definitions for the views.

To the best of our knowledge, our work is the first to employ a semantic data model for the design and query of XML views. Compared to other related work, our work enables us to design flexible yet valid XML views. In addition, our work automatically generates query expressions for the views, while others require users to manually write the query expressions. Our work also provides a graphical CASE tool to facilitate the design and querying of XML views. In summary, using a conceptual model for the design and querying of XML views not only validates XML views, but also provides a fast and user-friendly approach to retrieve XML data.

8.2. Future Research Directions

Firstly, the thesis work can be easily extended to support querying on ORA-SS views by using our view operators. That is, the view operators can be used as query operators. We can use the query operators to compose queries on views. In particular, a query on a view involves only selection operators in most cases. Thus, we only need to compose the view definition generated by the algorithms in the thesis with these selection operators by directly inserting these operators into the corresponding where clauses in the view definition. In the rest of the cases, a query on a view may involve more complex operators, such as drop, swap or join operators. Then we directly apply these operators to the view and generate an intermediate view tree that is the result of the query. Next we use the algorithms proposed in the thesis to generate the query definition for the intermediate ORA-SS view tree. In this way, we are able to map any query on ORA-SS views into an equivalent query on the underlying source XML data.

Secondly, XML view update is a natural extension to the thesis work. It has two issues to be explored. The first issue is the updatability of XML views. In other words, we need to examine whether an XML view can be updated. The second issue is how to update those updatable XML views. In addition, materialized XML view maintenance should not be neglected in future work.

Thirdly, the following areas can also be considered as continuing work. XML views on top of views with a number of constraint enforcement strategies may be used without problems. The transformation mechanism in the thesis can be made more powerful with the help of advanced ER models. The treatment of cardinality constraints, functional dependencies and their derivation discussed in Chapter and Chapter can also be handled in a more advanced form. Moreover, the object-relational storage of XML can be seen as a database transformation (from XML data to OR data), which deserves further research in future work. The achievements made by other ER researchers can also be taken into consideration for the extension of the work. For instance, the higher-order entity-relationship model has a sound foundation and deserves to be explored further.
Moreover, the main proofs in the thesis base their arguments on set semantics; they can also be extended to support list semantics. Finally, optimizing the XQuery/SQLX queries generated in this work is another direction for future work.

Bibliography

1. Serge Abiteboul. Querying semi-structured data. ICDT 1997, pp. 1-18.
2. Serge Abiteboul. On Views and XML. PODS 1999, pp. 1-9.
3. Serge Abiteboul, Vincent Aguilera, Sebastien Ailleret, et al. XML repository and Active Views Demonstration. VLDB Demo 1999, pp. 742-745.
4. Serge Abiteboul, Anthony Bonner. Objects and Views. SIGMOD 1991, pp. 238-247.
5. Serge Abiteboul, Peter Buneman, and Dan Suciu. Data on the Web. Morgan Kaufmann, 1999.
6. Serge Abiteboul, Sophie Cluet, Laurent Mignet, et al. Active views for electronic commerce. VLDB 1999, pp. 138-149.
7. Serge Abiteboul, Roy Goldman, Jason McHugh, et al. Views for semistructured data. In Proceedings of the Workshop on Management of Semistructured Data 1997, pp. 83-90.
8. Serge Abiteboul, Richard Hull, and Victor Vianu. Foundations of Databases. Addison-Wesley, Reading, Massachusetts, 1995.
9. Serge Abiteboul, Jason McHugh, et al. Incremental maintenance for materialized views over semistructured data. VLDB 1998, pp. 38-49.
10. Serge Abiteboul and Victor Vianu. Queries and computation on the Web. ICDT 1997, pp. 262-275.
11. Chaitanya Baru. Xviews: XML views of relational schemas. DEXA Workshop 1999, pp. 700-705.
12. Chaitanya Baru, Amarnath Gupta, Bertram Ludaescher, et al. XML-Based Information Mediation with MIX. SIGMOD Demo 1999, pp. 597-599.
13. Elisa Bertino. A View Mechanism for Object-Oriented Databases. EDBT 1992, pp. 136-151.
14. Philip Bohannon, Sumit Ganguly, Henry Korth, et al. Optimizing view queries in ROLEX to support navigable tree results. VLDB 2002, pp. 119-130.
15. Peter Buneman, Susan Davidson, Gerd Hillebrand, et al. A query language and optimization techniques for unstructured data. SIGMOD 1996, pp. 505-516.
16. Peter Buneman, Susan Davidson, Wenfei Fan, et al. Keys for XML. WWW 2001, pp. 201-210.
17. Peter Buneman. Semistructured data. PODS 1997, pp. 117-121.
18. Michael Carey, Daniela Florescu, Zachary Ives, et al. XPERANTO: Publishing Object-Relational Data as XML. WebDB Workshop 2000, pp. 105-110.
19. Michael Carey, Jerry Kiernan, Jayavel Shanmugasundaram, et al. XPERANTO: A Middleware for Publishing Object-Relational Data as XML Documents. VLDB Demo 2000, pp. 646-648.
20. Vassilis Christophides, Sophie Cluet, Jerome Simeon. On Wrapping Query Languages and Efficient XML Integration. SIGMOD 2000, pp. 141-152.
21. Sophie Cluet, Pierangelo Veltri, Dan Vodislav. Views in a large scale XML repository. VLDB 2001, pp. 271-280.
22. Alin Deutsch, Mary Fernandez, Daniela Florescu, et al. Querying XML Data. IEEE Data Engineering Bulletin 1999, pp. 10-18.
23. Alin Deutsch and Val Tannen. MARS: A System for Publishing XML from Mixed and Redundant Storage. VLDB 2003, pp. 203-212.
24. Gillian Dobbie, Xiao Ying Wu, Tok Wang Ling, Mong Li Lee. ORA-SS: An Object-Relationship-Attribute Model for SemiStructured Data. Technical Report TR21/00, School of Computing, National University of Singapore, 2000.
25. Wen Yue Du, Mong Li Lee, Tok Wang Ling. XML Structures for Relational Data. WISE 2001, pp. 151-160.
26. Ramez Elmasri and Shamkant B. Navathe. Fundamentals of Database Systems. Benjamin/Cummings Publishing Company, Inc., Redwood City, California, second edition, 1994.
27. Catalina Fan, John Funderburk, Hou-in Lam, et al. XTABLES: Bridging Relational Technology and XML. IBM Research Report, 2002. http://www7b.boulder.ibm.com/dmdd/library/techarticle/0203shekita/0203shekita.pdf
28. Leonidas Fegaras and Ramez Elmasri. Query engines for Web-accessible XML data. VLDB 2001, pp. 251-260.
29. Mary Fernandez, Daniela Florescu, Alon Levy, and Dan Suciu. A query language and processor for a web-site management system. In Workshop on Management of Semistructured Data, SIGMOD 1997, pp. 4-11.
30. Mary Fernandez, Wang-Chiew Tan, Dan Suciu. SilkRoute: Trading Between Relations and XML. WWW 2000, pp. 723-745.
31. Mary Fernandez, Wang-Chiew Tan, Dan Suciu. Efficient Evaluation of XML Middleware Queries. SIGMOD 2001, pp. 103-114.
32. Daniela Florescu, Alon Levy and Alberto Mendelzon. Database Techniques for the World-Wide Web: A Survey. ACM SIGMOD Record 1998, pp. 59-74.
33. Daniela Florescu, Alon Levy, and Dan Suciu. Query containment for conjunctive queries with regular expressions. PODS 1998, pp. 139-148.
34. Hector Garcia-Molina et al. The TSIMMIS approach to mediation: data models and languages. Journal of Intelligent Information Systems 1997, pp. 117-132.
35. Roy Goldman, Jason McHugh, and Jennifer Widom. From semistructured data to XML: Migrating the Lore data model and query language. WebDB Workshop 1999, pp. 25-30.
36. Sandra Heiler and Stanley B. Zdonik. Object Views: Extending the Vision. ICDE 1990, pp. 86-93.
37. http://www.w3.org/DOM/
38. http://www.w3.org/TR/xslt
39. http://www.w3.org/XML/
40. http://www.w3.org/XML/Query
41. http://www.w3.org/XML/Schema
42. IBM DB2. http://www-3.ibm.com/software/data/db2/extenders/xmlext/index.html
43. Sushant Jain, Ratul Mahajan, and Dan Suciu. Translating XSLT Programs to Efficient SQL Queries. WWW 2002, pp. 616-626.
44. Euna Jeong, Chun-Nan Hsu. Induction of Integrated View for XML Data with Heterogeneous DTDs. CIKM 2001, pp. 151-158.
45. Henry Korth and Abraham Silberschatz. Database System Concepts. McGraw-Hill, New York, 1991.
46. Rajasekar Krishnamurthy, Venkatesan T. Chakaravarthy, Raghav Kaushik, et al. Recursive XML Schemas, Recursive XML Queries, and Relational Storage: XML-to-SQL Query Translation. ICDE 2004, pp. 42-53.
47. Chengkai Li, Philip Bohannon, Henry Korth, et al. Composing XSL Transformations with XML Publishing Views. SIGMOD 2003, pp. 515-526.
48. Chen Li, Ramana Yerneni, Vasilis Vassalos, et al. Capability based mediation in TSIMMIS. SIGMOD Demo 1998, pp. 564-566.
49. Tok Wang Ling, Mong Li Lee, Gillian Dobbie. Applications of ORA-SS: An Object-Relationship-Attribute Model for Semistructured Data. IIWAS 2001, pp. 17-28.
50. Meng Chi Liu, Tok Wang Ling. Towards Declarative XML Querying. WISE 2002, pp. 127-136.
51. Lucie Xyleme. A dynamic warehouse for XML data of the Web. IEEE Data Engineering Bulletin, 2001.
52. Daofeng Luo, Ting Chen, Tok Wang Ling, Xiaofeng Meng. On View Transformation Support for a Native XML DBMS. DASFAA 2004, pp. 226-231.
53. Ioana Manolescu, Daniela Florescu, Donald Kossmann. Answering XML Queries over Heterogeneous Data Sources. VLDB 2001, pp. 241-250.
54. Ioana Manolescu, Daniela Florescu, Donald Kossmann, et al. Agora: Living with XML and relational. VLDB Demo 2000, pp. 623-626.
55. Alberto Mendelzon and Tova Milo. Formal models of the Web. PODS 1997, pp. 134-143.
56. Jason McHugh, Serge Abiteboul, Roy Goldman, et al. Lore: A Database Management System for Semistructured Data. Technical Report, Stanford University Database Group, Feb. 1997.
57. Jason McHugh, Serge Abiteboul, Roy Goldman, et al. Lore: A database management system for semistructured data. SIGMOD 1997, pp. 54-66.
58. Microsoft Corp. http://www.microsoft.com/XML
59. Yuan Yin Mo, Tok Wang Ling. Storing and Maintaining Semistructured Data Efficiently in an Object-Relational Database. WISE 2002, pp. 247-256.
60. Oracle Corp. http://www.oracle.com/XML
61. Yannis Papakonstantinou, Hector Garcia-Molina, and Jennifer Widom. Object Exchange Across Heterogeneous Information Sources. ICDE 1995, pp. 251-260.
62. Yannis Papakonstantinou and Vasilis Vassalos. Query rewriting using semistructured views. SIGMOD 1999, pp. 455-466.
63. Yannis Papakonstantinou and Pavel Velikhov. Enhancing Semistructured Data Mediators with Document Type Definitions. ICDE 1999, pp. 136-145.
64. Yannis Papakonstantinou and Victor Vianu. DTD inference for views of XML data. PODS 2000, pp. 35-46.
65. Dallan Quass, Jennifer Widom, Roy Goldman, et al. LORE: A Lightweight Object REpository for Semistructured Data. SIGMOD 1996, pp. 549.
66. Raghu Ramakrishnan. Database Management Systems. McGraw-Hill, 1997.
67. Rajagopal Rajugan, Elizabeth Chang, Tharam S. Dillon, et al. XML Views: Part 1. DEXA 2003, pp. 148-159.
68. Elke Rundensteiner and Lubomir Bic. Automatic View Schema Generation in Object-Oriented Databases. Technical Report 92-15, Department of Information and Computer Science, University of California, Irvine, Jan. 1992.
69. Cassio Santos. Design and implementation of object-oriented views. Lecture Notes in Computer Science, 978, 1995.
70. Cassio Santos, Serge Abiteboul, Claude Delobel. Virtual Schemas and Bases. EDBT 1994, pp. 81-94.
71. Marc H. Scholl, Christian Laasch, and Markus Tresch. Updatable Views in Object-Oriented Databases. DOOD 1991, pp. 189-207.
72. Jayavel Shanmugasundaram et al. Efficiently Publishing Relational Data as XML Documents. VLDB 2000, pp. 65-76.
73. Jayavel Shanmugasundaram, Jerry Kiernan, Eugene Shekita, et al. Querying XML Views of Relational Data. VLDB 2001, pp. 261-270.
74. John J. Shilling and Peter F. Sweeney. Three Steps to Views: Extending the Object-Oriented Paradigm. OOPSLA 1989, pp. 353-361.
75. SQLX. http://www.sqlx.org
76. Mike Stonebraker. Implementation of Integrity Constraints and Views by Query Modification. SIGMOD 1975, pp. 65-78.
77. Dan Suciu. Query Decomposition and View Maintenance for Query Languages for Unstructured Data. VLDB 1996, pp. 227-238.
78. Dan Suciu. An overview of semistructured data. SIGACT News 1998, pp. 28-38.
79. Dan Suciu. On Database Theory and XML. SIGMOD Record 2001, pp. 39-45.
80. Katsumi Tanaka, Masatoshi Yoshikawa, and Kozo Ishihara. Schema Virtualization in Object-Oriented Databases. ICDE 1988, pp. 23-30.
81. Jeffrey D. Ullman. Principles of Database and Knowledge Base Systems, Volume I. Computer Science Press, 1988.
82. Jeffrey D. Ullman. Principles of Database and Knowledge Base Systems, Volume II: The New Technologies. Computer Science Press, 1989.
83. Jeffrey D. Ullman and Jennifer Widom. A First Course in Database Systems. Prentice Hall, 1997.
84. Victor Vianu. A Web Odyssey: from Codd to XML. PODS 2001, pp. 1-15.
85. Jennifer Widom. Data Management for XML: Research Directions. IEEE Data Engineering Bulletin, 1999, pp. 44-52.
86. Yue Zhuge and Hector Garcia-Molina. Graph Structured Views and Their Incremental Maintenance. ICDE 1998, pp. 116-125.
87. Stefano Ceri, Sara Comai, Ernesto Damiani, et al. XML-GL: a graphical language for querying and restructuring XML documents. SEBD 1999, pp. 151-165.
88. Michael Benedikt, Chee Yong Chan, Wenfei Fan, et al. DTD-Directed Publishing with Attribute Translation Grammars. VLDB 2002, pp. 838-849.
89. Rajasekar Krishnamurthy, Raghav Kaushik, and Jeffrey F. Naughton. XML-SQL Query Translation Literature: The State of the Art and Open Problems. In XML Database Symposium, 2003, pp. 1-18.
90. Rajasekar Krishnamurthy, Raghav Kaushik, Jeffrey F. Naughton. Efficient XML-to-SQL Query Translation: Where to Add the Intelligence? VLDB 2004.
91. Alin Deutsch, Mary Fernandez, and Dan Suciu. Storing semistructured data with STORED. SIGMOD 1999, pp. 431-442.
92. Daniela Florescu, Donald Kossmann. Storing and Querying XML Data using an RDBMS. Data Engineering Bulletin, 22(3), 1999, pp. 27-34.
93. Masatoshi Yoshikawa, Toshiyuki Amagasa, Takeyuki Shimura, Shunsuke Uemura. XRel: a path-based approach to storage and retrieval of XML documents using relational databases. TOIT 2001, pp. 110-141.
94. Igor Tatarinov, Stratis Viglas, Kevin S. Beyer, et al. Storing and querying ordered XML using a relational database system. SIGMOD 2002, pp. 204-215.
95. David DeHaan, David Toman, Mariano P. Consens, M. Tamer Özsu. A Comprehensive XQuery to SQL Translation using Dynamic Interval Encoding. SIGMOD 2003, pp. 623-634.
96. Alin Deutsch, Val Tannen. Reformulation of XML Queries and Constraints. ICDT 2003, pp. 225-241.
97. Torsten Grust, Sherif Sakr, Jens Teubner. XQuery on SQL Hosts. VLDB 2004.
98. Surajit Chaudhuri, Raghav Kaushik, and Jeffrey Naughton. On relational support for XML publishing: Beyond sorting and tagging. SIGMOD 2003, pp. 611-622.
99. Marcelo Arenas and Leonid Libkin. A Normal Form for XML Documents. ACM Transactions on Database Systems (TODS), 29(1):195-232, 2004.
100. M. Vincent, J. Liu and C. Liu. Strong Functional Dependencies and a Redundancy Free Normal Form for XML. ACM Transactions on Database Systems (TODS), 29(3):445-462, September 2004.

[...]

... Structured data (ORA-SS) [24] to express the schema of XML source data and XML views. We define a set of view operators to design XML views based on the ORA-SS data model. By employing the semantics enriched in ORA-SS, we also develop a set of rules to guarantee that the designed XML views are valid. As the schemas of XML views are expressed in ORA-SS, the schemas for the XML views are thus called ORA-SS views. The ...

... design XML views graphically. Finally, most current related work considers XML views on top of relational databases. That is, the source data for XML views are relational data. Some other work considers XML views on top of XML data. That is, the source data are XML data. However, currently no work considers designing flexible XML views for the case where XML data are stored in a traditional database. Thus, there ...

... between XML views and ORA-SS views are as follows:
• XML views denote the XML documents for designed views.
• ORA-SS views denote the ORA-SS schema diagrams of designed XML views.
In other words, an arbitrary XML view document can be called an XML view for short. Its corresponding ORA-SS schema diagram can then be called an ORA-SS view. Note that we assume that the ORA-SS schema ...
... generated from the ORA-SS views. On the other hand, when XML data are stored in an object-relational database system, SQLX [75] view definitions are generated from the ORA-SS views. We formalize the issues addressed in this thesis as follows:

Valid XML Views Problem. Given an ORA-SS source schema S of XML data D, and a set of view operators, i.e., select, drop, join and swap, to design an ORA-SS view V, ...

... definitions for the designed valid XML views in the case where XML data are stored in an object-relational database.

1.4 Thesis Overview

The rest of the thesis is organized as follows. Chapter 2 introduces some of the main data models for XML data as well as the semantically rich ORA-SS data model. The advantages of ORA-SS over other data models are also presented. Chapter 3 presents the view operators based on ...

... That is, the issue of the validity of XML views is the same as the issue of the validity of ORA-SS views. After we develop the set of rules for the validity of XML views or ORA-SS views, we develop algorithms to automatically generate query definitions from the ORA-SS views, as the ORA-SS views are graphical schema diagrams. When XML data are stored in native form, XQuery [40] view definitions are generated ...

... schema V and its ORA-SS source schema S, as well as its ORDB storage T, generate a SQLX view definition for V, which can be directly evaluated on the storage T.

1.3 Research Contributions

To solve the three problems discussed, we employ a semantically rich data model, the Object-Relationship-Attribute model for Semi-Structured data (ORA-SS) [24], to express the schema of XML data. Based on the ORA-SS data model, ...
... definitions, the XML view documents can be directly produced.

[Figure 1.2: The architecture of designing and querying XML views based on ORA-SS. Diagram labels: XML data; XML files; an object-relational database; ORA-SS source schema; valid ORA-SS view schema; XQuery view definitions; SQLX view definitions; steps labelled Extracting, Designing, Generating and Executing.]

In summary, the several research contributions ...

... novel approach to designing and querying XML views on XML source data. The architecture of our approach is shown in Figure 1.2. Firstly, an ORA-SS schema is extracted from XML data, XML DTD or XML Schema as a pre-processing task. The XML data are stored as XML files or in an object-relational database. Based on the extracted ORA-SS schema, we employ a set of view operators to design XML views. A set of rules ...

... on XML data. There are several advantages of XML views. Firstly, XML views provide application-specific views of source data. Secondly, XML views secure the source data by hiding the parts users are not allowed to see. Thirdly, XML views provide a basis for further data integration. Finally, XML views enable us to exploit the potential of XML as the standard of data exchange.

... XML views are expressed in ORA-SS, the schemas for the XML views are thus called ORA-SS views. The difference between XML views and ORA-SS views is as follows: • XML views denote the XML documents ...

... in this thesis, if an ORA-SS view is valid, then an XML view conforming to the ORA-SS view is also valid. That is, the issue of the validity of XML views is the same as the issue of the validity ...

... views based on the ORA-SS data model with our view operators, we need to generate query expressions for the valid XML views. If the XML data are stored in a native XML database or as XML documents, ...

Date posted: 16/09/2015, 17:14
