Document: Managing Time in Relational Databases, P10 (docx)


transactions until it is the right time to apply them, Asserted Versioning applies them right away, but does not immediately assert them. These deferred assertions may themselves be updated or deleted, and the moment on which their assertion periods become current is the moment on which we begin to claim that the world was, is or will be as they describe it.

Just as deferred assertions replace collections of transactions that have not yet been applied to the database, bi-temporal data in any of the other seven categories replaces other physically external datasets. Asserted version tables contain data in all these temporal categories and, in doing so, internalize what would otherwise be physically distinct datasets, ones whose management costs are obviously significant.

In Chapter 13, we look more closely at the entire family of pipeline datasets. We distinguish eight logical categories of pipeline datasets, based on where in a combination of past, present or future assertion and effective time their data is located. Having previously shown how to eliminate these physically distinct datasets by bringing them into the production tables which are their destinations and points of origin, we now discuss each of them and show how queries and views can reassemble, as queryable objects, exactly the data that had existed in those datasets. This demonstrates that while eliminating the management costs associated with this data, we can still make this data available in whatever combinations it is needed.

In Chapter 14, we discuss how to query asserted version tables. As we said before, many queries, especially the ad hoc queries written by non-technical database users, will be directed against non-temporal or uni-temporal views of asserted version tables, not against those bi-temporal tables themselves. But many queries will be written directly against those physical tables, especially those we call production queries.
In that case, the effective time period specified on the query, and which qualifies the result set, will have to be compared to the effective time periods of the rows targeted by the query; and as we know from our review of the Allen relationships, there are 13 different ways in which those two time periods may be positioned with respect to one another. And when those queries involve joins across two (or more) asserted version tables, then the Allen relationship issues can become even more difficult.

Part 3 DESIGNING, MAINTAINING AND QUERYING ASSERTED VERSION DATABASES

In Chapter 15, we discuss how to optimize the performance of Asserted Versioning databases. Our focus is on optimizing access to currently asserted current versions, i.e. to the rows that correspond to rows in a conventional table of persistent objects. In this chapter, we focus on index design, although a wide range of other optimization techniques are also considered.

In Chapter 16, we conclude our presentation of Asserted Versioning. We discuss each of the four objectives we had for Asserted Versioning, and which we described in the Preface, and explain why we think those objectives have been met. We point out that Asserted Versioning has value both as a bridge to a future standards-based and vendor-provided implementation of bi-temporal data, and as a destination, being itself a semantically complete implementation of bi-temporal data which works with today’s SQL and today’s databases. In the last section, we discuss ongoing research and development at Asserted Versioning LLC, and explain how interested readers can learn more about Asserted Versioning.

Glossary References

Glossary entries whose definitions form strong interdependencies are grouped together in the following list. The same Glossary entries may be grouped together in different ways at the end of different chapters, each grouping reflecting the semantic perspective of each chapter.
There will usually be several other, and often many other, Glossary entries that are not included in the list, and we recommend that the Glossary be consulted whenever an unfamiliar term is encountered.

We note, in particular, that none of the nodes in our taxonomy of data management methods, or our state transformation taxonomy, are included in this list. In general, we leave taxonomy nodes out of these lists, but recommend that the reader look them up in the Glossary.

Allen relationships
asserted version table
Asserted Versioning Framework (AVF)
assertion time
transaction time
bi-temporal
uni-temporal
deferred assertion
deferred transaction
effective time
valid time
object
persistent object
physical transaction
temporal transaction
temporal parameter
pipeline dataset
production table
temporal entity integrity (TEI)
temporal referential integrity (TRI)
temporal extent
state transformation
the standard temporal model

8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES

CONTENTS
Translating a Non-Temporal Logical Data Model into a Temporal Physical Data Model
The Logical Data Model
Referential Constraints Between Non-Temporal and Bi-Temporal Tables
Asserted Versioning Metadata
The Physical Data Model
Generating an Asserted Versioning Database from a Physical Data Model and Metadata
Temporalizing the Physical Data Model
Generating Temporal Entity and Temporal Referential Integrity Constraints
Redundancies in the Asserted Versioning Bi-Temporal Schema
Apparent Redundancies in the Asserted Versioning Schema
A Real Redundancy in the Asserted Versioning Schema
Glossary References

An Asserted Versioning database is one that contains at least one asserted version table.
An asserted version table is one whose schema is that shown in Chapter 6, and on which the two temporal integrity constraints are enforced.

Figure 8.1 shows how Asserted Versioning databases are generated from the combination of a conventional logical data model and a set of metadata entries. Note that the logical data model has no temporal features. This means that logical data models of conventional databases, developed perhaps years ago, do not have to be changed if a decision is made to convert one or more of the tables in those databases into bi-temporal asserted version tables. This means that when building new logical data models, or extending old ones, data modelers can ignore temporal requirements and focus on design issues which are often complex enough without introducing temporal considerations. It means that temporal requirements can be expressed declaratively, in metadata associated with a conventional data model, rather than by hardcoding those requirements in the data model itself.

Managing Time in Relational Databases. Doi: 10.1016/B978-0-12-375041-9.00008-X Copyright © 2010 Elsevier Inc. All rights of reproduction in any form reserved.

This greatly simplifies the work of the data modeler. Her work, as far as temporality is concerned, is not to translate temporal requirements into data model constructs. Instead, it becomes that of simply expressing business requirements for temporal data as a set of metadata associated with the data model.

As well as developing the logical model, the other task for the data modeler is to translate business requirements for temporal information into metadata. There are metadata entries for each table in the data model which is to be generated as an asserted version table. For these tables, there are entries to specify which business column or columns make up the business key for the table.
This metadata also provides the information which the AVF needs to enforce temporal entity integrity and temporal referential integrity.

Once the logical model and its associated metadata are complete, the next step is to generate a physical data model from the logical model. At this point, of course, the physical model that is generated has no temporal features; all of its tables are conventional non-temporal tables.

Figure 8.1 Designing and Generating an Asserted Versioning Database. (Diagram labels: Logical Data Model, Temporal Requirements, Physical Data Model, Temporal Metadata, An Asserted Versioning Database, Non-Temporal Tables, Asserted Versioning Tables, TEI Enforcement, TRI Enforcement.)

The final step is a process in which a team consisting of the data modeler and a DBA uses the temporal metadata to modify the physical data model, changing specific tables into asserted version tables. In this process, pairs of date columns are added to implement assertion time and effective time. Surrogate primary keys are created as object identifiers. Physical primary keys are converted into Asserted Versioning business keys, and physical foreign keys into Asserted Versioning temporal foreign keys.

However, for organizations using the ERwin data modeling tool, this manual process is unnecessary. In the first release of the AVF, we provide ERwin user-defined properties (UDPs) to hold all temporal metadata, and ERwin scripting macros which use these UDPs to generate a physical data model in which all the temporal conversion work has already been done.

Note also that the Asserted Versioning database is more than a set of entries in a database catalog, more than the temporal data schemas shown in Figure 8.1. It is also the stored procedures, triggers or other code that enforces temporal integrity constraints on temporal tables.
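As a rough illustration of the conversion step described above, the following Python sketch rearranges the columns of a conventional table into an asserted version layout. It is not the AVF's or ERwin's actual mechanism: the function name `temporalize` and its parameters are assumptions made for this example, and the column order is only illustrative. The temporal column names (`eff_beg_dt`, `asr_beg_dt`, `epis_beg_dt`, `eff_end_dt`, `asr_end_dt`, `row_crt_dt`) and the `bigint` object identifier follow the physical data model of Figure 8.8.

```python
def temporalize(oid_name, business_key, other_columns, foreign_keys=None):
    """Sketch: build the column list of an asserted version table.

    oid_name      -- name of the surrogate object identifier column
    business_key  -- (name, type) pairs that formed the original primary key
    other_columns -- remaining (name, type) pairs, conventional FKs included
    foreign_keys  -- maps a conventional FK column to the parent's oid column
                     (hypothetical parameter, used to derive temporal FKs)
    """
    foreign_keys = foreign_keys or {}

    # Surrogate object identifier plus the two begin dates come first.
    columns = [
        (oid_name, "bigint"),
        ("eff_beg_dt", "datetime"),
        ("asr_beg_dt", "datetime"),
    ]

    # The old primary key survives as the business key.
    columns += list(business_key)
    columns.append(("epis_beg_dt", "datetime"))

    # Conventional foreign keys are swapped for oid-based temporal foreign keys.
    for name, sql_type in other_columns:
        if name in foreign_keys:
            columns.append((foreign_keys[name], "bigint"))
        else:
            columns.append((name, sql_type))

    # End dates for both time periods, plus the row creation date.
    columns += [
        ("eff_end_dt", "datetime"),
        ("asr_end_dt", "datetime"),
        ("row_crt_dt", "datetime"),
    ]
    return columns


category = temporalize(
    oid_name="wellpgmcat_oid",
    business_key=[("wellpgmcat_cd", "char(4)")],
    other_columns=[("wellpgmcat_nm", "varchar(50)")],
)

policy = temporalize(
    oid_name="policy_oid",
    business_key=[("policy_nbr", "char(10)")],
    other_columns=[
        ("client_nbr", "char(10)"),
        ("policy_type", "char(3)"),
        ("copay_amt", "money"),
    ],
    foreign_keys={"client_nbr": "client_oid"},
)

for name, sql_type in category:
    print(f"{name}: {sql_type}")
```

For Wellness_Program_Category, the printed columns line up with the corresponding entity in Figure 8.8. In the Policy example, the conventional foreign key `client_nbr` is replaced by `client_oid`, mirroring the conversion of physical foreign keys into temporal foreign keys.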
In the Preface, we stated that Asserted Versioning simplifies the management of temporal databases by providing maintenance encapsulation, query encapsulation and design encapsulation. What we have just described here is how Asserted Versioning provides design encapsulation. In the rest of this chapter, we will see how design encapsulation works.

Translating a Non-Temporal Logical Data Model into a Temporal Physical Data Model

The Logical Data Model

Figure 8.2 is the logical data model (LDM) of a sample database we have constructed, and which can be accessed at AssertedVersioning.com. The most important thing to notice about this LDM is that there is nothing special about it. In particular, there is nothing explicitly temporal about it. And yet from this model, supplemented with metadata provided by the data modeler, the AVF will create an Asserted Versioning database in which all of the tables are bi-temporal tables.

There may be other tables in an Asserted Versioning database which are non-temporal tables. But we are not concerned with them. The DBMS enforces entity integrity on them, while the AVF enforces temporal entity integrity on its tables. The DBMS enforces referential integrity on them, while the AVF enforces temporal referential integrity on its tables.

Later on, additional non-temporal tables may be converted to asserted version tables, and this can be done without making any changes to the logical data models of those databases. Temporality is introduced “downstream” from the logical data models, by making entries in asserted version metadata tables, and then by modifying DDL in accordance with this metadata before that DDL is submitted to the DBMS.

This particular logical data model is a simple one. In it, a client may own any number of policies, each of which must be owned by exactly one client.
Each policy may be amended by any number of policy amendments, each of which amends exactly one policy. [1] A wellness program category categorizes any number of wellness programs, each of which is categorized by exactly one wellness program category. A client may be enrolled in any number of wellness programs, each of which may enroll any number of clients. Thus, the entity Wellness Program Enrollment is an associative entity, implementing a many-to-many relationship between clients and programs.

Figure 8.2 The Sample Database Logical Data Model.
(Entities and attributes as recoverable from the diagram; the horizontal line separating primary key attributes is not preserved here.)

Client
    client-nbr: CHAR(10)
    client-nm: VARCHAR(40)

Policy
    policy-nbr: CHAR(10)
    policy-type: CHAR(3)
    copay-amt: MONEY
    client-nbr: CHAR(10) (FK)

Policy-Amendment
    policy-amend-nbr: CHAR(10)
    policy-nbr: CHAR(10) (FK)
    policy-amend-txt: VARCHAR(100)

Wellness-Program-Category
    wellpgmcat-cd: CHAR(4)
    wellpgmcat-nm: VARCHAR(50)

Wellness-Program
    wellpgm-nbr: CHAR(10)
    wellpgmcat-cd: CHAR(4) (FK)
    wellpgm-nm: VARCHAR(50)

Wellness-Program-Enrollment
    client-nbr: CHAR(10) (FK)
    wellpgm-nbr: CHAR(10) (FK)
    wellpgm-enroll-begin-wgt: SMALLINT
    wellpgm-enroll-end-wgt: SMALLINT
    wellpgm-enroll-begin-a1c-nbr: DECIMAL(2,1)
    wellpgm-enroll-end-a1c-nbr: DECIMAL(2,1)

Relationship labels: may own; may be amended by; may categorize; may be enrolled in; may enroll.

[1] “Any number of” is our substitute for the less graceful expression “zero, one or more”. The two expressions mean the same thing.

The business meaning of the entities, attributes and relationships should need no explanation, with the possible exception of the two attributes with a suffix of “a1c”. As all diabetics know, a1c is a blood test that measures what percentage of a person’s hemoglobin has glucose attached to it.

As ERwin data modelers will immediately recognize, primary keys are shown above the horizontal line in each entity. Foreign keys, of course, have “(FK)” as a separate suffix.
Since all of these entities will be generated as temporal tables, all these FKs will be replaced by temporal foreign keys, by TFKs.

As we said earlier, the current implementation of Asserted Versioning uses ERwin’s user-defined properties to capture the metadata needed to generate a bi-temporal database schema from a non-temporal data model. In this chapter, however, we will organize that metadata as a set of five metadata tables.

Referential Constraints Between Non-Temporal and Bi-Temporal Tables

There is nothing semantically wrong about a bi-temporal table being the child table in a referential integrity relationship. In that case, the bi-temporal table will contain a conventional foreign key which points to a row in a parent non-temporal table.

Conversely, there is nothing semantically wrong about a non-temporal table being the child table in a temporal referential integrity relationship. In that case, the non-temporal table will contain a temporal foreign key which points to an episode in a parent bi-temporal table.

In both cases, the referential relationships reflect an existence dependency between the objects involved. When both tables are non-temporal, we represent that existence dependency as a referential integrity dependency. When both tables are bi-temporal, we represent it as a temporal referential integrity dependency. When one table is non-temporal and the other bi-temporal, the existence dependency between their objects isn’t somehow nullified because of our choice of how to represent it. And so our managed objects should be able to express that dependency even in that “mixed” case.

As bi-temporal theory, Asserted Versioning interprets non-temporal tables as tables whose rows are bi-temporal, but implicitly so. Rows in non-temporal tables exist in an assertion time which is co-temporal with their physical presence, and so too for effective time.
In other words, non-temporal rows are asserted for as long as they physically exist, and are versions which describe what their objects are currently like for as long as those rows physically exist. Their assertion time periods and their effective time periods are fixed; both are always [row create date – 12/31/9999].

In an alternative interpretation, non-temporal rows are asserted for as long as they physically exist in their current form, and are versions which describe their objects for as long as those rows physically exist in their current form. Each time a row is updated, its old form, i.e. an exact image of all of the data in that row, is lost because at least some of it is overwritten. In this interpretation, those rows must have a last update date, in which case their assertion time periods and their effective time periods are not fixed because both are [last update date – 12/31/9999].

In our initial release of the AVF, however, we will not support mixed referential relationships. One of these relationships won’t work, and the other one is dangerous.

The relationship that won’t work is the one in which the child table is a non-temporal table, and contains a temporal foreign key. This temporal foreign key is not declared in DDL because current DBMSs cannot recognize it. This temporal foreign key cannot be managed by the DBMS because, unlike normal foreign keys, it does not point to a specific row in the parent table.

The relationship that is dangerous is the one in which the child table is an asserted version table, and contains a conventional foreign key. This foreign key is declared in DDL, and the DBMS can recognize it. The danger lies in the fact that the DBMS can then carry out a delete cascade from the parent table to the child table, if it is so directed. This delete cascade, however, is unaware of the temporal semantics of the child table.
It will simply find every physical row in that child table that contains the referenced foreign key value, and will then physically delete that row. This is meat cleaver work where delicate surgery is required. It can destroy past, current and future episodes in the child table, leaving collections of versions which are semantically invalid, and which the AVF will be unable to manage. It will physically remove both version history and assertion history, whereas bi-temporal data management is a promise to preserve both. The conventional delete set null rule would be a safer alternative because episode timelines would not be destroyed. Nonetheless, column-level history would still be lost.

Mixed referential relationships should be addressed, but they will not be addressed in the first release of the AVF. And so, in the remainder of this chapter, and in most of the remainder of this book, we will not discuss them.

Asserted Versioning Metadata

Figures 8.3 through 8.7 show the metadata needed by the AVF to generate an Asserted Versioning database from the LDM shown in Figure 8.2. As with other figures showing tables, we indicate foreign keys by italicizing the column heading, and primary keys by underlining the column heading.

We show these metadata tables as themselves conventional tables, and therefore all relationships as ones implemented with conventional foreign keys. This simplifies the discussions in this chapter, and allows us to concentrate on the metadata without being concerned about keeping a bi-temporal history of changes to that data.

Table Type Metadata

In a logical data model that will generate an Asserted Versioning database, we need a metadata list of which entities to generate as non-temporal tables and which entities to generate as asserted version tables. This metadata table lists all the tables that will be generated as asserted version tables, as shown in Figure 8.3.
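One way to picture how such a list could drive generation is the small Python sketch below. It is only an illustration under stated assumptions: the set `ASSERTED_VERSION_TABLES` stands in for the Table Type metadata table, the function `partition_entities` is a hypothetical helper, and `Region_Code` is an invented entity used to show the non-temporal case. The six asserted version table names are those of the sample database.

```python
# Stand-in for the Table Type metadata table: the tables that are to be
# generated as asserted version tables (names from the sample database).
ASSERTED_VERSION_TABLES = {
    "Client",
    "Policy",
    "Policy_Amendment",
    "Wellness_Program",
    "Wellness_Program_Category",
    "Wellness_Program_Enrollment",
}


def partition_entities(entities, temporal_names=ASSERTED_VERSION_TABLES):
    """Split entity names into (asserted version tables, non-temporal tables)."""
    temporal = [e for e in entities if e in temporal_names]
    conventional = [e for e in entities if e not in temporal_names]
    return temporal, conventional


# "Region_Code" is a hypothetical entity with no metadata entry, so it would
# be generated as a conventional, non-temporal table.
entities = ["Client", "Policy", "Region_Code"]
temporal, conventional = partition_entities(entities)
print("asserted version tables:", temporal)
print("non-temporal tables:", conventional)
```

An entity absent from the metadata simply stays conventional, which matches the idea that temporality is declared in metadata rather than in the logical data model.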
For this data model, we will generate all its entities as asserted version tables. The non-key column in this metadata table is the business key flag. If it is set to ‘Y’, then the table is considered to have a reliable business key. Otherwise, it is set to ‘N’, indicating that the business key for the table is not reliable.

Table-Type
tbl-nm                         bus-key-rlb-flag
Client                         Y
Policy                         Y
Wellness_Program               Y
Wellness_Program_Category      Y
Wellness_Program_Enrollment    Y
Policy_Amendment               Y

Figure 8.3 The Table Type Metadata Table.

[...]

... datetime
asr_end_dt: datetime
row_crt_dt: datetime

Wellness_Program_Category
wellpgmcat_oid: bigint
eff_beg_dt: datetime
asr_beg_dt: datetime
wellpgmcat_cd: char(4)
epis_beg_dt: datetime
wellpgmcat_nm: varchar(50)
eff_end_dt: datetime
asr_end_dt: datetime
row_crt_dt: datetime

Policy
policy_oid: bigint
eff_beg_dt: datetime
asr_beg_dt: datetime
policy_nbr: char(10)
epis_beg_dt: datetime
client_oid: bigint
...

... Business Key Metadata Table.

But in a temporal table, multiple rows may represent the same object, and so all of those rows will have the same business key. Consequently, we cannot guarantee that each business key points to one and only one object by defining a unique index on it. Nor can we simply extend the scope of the index by defining...

... between them or not. By using the same granularity for all asserted version tables in the same database, it is easy to spot two versions of the same object that are contiguous in either assertion or in effective time. Because of the closed-open convention, two time periods [meet] (are contiguous) if and only if the end point in time of one has the same value as the begin point in time of the other. This...
client_nbr: char(10)
epis_beg_dt: datetime
client_nm: varchar(40)
eff_end_dt: datetime
asr_end_dt: datetime
row_crt_dt: datetime

Wellness_Program_Enrollment
client_wellpgm_oid: bigint
eff_beg_dt: datetime
asr_beg_dt: datetime
client_oid: bigint
wellpgm_oid: bigint
epis_beg_dt: char(18)
wellpgm_enroll_begin_wgt: smallint
wellpgm_enroll_end_wgt: smallint
wellpgm_enroll_begin_a1c_nbr: decimal(2,1)
wellpgm_enroll_end_a1c_nbr: ...

... by defining a unique index on them. Nonetheless, they have an important role to play. We discuss business keys, how the AVF’s enforcement of temporal entity integrity guarantees that no two objects will ever have the same business key, and how business keys help the business user clarify her intentions when submitting transactions to an Asserted Versioning database, in Chapter 9.

Foreign Key Mapping Metadata...

... datetime
asr_end_dt: datetime
row_crt_dt: datetime

Policy_Amendment
policy_amend_oid: bigint
eff_beg_dt: datetime
asr_beg_dt: datetime
policy_oid: bigint
policy_amend_nbr: char(10)
epis_beg_dt: datetime
policy_amend_txt: varchar(100)
eff_end_dt: datetime
asr_end_dt: datetime
row_crt_dt: datetime

Wellness_Program
wellpgm_oid: bigint
eff_beg_dt: datetime
...

Figure 8.8 The Sample Database Physical Data Model.

... to Asserted Versioning’s assertion time. And in our own prior implementations of bi-temporal data management, we have used dates for effective time and microsecond timestamps for assertion time. By using the same granularity for all assertion times in the same database, and the same granularity for all effective times, it is easy to determine the Allen relationship between any two time periods. So suppose...
... are two time periods which start at the same time, one of which is delimited by dates and the other by timestamps. The values, each of which designate the same point in time, are not identical. But if the same granularity is used, the EQUALS operator will tell us whether or not those time periods begin at the same time. Of particular importance is whether or not two time periods have a gap in time between...

... of business keys in asserted version tables is to identify the object represented by each row in the same way that object would be identified, or was identified, in a conventional table. Most of the time, business keys are reliable. In other words, most of the time, each business key value is a unique identifier for one and only one object. So in a non-temporal table, it would be possible to define a...

... TFK in each row of the table must, at all times, contain an oid to an object one of whose episodes has an effective time period that includes ([fills-1]) the effective time period of the row that contains the TFK. If the TFK is not required, the TFK in each row must either contain a valid oid reference, or be null. In our sample database, we have made the TFK to Wellness...

... compare an assertion time period or point in time to an effective time period or point in time. One final point. We recommend that assertion time granularity...
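The closed-open convention and the temporal foreign key rule quoted in the fragments above can be made concrete with a short Python sketch. This is an illustration only, not AVF code: the `Period` type and the functions `meets`, `has_gap`, `includes` and `tfk_is_valid` are names assumed for this example, and dates are used as the single granularity for every period. A period covers its begin date and stops just before its end date, with 12/31/9999 standing for an open-ended period.

```python
from datetime import date
from typing import NamedTuple


class Period(NamedTuple):
    begin: date  # first point in time inside the period (closed)
    end: date    # first point in time after the period (open)


def meets(a, b):
    """Contiguous: one period ends exactly where the other begins."""
    return a.end == b.begin or b.end == a.begin


def has_gap(a, b):
    """At least one point in time lies strictly between the two periods."""
    return a.end < b.begin or b.end < a.begin


def includes(outer, inner):
    """The outer period covers every point in time of the inner period."""
    return outer.begin <= inner.begin and inner.end <= outer.end


def tfk_is_valid(child_effective, parent_episodes):
    """Some episode of the referenced object includes the child row's period."""
    return any(includes(ep, child_effective) for ep in parent_episodes)


jan = Period(date(2010, 1, 1), date(2010, 2, 1))
feb = Period(date(2010, 2, 1), date(2010, 3, 1))
apr = Period(date(2010, 4, 1), date(2010, 5, 1))

print(meets(jan, feb))    # jan ends on the date feb begins
print(has_gap(feb, apr))  # March is covered by neither period

# A parent object with one open-ended episode.
parent_episodes = [Period(date(2010, 1, 1), date(9999, 12, 31))]
print(tfk_is_valid(feb, parent_episodes))
print(tfk_is_valid(Period(date(2009, 12, 1), date(2010, 2, 1)), parent_episodes))
```

Because both periods use the same granularity, a plain equality test on the end and begin values is enough to detect contiguity, which is the point the text makes about choosing one granularity per kind of time.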

Date posted: 21/01/2014, 08:20
