An Introduction to Database Systems 8Ed - C J Date - Solutions Manual Episode 2 Part 1 pps

Copyright (c) 2003 C. J. Date page 12.7

"update anomalies such as those discussed earlier in the chapter" don't occur. But others can! For example, deleting the tuple {S:Smith,J:Math,P:5} will "leave a gap," in the sense that now nobody comes 5th in the class list with respect to Math (in other words, a certain integrity constraint has been violated). The EXAM example thus clearly illustrates the point that not all update anomalies can be eliminated by normalization (i.e., by taking projections). In fact, of course, normalization can eliminate precisely those anomalies that are caused by FDs or MVDs or JDs that aren't implied by keys: just those anomalies and no others.

12.6 A Note on RVAs

Possibly skip this section on a first pass. While RVAs are legal (see Chapter 6), they're usually contraindicated. (Of course, most textbooks, including earlier editions of this one, regard RVAs as illegal anyway. The section thus perhaps requires careful attention more by people who already know something about relational databases than it does by beginners.) If you do cover this material, certainly point out the asymmetry (fundamental problem) and mention predicate complexity. Here are the examples from the text. First, the (symmetric) queries

1. Get S# for suppliers who supply part P1
2. Get P# for parts supplied by supplier S1

have very different formulations:

1. ( SPQ WHERE TUPLE { P# P# ('P1') } ∈ PQ { P# } ) { S# }
2. ( ( SPQ WHERE S# = S# ('S1') ) UNGROUP PQ ) { P# }

Second, the (symmetric) updates

1. Create a new shipment for supplier S6, part P5, quantity 500
2. Create a new shipment for supplier S2, part P5, quantity 500

look like this:

1. INSERT SPQ RELATION
      { TUPLE { S# S# ('S6'),
                PQ RELATION { TUPLE { P# P# ('P5'), QTY QTY ( 500 ) } } } } ;

2.
   UPDATE SPQ WHERE S# = S# ('S2')
      { INSERT PQ RELATION { TUPLE { P# P# ('P5'), QTY QTY ( 500 ) } } } ;

Moreover, all of these formulations are significantly more complicated than their SP counterparts. RVAs are thus usually contraindicated in base relvars (i.e., in logical DB designs). This doesn't mean they're contraindicated in derived relations or relvars, or always contraindicated even in base relvars.

By the way, relvar SPQ is in 5NF (and thus certainly in BCNF)!

Answers to Exercises

12.1 Heath's theorem states that if R{A,B,C} satisfies the FD A → B (where A, B, and C are sets of attributes), then R is equal to the join of its projections R1 on {A,B} and R2 on {A,C}. In the following proof of this theorem, we adopt our usual informal shorthand for tuples.

First we show that no tuple of R is lost by taking the projections and then joining those projections back together again. Let (a,b,c) ∈ R. Then (a,b) ∈ R1 and (a,c) ∈ R2, and so (a,b,c) ∈ R1 JOIN R2.

Next we show that every tuple of the join is indeed a tuple of R (i.e., the join doesn't generate any "spurious" tuples). Let (a,b,c) ∈ R1 JOIN R2. In order to generate such a tuple in the join, we must have (a,b) ∈ R1 and (a,c) ∈ R2. Hence there must exist a tuple (a,b',c) ∈ R for some b', in order to generate the tuple (a,c) ∈ R2. We therefore must have (a,b') ∈ R1. Now we have (a,b) ∈ R1 and (a,b') ∈ R1; hence we must have b = b', because A → B. Hence (a,b,c) ∈ R.

The converse of Heath's theorem would state that if R{A,B,C} is equal to the join of its projections on {A,B} and on {A,C}, then R satisfies the FD A → B. This statement is false. For example, Fig. 13.2 in the next chapter shows a relation that's certainly equal to the join of two of its projections and yet doesn't satisfy any (nontrivial) FDs at all.

12.2 The claim is almost but not quite valid. The following (pathological?) counterexample is taken from reference [6.5]. Consider the relvar

   USA { COUNTRY, STATE }

(interpreted as "STATE is part of COUNTRY," where COUNTRY is the United States of America in every tuple). Then the FD

   { } → COUNTRY

holds in this relvar, and yet the empty set {} is not a candidate key. So USA isn't in BCNF (it can be nonloss-decomposed into its two unary projections, though whether it really should be further normalized in this way might be the subject of debate).

12.3 The figure below shows the most important FDs, both those implied by the wording of the exercise and those corresponding to reasonable semantic assumptions (stated explicitly below). The attribute names are intended to be self-explanatory.

╔════════════════════════════════════════════════════════════════╗
║  ┌───────────┐                   ┌───────────┐                 ║
║  │   AREA    │                   │  DBUDGET  │                 ║
║  └─────*─────┘                   └─────*─────┘                 ║
║        │                               │                       ║
║  ┌─────┴─────┐                   ┌─────┴─────┐  ┌───────────┐  ║
║  │   OFF#    ├───────────────────*   DEPT#   *──* MGR_EMP#  │  ║
║  └─────*─────*───────┐   ┌───────*─────*─────┘  └───────────┘  ║
║        │        ┌────┼───┼────┐        │                       ║
║  ┌─────┴─────┐  │┌───┴───┴───┐│  ┌─────┴─────┐  ┌───────────┐  ║
║  │  PHONE#   *──┼┤   EMP#    ├┼──*   PROJ#   ├──*  PBUDGET  │  ║
║  └───────────┘  │└───────────┘│  └───────────┘  └───────────┘  ║
║  ┌───────────┐  │┌───────────┐│  ┌───────────┐                 ║
║  │ JOBTITLE  *──┤│   DATE    │├──*  SALARY   │                 ║
║  └───────────┘  │└───────────┘│  └───────────┘                 ║
║                 └─────────────┘                                ║
╚════════════════════════════════════════════════════════════════╝

Semantic assumptions:

• No employee is the manager of more than one department at a time.
• No employee works in more than one department at a time.
• No employee works on more than one project at a time.
• No employee has more than one office at a time.
• No employee has more than one phone at a time.
• No employee has more than one job at a time.
• No project is assigned to more than one department at a time.
• No office is assigned to more than one department at a time.
• Department numbers, employee numbers, project numbers, office numbers, and phone numbers are all "globally" unique.

Step 0: Establish initial relvar structure

Observe first that the original hierarchic structure can be regarded as a 1NF relvar DEPT0 with relation-valued attributes:

   DEPT0 { DEPT#, DBUDGET, MGR_EMP#, XEMP0, XPROJ0, XOFFICE0 }
         KEY { DEPT# }
         KEY { MGR_EMP# }

Attributes DEPT#, DBUDGET, and MGR_EMP# are self-explanatory, but attributes XEMP0, XPROJ0, and XOFFICE0 are relation-valued and do require a little more explanation:

• The XPROJ0 value within a given DEPT0 tuple is a relation with attributes PROJ# and PBUDGET.

• Likewise, the XOFFICE0 value within a given DEPT0 tuple is a relation with attributes OFF#, AREA, and (say) XPHONE0, where XPHONE0 is relation-valued in turn. XPHONE0 relations have just one attribute, PHONE#.

• Finally, the XEMP0 value within a given DEPT0 tuple is a relation with attributes EMP#, PROJ#, OFF#, PHONE#, and (say) XJOB0, where XJOB0 is relation-valued in turn. XJOB0 relations have attributes JOBTITLE and (say) XSALHIST0, where XSALHIST0 is once again relation-valued (XSALHIST0 relations have attributes DATE and SALARY).

The complete hierarchy can thus be represented by the following nested structure:

   DEPT0 { DEPT#, DBUDGET, MGR_EMP#,
           XEMP0 { EMP#, PROJ#, OFF#, PHONE#,
                   XJOB0 { JOBTITLE,
                           XSALHIST0 { DATE, SALARY } } },
           XPROJ0 { PROJ#, PBUDGET },
           XOFFICE0 { OFF#, AREA,
                      XPHONE0 { PHONE# } } }

Note: Instead of attempting to show candidate keys, we've used italics here to indicate attributes that are at least "unique within parent" (in fact, DEPT#, EMP#, PROJ#, OFF#, and PHONE# are, according to our stated assumptions, all globally unique).

Step 1: Eliminate relation-valued attributes

Now let's assume for simplicity that we wish every relvar to have a primary key specifically; i.e., we'll always designate one candidate key as primary for some reason (the reason isn't important here).
In the case of DEPT0 in particular, let's choose {DEPT#} as the primary key (and so {MGR_EMP#} becomes an alternate key). We now proceed to get rid of all of the relation-valued attributes in DEPT0, since as noted in Section 12.6 such attributes are usually undesirable:*

──────────

* We remark that the procedure given here for eliminating RVAs amounts to repeatedly executing the UNGROUP operator (see Chapter 7, Section 7.9) until the desired result is obtained. Incidentally, the procedure as described also guarantees that any multi-valued dependencies (MVDs) that aren't FDs are eliminated too; as a consequence, the relvars we eventually wind up with are in fact in 4NF, not just BCNF (see Chapter 13).

──────────

• For each RVA in DEPT0 (i.e., attributes XEMP0, XPROJ0, and XOFFICE0), form a new relvar with attributes consisting of the attributes from the underlying relation type, together with the primary key of DEPT0. The primary key of each such relvar is the combination of the attribute that previously gave "uniqueness within parent," together with the primary key of DEPT0. (Note, however, that many of those "primary keys" will include attributes that are redundant for unique identification purposes and will be eliminated later in the overall reduction procedure.) Remove attributes XEMP0, XPROJ0, and XOFFICE0 from DEPT0.

• If any relvar R still includes any RVAs, perform an analogous sequence of operations on R.

We obtain the following collection of relvars, with (as indicated) all RVAs eliminated. Note, however, that while the resulting relvars are necessarily in 1NF (of course), they aren't necessarily in any higher normal form.

   DEPT1 { DEPT#, DBUDGET, MGR_EMP# }
         PRIMARY KEY { DEPT# }
         ALTERNATE KEY { MGR_EMP# }

   EMP1 { DEPT#, EMP#, PROJ#, OFF#, PHONE# }
        PRIMARY KEY { DEPT#, EMP# }

   JOB1 { DEPT#, EMP#, JOBTITLE }
        PRIMARY KEY { DEPT#, EMP#, JOBTITLE }

   SALHIST1 { DEPT#, EMP#, JOBTITLE, DATE, SALARY }
            PRIMARY KEY { DEPT#, EMP#, JOBTITLE, DATE }

   PROJ1 { DEPT#, PROJ#, PBUDGET }
         PRIMARY KEY { DEPT#, PROJ# }

   OFFICE1 { DEPT#, OFF#, AREA }
           PRIMARY KEY { DEPT#, OFF# }

   PHONE1 { DEPT#, OFF#, PHONE# }
          PRIMARY KEY { DEPT#, OFF#, PHONE# }

Step 2: Reduce to 2NF

We now reduce the relvars produced in Step 1 to an equivalent collection of relvars in 2NF by eliminating any FDs that aren't irreducible. We consider the relvars one by one.

DEPT1: This relvar is already in 2NF.

EMP1: First observe that DEPT# is actually redundant as a component of the primary key for this relvar. We can take {EMP#} alone as the primary key, in which case the relvar is in 2NF as it stands.

JOB1: Again, DEPT# isn't needed as a component of the primary key. Since DEPT# is functionally dependent on EMP#, we have a nonkey attribute (DEPT#) that isn't irreducibly dependent on the primary key (the combination {EMP#,JOBTITLE}), and hence JOB1 isn't in 2NF. We can replace it by

   JOB2A { EMP#, JOBTITLE }
         PRIMARY KEY { EMP#, JOBTITLE }

and

   JOB2B { EMP#, DEPT# }
         PRIMARY KEY { EMP# }

However, JOB2A is a projection of SALHIST2 (see below), and JOB2B is a projection of EMP1 (renamed as EMP2 below), so both of these relvars can be discarded.

SALHIST1: As with JOB1, we can project away DEPT# entirely. Moreover, JOBTITLE isn't needed as a component of the primary key; we can take the combination {EMP#,DATE} as the primary key, to obtain the 2NF relvar

   SALHIST2 { EMP#, DATE, JOBTITLE, SALARY }
            PRIMARY KEY { EMP#, DATE }

PROJ1: As with EMP1, we can consider DEPT# as a nonkey attribute; the relvar is then in 2NF as it stands.

OFFICE1: Similar remarks apply.
PHONE1: We can project away DEPT# entirely, since the relvar (DEPT#,OFF#) is a projection of OFFICE1 (renamed as OFFICE2 below). Also, OFF# is functionally dependent on PHONE#, so we can take {PHONE#} alone as the primary key, to obtain the 2NF relvar

   PHONE2 { PHONE#, OFF# }
          PRIMARY KEY { PHONE# }

Note that this relvar isn't necessarily a projection of EMP2 (phones or offices might exist without being assigned to employees), so we can't discard it.

Hence our collection of 2NF relvars is

   DEPT2 { DEPT#, DBUDGET, MGR_EMP# }
         PRIMARY KEY { DEPT# }
         ALTERNATE KEY { MGR_EMP# }

   EMP2 { EMP#, DEPT#, PROJ#, OFF#, PHONE# }
        PRIMARY KEY { EMP# }

   SALHIST2 { EMP#, DATE, JOBTITLE, SALARY }
            PRIMARY KEY { EMP#, DATE }

   PROJ2 { PROJ#, DEPT#, PBUDGET }
         PRIMARY KEY { PROJ# }

   OFFICE2 { OFF#, DEPT#, AREA }
           PRIMARY KEY { OFF# }

   PHONE2 { PHONE#, OFF# }
          PRIMARY KEY { PHONE# }

Step 3: Reduce to 3NF

Now we reduce the 2NF relvars to an equivalent 3NF set by eliminating transitive FDs. The only 2NF relvar not already in 3NF is the relvar EMP2, in which OFF# and DEPT# are both transitively dependent on the primary key {EMP#}: OFF# via PHONE#, and DEPT# via PROJ# and also via OFF# (and hence via PHONE#). The 3NF relvars (projections) corresponding to EMP2 are

   EMP3 { EMP#, PROJ#, PHONE# }
        PRIMARY KEY { EMP# }

   X { PHONE#, OFF# }
     PRIMARY KEY { PHONE# }

   Y { PROJ#, DEPT# }
     PRIMARY KEY { PROJ# }

   Z { OFF#, DEPT# }
     PRIMARY KEY { OFF# }

However, X is PHONE2, Y is a projection of PROJ2, and Z is a projection of OFFICE2.
Hence our collection of 3NF relvars is simply

   DEPT3 { DEPT#, DBUDGET, MGR_EMP# }
         PRIMARY KEY { DEPT# }
         ALTERNATE KEY { MGR_EMP# }

   EMP3 { EMP#, PROJ#, PHONE# }
        PRIMARY KEY { EMP# }

   SALHIST3 { EMP#, DATE, JOBTITLE, SALARY }
            PRIMARY KEY { EMP#, DATE }

   PROJ3 { PROJ#, DEPT#, PBUDGET }
         PRIMARY KEY { PROJ# }

   OFFICE3 { OFF#, DEPT#, AREA }
           PRIMARY KEY { OFF# }

   PHONE3 { PHONE#, OFF# }
          PRIMARY KEY { PHONE# }

Finally, it's easy to see that each of these 3NF relvars is in fact in BCNF.

Note that, given certain (reasonable) additional semantic constraints, this collection of BCNF relvars is strongly redundant [6.1], in that the projection of relvar PROJ3 over {PROJ#,DEPT#} is at all times equal to a projection of the join of EMP3 and PHONE3 and OFFICE3.

Observe finally that it's possible to "spot" the BCNF relvars from the FD diagram (how?). Answer: Loosely, there'll be one such relvar for each box that has an arrow emerging from it; that relvar will include the attributes from that original box as a candidate key, together with an attribute for every box pointed to from the original box (and no other attributes). Of course, some refinement is needed to this loose statement in order to take care of relvars like DEPT3 that have two or more candidate keys. Note: We don't claim that it's always possible to "spot" a BCNF decomposition, only that it's often possible to do so in practical cases.

To revert to the company database example: As a subsidiary exercise (not much to do with normalization as such, but very relevant to database design in general), try extending the foregoing design to incorporate the necessary foreign key specifications as well.
Answer:

   DEPT3 { DEPT#, DBUDGET, MGR_EMP# }
         PRIMARY KEY { DEPT# }
         ALTERNATE KEY { MGR_EMP# }
         FOREIGN KEY { RENAME MGR_EMP# AS EMP# } REFERENCES EMP3

   EMP3 { EMP#, PROJ#, PHONE# }
        PRIMARY KEY { EMP# }
        FOREIGN KEY { PROJ# } REFERENCES PROJ3
        FOREIGN KEY { PHONE# } REFERENCES PHONE3

   SALHIST3 { EMP#, DATE, JOBTITLE, SALARY }
            PRIMARY KEY { EMP#, DATE }
            FOREIGN KEY { EMP# } REFERENCES EMP3

   PROJ3 { PROJ#, DEPT#, PBUDGET }
         PRIMARY KEY { PROJ# }
         FOREIGN KEY { DEPT# } REFERENCES DEPT3

   OFFICE3 { OFF#, DEPT#, AREA }
           PRIMARY KEY { OFF# }
           FOREIGN KEY { DEPT# } REFERENCES DEPT3

   PHONE3 { PHONE#, OFF# }
          PRIMARY KEY { PHONE# }
          FOREIGN KEY { OFF# } REFERENCES OFFICE3

12.4 The figure below shows the most important FDs for this exercise.

╔════════════════════════════════════════════════════════════════╗
║                                                 ┌───────────┐  ║
║                                        ┌────────*    BAL    │  ║
║                                        │        └───────────┘  ║
║                  ┌───────────┐   ┌─────┴─────┐  ┌───────────┐  ║
║                  │  ADDRESS  ├───*   CUST#   ├──*  CREDLIM  │  ║
║                  └─────*─────┘   └─────┬─────┘  └───────────┘  ║
║                        │               │        ┌───────────┐  ║
║                        │               └────────* DISCOUNT  │  ║
║                 ┌──────┼──────┐                 └───────────┘  ║
║  ┌───────────┐  │┌─────┴─────┐│  ┌───────────┐                 ║
║  │  QTYORD   *──┤│   ORD#    ├┼──*   DATE    │                 ║
║  └───────────┘  │└───────────┘│  └───────────┘                 ║
║  ┌───────────┐  │┌───────────┐│                                ║
║  │  QTYOUT   *──┤│   LINE#   ││                                ║
║  └───────────┘  │└───────────┘│                                ║
║                 └──────┬──────┘                                ║
║                 ┌──────┼──────┐                                ║
║  ┌───────────┐  │┌─────*─────┐│  ┌───────────┐                 ║
║  │   DESCN   *──┼┤   ITEM#   │├──*   QTYOH   │                 ║
║  └───────────┘  │└───────────┘│  └───────────┘                 ║
║                 │┌───────────┐│  ┌───────────┐                 ║
║                 ││  PLANT#   │├──*  DANGER   │                 ║
║                 │└───────────┘│  └───────────┘                 ║
║                 └─────────────┘                                ║
╚════════════════════════════════════════════════════════════════╝

The semantic assumptions are as follows:

• No two customers have the same ship-to address.
• Each order is identified by a unique order number.
• Each detail line within an order is identified by a line number, unique within the order.
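Assumptions of this kind are what justify the arrows in an FD diagram; for instance, "no two customers have the same ship-to address" corresponds to the FD ADDRESS → CUST#. The following is a minimal Python sketch, not part of the original answer, of how such an FD might be checked against sample data. The function name fd_holds and the sample rows are hypothetical, introduced only for illustration. Note that a check like this can only refute an FD on the data at hand; it cannot establish that the FD holds for all time, which is a matter of semantics.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in the given sample rows.

    rows is a list of dicts (one per tuple); lhs and rhs are lists of
    attribute names. This is a check on sample data only (an assumption
    of this sketch), not a statement about the relvar's constraints.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first tuple with this determinant fixes the dependent value;
        # any later tuple that disagrees violates the FD.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical sample data: customer C1 has two ship-to addresses.
shipto = [
    {"ADDRESS": "1 Main St", "CUST#": "C1"},
    {"ADDRESS": "2 High St", "CUST#": "C1"},
    {"ADDRESS": "9 Low Rd", "CUST#": "C2"},
]

print(fd_holds(shipto, ["ADDRESS"], ["CUST#"]))  # ADDRESS -> CUST#
print(fd_holds(shipto, ["CUST#"], ["ADDRESS"]))  # CUST# -> ADDRESS
```

With this sample data the first check succeeds and the second fails, which matches the direction of the arrow between ADDRESS and CUST# in the diagram.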
An appropriate set of BCNF relvars is as follows:

   CUST { CUST#, BAL, CREDLIM, DISCOUNT }
        KEY { CUST# }

   SHIPTO { ADDRESS, CUST# }
          KEY { ADDRESS }

   ORDHEAD { ORD#, ADDRESS, DATE }
           KEY { ORD# }

   ORDLINE { ORD#, LINE#, ITEM#, QTYORD, QTYOUT }
           KEY { ORD#, LINE# }

   ITEM { ITEM#, DESCN }
        KEY { ITEM# }

[...]

... fact, of course, relvars NSZ and ZCS are not independent in Rissanen's sense [12.6].

Note: We saw in the answer to Exercise 11.15 that in fact the FD ZIP → { CITY, STATE } does not hold in practice. As a subsidiary exercise, therefore, revise your answer to Exercise 12.7 to take this fact into account. Answer: We don't give a full answer here, but remark that the techniques illustrated in the answer to Exercise 12.5 are relevant.

12.8 ...

... projections we choose for the first join, though the intermediate result is different in each case. Exercise: Check this claim." Answer:

• SP JOIN PJ yields the spurious tuple (S2,P1,J2); there's no (J2,S2) tuple in JS; hence the final result is SPJ (as we've already seen).
• PJ JOIN JS yields the spurious tuple (S2,P2,J1); there's no (S2,P2) tuple in SP; hence the final result is SPJ again.
• JS JOIN ... the spurious tuple (S1,P2,J2); there's no (P2,J2) tuple in PJ; hence the final result is SPJ once again.

The section also includes the following: "We've seen that relvar SPJ, with its JD *{SP,PJ,JS}, can be 3-decomposed. The question is, should it be? And the answer is probably yes. Relvar SPJ (with its JD) suffers from a number of problems over update operations, problems ...

...

   ( HCTX UNGROUP TEXTS ) UNGROUP TEACHERS

In other words, the ungroupings can be done in either order.

13.3 JDs and 5NF

Again mostly self-explanatory. Possible points for the instructor to note:

• The "cyclic constraint" stuff (this is a helpful intuitive characterization of a relvar that's in 4NF but not 5NF; perhaps mention that such constraints seem to be rare in practice? See ...

... original approach (shown as the answer to the previous exercise), the DISCOUNT attribute would have to be moved to the SHIPTO relvar, making processing still more complicated. With the revised approach, however, the primary discount (corresponding to the primary address) can be represented by an appearance of DISCOUNT in CUST, and secondary discounts by a corresponding appearance of DISCOUNT in SECOND. Both ...

   ... is OK THEN process the order ;
   END IF ;

The advantages of this approach include the following:

• Processing is simpler (and possibly more efficient) for 99 percent of customers.
• If the ship-to address is omitted from the input order, the primary address could be used by default.
• Suppose the customer can have a different discount for each ship-to address. With ...

... always achievable.

• In practice, the first step in eliminating RVAs should be to separate them out into separate relvars. E.g., starting with HCTX, split into HCT { COURSE, TEACHERS } and HCX { COURSE, TEXTS }; then ungrouping HCT and HCX will take us straight to CT and CX. (See the discussion of Answer 12.3 in the previous chapter.)
• In practice, if a relvar is in BCNF, it's almost certainly in 4NF too ...

... deficiencies, you should denormalize only as a last resort.)

2. The original relvar should be reconstructable by joining the projections back together again. (The decomposition must be nonloss.)

3. The decomposition process should preserve dependencies. (Preferably decompose into independent projections, though as we know, this objective and the objective of decomposing to 5NF, or even just to BCNF, can unfortunately be in conflict.)

... ship-to address as that customer's primary address. For the 99 percent, of course, the primary address is the only address. Any other addresses we refer to as secondary. Relvar CUST can then be redefined as

   CUST { CUST#, ADDRESS, BAL, CREDLIM, DISCOUNT }
        KEY { CUST# }

and relvar SHIPTO can be replaced by

   SECOND { ADDRESS, CUST# }
          KEY { ADDRESS }

Here CUST contains the primary address, and SECOND contains ...

   ... ITEM#, PLANT#, QTYOH, DANGER }
       KEY { ITEM#, PLANT# }

12.5 Consider the processing that must be performed by a program handling orders. We assume that the input order specifies customer number, ship-to address, and details of the items ordered (item numbers and quantities).

   retrieve CUST WHERE CUST# = input CUST# ;
   check balance, credit limit, etc. ;
   retrieve SHIPTO WHERE ADDRESS = input ADDRESS AND CUST# ...
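The nonloss requirement (the original relvar should be reconstructable by joining its projections) and Heath's theorem from the answer to Exercise 12.1 can be tried out on small examples. What follows is a minimal Python sketch under stated assumptions: relations are modeled as sets of (a, b, c) triples rather than Tutorial D relvars, and the helper names project, join_on_a, and is_nonloss, as well as all sample data, are hypothetical.

```python
def project(rel, idx):
    """Projection of a set of tuples onto the positions listed in idx."""
    return {tuple(t[i] for i in idx) for t in rel}


def join_on_a(r1, r2):
    """Natural join of r1{A,B} and r2{A,C} on A, yielding (a, b, c) triples."""
    return {(a, b, c) for (a, b) in r1 for (a2, c) in r2 if a == a2}


def is_nonloss(rel):
    """True if rel{A,B,C} equals the join of its projections on {A,B} and {A,C}."""
    return join_on_a(project(rel, (0, 1)), project(rel, (0, 2))) == set(rel)


# A -> B holds here, so Heath's theorem says the decomposition is nonloss.
r_fd = {("a1", "b1", "c1"), ("a1", "b1", "c2"), ("a2", "b2", "c1")}

# A -> B fails here, and the join produces spurious tuples.
r_lossy = {("a1", "b1", "c1"), ("a1", "b2", "c2")}

# A -> B also fails here, yet the decomposition is nonloss: the converse
# of Heath's theorem does not hold.
r_converse = {("a1", b, c) for b in ("b1", "b2") for c in ("c1", "c2")}

for name, rel in [("r_fd", r_fd), ("r_lossy", r_lossy), ("r_converse", r_converse)]:
    print(name, is_nonloss(rel))
```

The third relation is a made-up counterexample in the same spirit as the one the answer to Exercise 12.1 refers to: it is equal to the join of its projections even though neither A → B nor A → C holds in it.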
