An Introduction to Database Systems 8Ed - C J Date - Solutions Manual Episode 2 Part 8 doc

Copyright (c) 2003 C. J. Date page 25.6

"Is an object DBMS really a DBMS?"

Self-explanatory. But the point, perhaps, is this: "Object DBMSs" do surely have a role to play; there are surely problems out there for which an "object DBMS" is the right solution. No argument here. No: The argument, rather, is simply that those "DBMSs" are not──for all kinds of reasons──DBMSs in the sense in which the database community understands and uses that term. It might have been better not to call them DBMSs. Reject the jingle "persistence orthogonal to type"!

25.6 Summary

For this chapter, alone out of the whole book, it seems worth including most of the summary section in these notes, because it really serves not just as a summary per se but also as a critical analysis of the material discussed and as a lead-in to what might constitute a "good" object model. So here goes (the following is reworded just a little from the original):

(Begin quote)

• Object classes (i.e., types): Obviously essential (indeed, they're the most fundamental construct of all).

• Objects: Objects themselves, both "mutable" and "immutable," are clearly essential──though I'd prefer to call them simply variables and values, respectively. *

──────────
* Actually it might be argued that "mutable objects" aren't quite the same thing as variables in the classical sense. The one operator that must be available for a variable V is "assignment to V"──it's precisely the availability of that operator that makes V variable! But objects aren't required to have an associated assignment "method" (and indeed they typically don't); instead, such a method exists only if the class definer defines it.
──────────

• Object IDs: Unnecessary, and in fact undesirable (at the model level, that is), because they're basically just pointers. Note too the argument, elaborated in the next chapter, that OIDs are fundamentally incompatible with a good model of inheritance. One problem──not the only one──is that OIDs lead to the possibility of shared variables, a possibility that doesn't exist (nor do we want it to) in the relational world.

Note: Two points arise here:

1. Since I first wrote that sentence about shared variables (in the Instructor's Manual for the seventh edition), the possibility in question has been introduced into the SQL world. I regard this state of affairs as further evidence that the relational world and the SQL world are not the same. Worlds apart, in fact.

2. Don't fall into the trap of thinking that if two distinct tuples in a relational database contain the same foreign key value and thus reference the same target tuple, that target tuple is a "shared variable." It isn't. It isn't a variable at all, in fact (tuples are values). See further discussion in the next chapter.

• Encapsulation: As explained in Section 25.2, "encapsulated" just means scalar, and I would prefer to use that term (always remembering that some "objects" aren't scalar anyway).

• Instance variables: First, private instance variables are by definition merely implementation matters and hence not relevant to the definition of an abstract model, which is what we're concerned with here. Second, public instance variables don't exist in a pure object system and are thus also not relevant. I conclude that instance variables can be ignored; "objects" should be manipulable solely by "methods" (see below).

• Containment hierarchy: We saw in Section 25.3 that containment hierarchies are misleading and in fact a misnomer, since they typically contain OIDs, not "objects." Note: A (nonencapsulated) hierarchy that really did include objects per se would be permissible, however, though usually contraindicated; it would be analogous, somewhat, to a relvar with relation-valued attributes (see Parts II and III of this book). Though we'd have to be careful yet again over the values vs.
variables distinction.

• Methods: The concept is essential, of course, though I would prefer to use the more conventional term operators. * Bundling methods with classes is not essential, however, and leads to several problems [3.3]; I would prefer to define "classes" (types) and "methods" (operators) separately, as in Chapter 5, and thereby avoid the notion of "target objects" and "selfish methods." (It's worth noting, incidentally, that the problems introduced by bundling are not just syntactic ones. Again, see reference [3.3].)

──────────
* Another reason for avoiding the term "method" is that the term is used in the literature in two different senses: Sometimes it seems to mean the operator as seen by the user, sometimes it seems to mean the code that implements that operator. Yet another example of confusing model and implementation?
──────────

There are certain operators I'd insist on, too: Selectors (which among other things effectively provide a way of writing literal values of the relevant type), THE_ operators, assignment and equality comparison operators, and type testing and TREAT DOWN operators (see Chapter 20).

I reject "constructor functions," however. Constructors construct variables; since the only kind of variable we want in the database is, specifically, the relvar, the only "constructor" we need is an operator that creates a relvar (e.g., CREATE TABLE, in SQL terms). Selectors, by contrast, select values. Also, of course, constructors return pointers to the constructed variables, while selectors return the selected values per se.

I would also stress the distinction between read-only and update operators (see Chapter 5).

• Messages: Again, the concept is essential, though I'd prefer to use the more conventional term invocation (and, again, I'd avoid the notion that such invocations have to be directed at some "target object" but instead treat all arguments equally).
• Class hierarchy (and related notions──inheritance, substitutability, inclusion polymorphism, and so on): Desirable but orthogonal (I see class hierarchy support, if provided, as just part of support for classes──i.e., types──per se).

• Class vs. instance vs. collection: The distinctions are essential, of course, but orthogonal (the concepts are distinct, and that's really all that needs to be said).

• Relationships: To repeat a point made earlier in these notes, it's not a good idea to treat "relationships" as a formally distinct construct──especially if it's only binary relationships that receive such special treatment. I also don't think it's a good idea to treat the associated referential integrity constraints in some manner that's divorced from the treatment, if any, of integrity constraints in general (see below).

• Integrated database programming language: Nice to have, but orthogonal. However, the languages actually supported in today's object systems are typically procedural (3GLs) and therefore──I would argue──nasty to have (another giant step backward, in fact).

And here's a list of features that "the object model" typically doesn't support, or doesn't support well:

• Ad hoc queries: Early object systems typically didn't support ad hoc queries at all. More recent systems do, but they do so, typically, either by breaking encapsulation or by imposing limits on the queries that can be asked * (meaning in this latter case that the queries aren't really ad hoc after all).

──────────
* I.e., by restricting them, via path expressions, to predefined paths in the database──as in IMS.
──────────

• Views: Typically not supported (for essentially the same reasons that ad hoc queries are typically not supported).
Note: Some object systems do support "derived" or "virtual" instance variables (necessarily public ones); e.g., the instance variable AGE might be derived by subtracting the value of the instance variable BIRTHDATE from the current date. However, such a capability falls far short of a full view mechanism──and in any case I've already rejected the notion of public instance variables.

• Declarative integrity constraints: Typically not supported (for essentially the same reasons that ad hoc queries and views are typically not supported). In fact, they're typically not supported even by systems that do support ad hoc queries.

• Foreign keys: The "object model" has several different mechanisms for dealing with referential integrity, none of which is quite the same as the relational model's more uniform foreign key mechanism. Such matters as ON DELETE RESTRICT and ON DELETE CASCADE are typically left to procedural code (probably methods, possibly application code).

• Closure: What's (or, rather, where's) the object analog of the relational closure property?

• Catalog: Where's the catalog in an object system? What does it look like? Are there any standards? Note: These questions are rhetorical, of course. What actually happens is that a catalog has to be built by the professional staff whose job it is to tailor the object DBMS for whatever application it has been installed for, as discussed at the end of Section 25.5. (That catalog will then be application-specific, as will the overall tailored DBMS.)
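The foreign key point above can be made concrete with a small sketch. It uses Python's standard sqlite3 module purely as a convenient SQL engine; the tables dept, emp, and proj and the helper names are hypothetical and do not come from the text. The referential actions are declared once in the schema, and the deleting code contains no logic for finding or cleaning up referencing rows.

```python
import sqlite3

# Hypothetical schema for illustration only: dept, emp, and proj are not
# taken from the text. emp references dept with ON DELETE CASCADE; proj
# references dept with ON DELETE RESTRICT.
SCHEMA = """
CREATE TABLE dept (dno TEXT PRIMARY KEY);
CREATE TABLE emp (
    eno TEXT PRIMARY KEY,
    dno TEXT NOT NULL REFERENCES dept (dno) ON DELETE CASCADE
);
CREATE TABLE proj (
    jno TEXT PRIMARY KEY,
    dno TEXT NOT NULL REFERENCES dept (dno) ON DELETE RESTRICT
);
INSERT INTO dept VALUES ('D1'), ('D2');
INSERT INTO emp VALUES ('E1', 'D1'), ('E2', 'D1'), ('E3', 'D2');
INSERT INTO proj VALUES ('J1', 'D2');
"""


def build_database() -> sqlite3.Connection:
    """Create an in-memory database with declarative referential actions."""
    conn = sqlite3.connect(":memory:")
    # SQLite checks foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def delete_dept(conn: sqlite3.Connection, dno: str) -> bool:
    """Try to delete a department; the DBMS applies the declared rules."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DELETE FROM dept WHERE dno = ?", (dno,))
        return True
    except sqlite3.IntegrityError:
        # Raised when an ON DELETE RESTRICT rule blocks the deletion.
        return False


def emp_numbers(conn: sqlite3.Connection) -> list:
    """Return the employee numbers currently in emp, in sorted order."""
    return [row[0] for row in conn.execute("SELECT eno FROM emp ORDER BY eno")]
```

Under these assumptions, deleting department D1 also removes its employees through the CASCADE rule, while deleting D2 is refused because a proj row still references it. In an object system of the kind described, both behaviors would typically have to be written out by hand in methods or application code.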
To summarize, then, the good (essential, fundamental) features of the "object model"──i.e., the ones we really want to support──are as shown in the following table:

┌──────────────────┬─────────────────────┬───────────────────────┐
│ Feature          │ Preferred term      │ Remarks               │
├──────────────────┼─────────────────────┼───────────────────────┤
│ object class     │ type                │ scalar & nonscalar;   │
│                  │                     │ possibly user-defined │
│ immutable object │ value               │ scalar & nonscalar    │
│ mutable object   │ variable            │ scalar & nonscalar    │
│ method           │ operator            │ including selectors,  │
│                  │                     │ THE_ ops, ":=", "=",  │
│                  │                     │ & type test operators │
│ message          │ operator invocation │ no "target" operand   │
└──────────────────┴─────────────────────┴───────────────────────┘

(End quote)

Answers to Exercises

25.1 We comment here on the term object itself (only; see the body of the chapter for the rest). Here are some "definitions" from the literature:

• "Objects are reusable modules of code that store data, information about relationships between data and applications, and processes that control data and relationships" (from a commercial product announcement; this sentence is hard enough to parse, let alone understand).

• "An object is a chunk of private memory with a public interface" (from reference [25.38]; the definition is true enough, but hardly very precise; note too that it supports the position argued in reference [25.16] to the effect that the object model is really a storage model, not a data model).

• "An object is an abstract machine that defines a protocol through which users of the object may interact" (from the introduction to reference [25.42]).

• "An object is a software structure that contains data and programs" (from reference [25.24]; actually, objects don't contain programs, in general──class-defining objects contain programs).
And my "favorite" (at the time of writing, at any rate) is this one:

• "Object: A concrete manifestation of an abstraction; an entity with a well-defined boundary that encapsulates state and behavior; an instance of a class
Instance: A concrete manifestation of an abstraction; an entity to which a set of operations can be applied and that has a state that stores the effects of the operations" (from reference [14.5]). *

Note that none of these "definitions" gets to what we would regard as the heart of the matter──viz., that an object is essentially just a value (if immutable) or a variable (otherwise).

──────────
* If object and instance mean the same thing, why are there two terms? If they don't, what's the difference?
──────────

It's worth commenting too on the notion that "everything's an object." Here are some examples of constructs that aren't objects (at least, they aren't in most object systems): instance variables; relationships (at least in ODMG [25.11]); methods; OIDs; program variables. And in some systems (again including ODMG) values aren't objects either.

25.2 Some of the advantages of OIDs are as follows:

• They aren't "intelligent." See reference [14.10] for an explanation of why this state of affairs is desirable.

• They never change so long as the object they identify remains in existence.

• They're noncomposite. See references [14.11] and [19.8] for an explanation of why this state of affairs is desirable.

• Everything in the database is identified in the same uniform way (contrast the situation with relational databases).

• There's no need to repeat user keys in referencing objects. There's thus no need for any ON UPDATE rules.

Some of the disadvantages──the fact that they don't avoid the need for user keys, the fact that they lead to a low-level pointer chasing style of programming, and the fact that they apply to "base" (nonderived) objects only──were discussed briefly in Sections 25.2-25.4.
And the huge disadvantage, to the effect that they're incompatible with what I would regard as a "good" model of inheritance, is discussed in detail in the next chapter.

Possible OID implementation techniques include:

• Physical disk addresses (fast but poor data independence)

• Logical disk addresses (i.e., page and offset addresses; fairly fast, better data independence)

• Artificial IDs (e.g., timestamps, sequence numbers; need mapping to actual addresses)

25.3 See reference [25.15].

25.4 No answer provided.

25.5 We don't give a detailed answer to this exercise, but we do offer a few comments on the question of object database design in general. It's sometimes claimed that object systems make database design (as well as database use) easier, because they provide high-level modeling constructs and support those constructs directly in the system. (By contrast, relational systems involve an extra level of indirection: namely, the mapping process from real-world objects to relvars, attributes, foreign keys, and so on.) And this claim does have some merit. However, it overlooks the larger question: How is object database design done in the first place? The fact is, "the object model" as usually understood involves far more degrees of freedom──in other words, more choices──than the relational model does; and I, at least, am not aware of any good guidelines that might help in making those choices. For example, how do we decide whether to represent, say, the set of all employees as an array, or a list, or a set (etc., etc.)? "A powerful data model needs a powerful design methodology and this is a liability of the object model" (paraphrased somewhat from reference [25.24]; I would argue that that qualifier "powerful" should really be "complicated").

25.6 No answer provided (it's straightforward, but tedious).

25.7 No answer provided (ditto).

25.8 No answer provided (ditto).
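As a small illustration of the "degrees of freedom" remark in answer 25.5, the following sketch shows how the choice among a list, an array, and a set changes the observable behavior of the same employee data. It is only a sketch under stated assumptions: the employee numbers are hypothetical, and a Python tuple stands in for a fixed-length array.

```python
# Hypothetical employee numbers; "E1" is entered twice on purpose.
entered = ["E3", "E1", "E2", "E1"]

as_list = list(entered)        # ordered, duplicates kept
as_array = tuple(entered)      # stand-in for an array: ordered, duplicates kept
as_set = frozenset(entered)    # unordered, duplicates eliminated

# How many "employees" there are depends on the representation chosen.
cardinalities = {
    "list": len(as_list),
    "array": len(as_array),
    "set": len(as_set),
}

# Whether two collections holding the same employees compare equal also
# depends on the representation: order matters for lists, not for sets.
order_matters = {
    "list": ["E1", "E2"] != ["E2", "E1"],
    "set": frozenset(["E1", "E2"]) != frozenset(["E2", "E1"]),
}
```

Each choice answers "how many employees are there?" and "are these two collections the same?" differently, which is the kind of design decision for which the answer above finds no good guidelines.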
25.9 We don't give a detailed answer to this exercise, but we do make one remark concerning its difficulty. First, let's agree to use the term "delete" as a shorthand to mean "make a candidate for physical deletion" (i.e., by erasing all references to the object in question). Then in order to delete an object X, we must first find all objects Y that include a reference to X; for each such object Y, we must then either delete that object Y, or at least erase the reference in that object Y to the object X (by setting that reference to the special value (?) nil). And part of the problem is that it isn't possible to tell from the data definition alone exactly which objects include a reference to X, nor even how many of them there are. Consider employees, for example, and the object class ESET. In principle, there could be any number of ESET instances, and any subset of those ESET instances could include a reference to some specific employee.

25.10 There are at least nine possible hierarchies:

S contains ( P contains ( J ) )
S contains ( J contains ( P ) )
S contains ( P and J )
P contains ( J contains ( S ) )
P contains ( S contains ( J ) )
P contains ( J and S )
J contains ( S contains ( P ) )
J contains ( P contains ( S ) )
J contains ( S and P )

"Which is best?" is unanswerable without additional information, but almost certainly all of them are bad. That is, whichever hierarchy is chosen, there'll always be numerous problems that are hard to solve in terms of that particular hierarchy.

25.11 First of all, there are the nine "obvious" designs discussed in the previous answer. But there are many other candidate designs as well──for example, an "SP" class that shows directly which suppliers supply which parts and also includes two embedded sets of projects, one for the supplier and one for the part. There's also a very simple design involving no (nontrivial) hierarchies at all, consisting of an "SP" class, a "PJ" class, and a "JS" class.

25.12 The performance factors discussed were clustering, caching, pointer swizzling, and executing methods at the server. All of these techniques are applicable to any system that provides a sufficient level of data independence; they are thus not truly "object-specific." In fact, the idea of using the logical database definition to decide what physical clustering to use, as some object systems do, could be seen as potentially undermining data independence. Note: It should be pointed out too that another very important performance factor, namely optimization, typically does not apply to object systems.

25.13 Declarative support, if feasible, is always better than procedural support (for everything, not just integrity constraints). In a nutshell, as pointed out several times earlier in this manual (and in the book), declarative support means the system does the work instead of the user. That's why relational systems support declarative queries, declarative view definitions, declarative integrity constraints, and so on.

25.14 See the discussion of relationships in Section 25.5.

*** End of Chapter 25 ***

Chapter 26

Object/Relational Databases

Principal Sections

• The First Great Blunder
• The Second Great Blunder
• Implementation issues
• Benefits of true rapprochement
• SQL facilities

General Remarks

At first blush, this chapter might be thought a little lightweight (at least, until we get to the section on SQL). But there's a reason for this state of affairs! The fact is, the label "object/relational" is, primarily, vendor hype. As the text asserts: A true "object/relational" system would be nothing more than a true relational system! For consider:

• "Object/relational," if it means anything at all, has to mean marrying (good) object ideas with relational ideas.
• We saw in Chapter 25 that "good object ideas" simply means proper data type support.

• The relational model presupposes proper data type support (that's what domains are, data types, as we saw in Chapter 5).

• So we don't have to do anything to the relational model──except implement it, an idea that doesn't seem to have been tried very much──in order to achieve the object functionality we desire.

It follows that much of the stuff one might have been led by vendor hype to expect in this chapter──the stuff regarding user-defined types and type inheritance in particular (or "data blades," or "data cartridges," etc.)──has already been discussed earlier in the book.

[...]

... contains a dangling reference and deal with it appropriately).

26.6 No answer provided (it's tedious but essentially "straightforward").

26.7 No answer provided.

26.8 No answer provided.

*** End of Chapter 26 ***

Chapter 27

The World Wide Web and XML

Principal Sections

• The Web and the Internet
• An overview ...

[...]

... to be any overwhelming reason to do so. Not to mention the fact that any such technology would obviously suffer from problems similar to those that hierarchic database technology already suffers from (see, e.g., Chapter 13 of reference [1.5] or the annotation to references [27.3] and [27.6])." Note here the reference to hierarchic database technology, by the way. XML documents are hierarchic; XML databases ...

[...]

... too, and noncircular circles and the like can't occur. (And run-time type errors specifically can occur only in the context of TREAT DOWN.)

26.4 Yes and no (probably more no than yes). No further answer provided.

26.5 It might make sense, but the variable won't be automatically maintained (i.e., if the row the variable points to is deleted, it'll be up to the user to realize that the variable now contains ...
[...]

... received, and processed on the Web like HTML.")

──────────
* In this connection, see the annotation to reference [27.3].
──────────

My own opinion regarding those "pretty strong claims" is summed up in the subsection "XML Databases" at the end of Section 27.6. To quote: "[We] saw in Chapter 3 that the relational model is both necessary and sufficient to represent any ...

[...]

... hierarchic; XML databases (by which I mean what are sometimes called "native" XML databases) are thus hierarchic databases, and all of the old arguments against hierarchic databases apply directly (just as they do to object databases, as discussed in Chapter 25). In this connection, see the annotation to reference [27.6].

The purpose of this chapter, then, is to try to get at the true nature of what ...

[...]

... G, pages 421-422.

   VAR E ELLIPSE ;
   VAR XC REF_TO_CIRCLE ;

   E := CIRCLE ( LENGTH ( 5.0 ), POINT ( 0.0, 0.0 ) ) ;
   XC := TREAT_DOWN_AS_REF_TO_CIRCLE ( REF_TO ( E ) ) ;
   THE_A ( E ) := LENGTH ( 6.0 ) ;

Ignoring irrelevancies, a relational analog of this example might look something like this:

   VAR R1 RELATION { K ELLIPSE } KEY { K } ;
   VAR R2 RELATION { K CIRCLE } FOREIGN ...

[...]

... world completely──all databases will become XML databases, SQL will disappear (or be subsumed by XML), the relational model just won't be relevant any more, and on and on. Pretty strong claims for something that started out to be, in essence, nothing more than an approach to the data interchange problem! (To quote the XML specification [27.25], the original purpose of XML was "to allow generic SGML to ...
[...]

... years, and it's probably true to say that few products are actually adhering to it any more (in other words, I'd like to feel our arguments didn't completely fall on deaf ears). As already noted, however, just about every product on the market seems to be committing the second blunder!──in fact, it's at least arguable that the SQL standard commits it (see Section 26.6) ...

[...]

... would be a shame to walk away from nearly 35 years of solid relational R&D.

26.6 SQL Facilities

To quote: "SQL:1999's object/relational features are the most obvious and extensive difference between it and its predecessor SQL:1992." Remind students that:

• SQL supports two kinds of user-defined types, DISTINCT types and structured types, both of which can be used as ...

[...]

... relvar "object class" has public instance variables and only optionally has methods (it's definitely not "encapsulated"). So one has A and not B, while the other has B and only optionally has A! Another logical difference.

• There's yet another huge logical difference between the column definitions "SAL NUMERIC" and "WORKS_FOR COMPANY": NUMERIC is a data type, COMPANY is a relvar.

• People who advocate the ...
