SQL VISUAL QUICKSTART GUIDE- P7 docx

Foreign Keys

Information about different entities is stored in different tables, so you need a way to navigate between tables. The relational model provides a mechanism called a foreign key to associate tables. A foreign key has these characteristics:

◆ It’s a column (or group of columns) in a table whose values relate to, or reference, values in some other table.
◆ It ensures that rows in one table have corresponding rows in another table.
◆ The table that contains the foreign key is the referencing or child table. The other table is the referenced or parent table.
◆ A foreign key establishes a direct relationship to the parent table’s primary key (or any candidate key), so foreign-key values are restricted to existing parent-key values. This constraint is called referential integrity. A particular row in a table appointments must have an associated row in a table patients, for example, or there would be appointments for patients who don’t exist or can’t be identified. An orphan row is a row in a child table for which no associated parent-table row exists. In a properly designed database, you can’t insert new orphan rows or make orphans out of existing child-table rows by deleting associated rows in the parent table.
◆ The values in the foreign key have the same domain as the parent key. Recall from “Tables, Columns, and Rows” earlier in this chapter that a domain defines the set of valid values for a column.
◆ Unlike primary-key values, foreign-key values can be null (empty); see the Tips in this section.
◆ A foreign key can have a different column name than its parent key.
◆ Foreign-key values generally aren’t unique in their own table.
◆ I’ve made a simplification in the first point: In reality, a foreign key can reference the primary key of its own table (rather than only some other table). A table employees with the primary key emp_id can have a foreign key boss_id, for example, that references the column emp_id.
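The employees example above can be tried out in a few lines. This is only a minimal sketch: it assumes SQLite driven through Python's standard sqlite3 module (any DBMS with foreign-key support would behave similarly), and the column emp_name, the sample rows, and the helper try_sql are hypothetical names introduced for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
    CREATE TABLE employees (
        emp_id   TEXT PRIMARY KEY,
        emp_name TEXT NOT NULL,                      -- hypothetical column
        boss_id  TEXT REFERENCES employees (emp_id)  -- nullable, references its own table
    )
""")

# A null foreign key is allowed: the top boss has no boss.
conn.execute("INSERT INTO employees VALUES ('E01', 'Top Boss', NULL)")
# A foreign-key value must match an existing parent-key value.
conn.execute("INSERT INTO employees VALUES ('E02', 'Report', 'E01')")


def try_sql(statement):
    """Run one statement; return 'ok', or 'rejected' on an integrity violation."""
    try:
        conn.execute(statement)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# Inserting an orphan row (boss E99 doesn't exist) is refused.
orphan_insert = try_sql("INSERT INTO employees VALUES ('E03', 'Orphan', 'E99')")
# Deleting a parent row that E02 still points to is refused too.
parent_delete = try_sql("DELETE FROM employees WHERE emp_id = 'E01'")

print(orphan_insert, parent_delete)
```

Both attempts are rejected, so the table never contains a row whose boss_id points at a missing employee. Note that boss_id has a different name than the key it references, as the list above permits.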
A table whose foreign key references its own primary key is called self-referencing.

Figure 2.9 shows a primary- and foreign-key relationship between two tables. After a foreign key is defined, your DBMS will enforce referential integrity. You can’t insert the following row into the child table titles, because the pub_id value P05 doesn’t exist in the parent table publishers:

T07   I Blame My Mother   P05

You can insert this row only if the foreign key accepts nulls:

T07   I Blame My Mother   NULL

This row is legal:

T07   I Blame My Mother   P03

✔ Tips
■ See also “Specifying a Foreign Key with FOREIGN KEY” in Chapter 11.
■ SQL lets you specify the referential-integrity action that the DBMS takes when you attempt to update or delete a parent-table key value to which foreign-key values point; see the Tips in “Specifying a Foreign Key with FOREIGN KEY” in Chapter 11.
■ Allowing nulls in a foreign-key column complicates enforcement of referential integrity. In practice, nulls in a foreign key often remain null temporarily, pending a real-life decision or discovery; see “Nulls” in Chapter 3.

publishers (primary key: pub_id)
pub_id   pub_name
P01      Abatis Publishers
P02      Core Dump Books
P03      Schadenfreude Press
P04      Tenterhooks Press

titles (primary key: title_id; foreign key: pub_id)
title_id   title_name           pub_id
T01        1977!                P01
T02        200 Years of Ger…    P03
T03        Ask Your System…     P02
T04        But I Did It Unco…   P04
T05        Exchange of Plat…    P04
T06        How About Never?     P01

Figure 2.9 The column pub_id is a foreign key of the table titles that references the column pub_id of publishers.

Relationships

A relationship is an association established between common columns in two tables. A relationship can be:

◆ One-to-one
◆ One-to-many
◆ Many-to-many

One-to-one

In a one-to-one relationship, each row in table A can have at most one matching row in table B, and each row in table B can have at most one matching row in table A.
Even though it’s practicable to store all the information from both tables in only one table, one-to-one relationships usually are used to segregate confidential information for security reasons, speed queries by splitting single monolithic tables, and avoid inserting nulls into tables (see “Nulls” in Chapter 3). A one-to-one relationship is established when the primary key of one table also is a foreign key referencing the primary key of another table (Figures 2.10 and 2.11).

titles
title_id   title_name
T01        1977!
T02        200 Years of Ger…
T03        Ask Your System…
T04        But I Did It Unco…

royalties
title_id   advance
T01        10000
T02        1000
T04        20000

Figure 2.10 A one-to-one relationship. Each row in titles can have at most one matching row in royalties, and each row in royalties can have at most one matching row in titles. Here, the primary key of royalties also is a foreign key referencing the primary key of titles.

[Diagram: titles (title_id, title_name) joined by a connecting line to royalties (title_id, advance)]

Figure 2.11 This diagram shows an alternative way to depict the one-to-one relationship in Figure 2.10. The connecting line indicates associated columns. The key symbol indicates a primary key.

One-to-many

In a one-to-many relationship, each row in table A can have many (zero or more) matching rows in table B, but each row in table B has only one matching row in table A. A publisher can publish many books, but each book is published by only one publisher, for example. One-to-many relationships are established when the primary key of the one table appears as a foreign key in the many table (Figures 2.12 and 2.13).

publishers
pub_id   pub_name
P01      Abatis Publishers
P02      Core Dump Books
P03      Schadenfreude Press
P04      Tenterhooks Press

titles
title_id   title_name           pub_id
T01        1977!                P01
T02        200 Years of Ger…    P03
T03        Ask Your System…     P02
T04        But I Did It Unco…   P04
T05        Exchange of Plati…   P04

Figure 2.12 A one-to-many relationship. Each row in publishers can have many matching rows in titles, and each row in titles has only one matching row in publishers. Here, the primary key of publishers (the one table) appears as a foreign key in titles (the many table).

[Diagram: publishers (pub_id, pub_name) joined by a connecting line to titles (title_id, title_name, pub_id)]

Figure 2.13 This diagram shows an alternative way to depict the one-to-many relationship in Figure 2.12. The connecting line’s unadorned end indicates the one table, and the arrow indicates the many table.

Many-to-many

In a many-to-many relationship, each row in table A can have many (zero or more) matching rows in table B, and each row in table B can have many matching rows in table A. Each author can write many books, and each book can have many authors, for example. A many-to-many relationship is established only by creating a third table called a junction table, whose composite primary key is a combination of both tables’ primary keys; each column in the composite key separately is a foreign key. This technique always produces a unique value for each row in the junction table and splits the many-to-many relationship into two separate one-to-many relationships (Figures 2.14 and 2.15).

✔ Tips
■ Joins (for performing operations on multiple tables) are covered in Chapter 7.
■ You can establish a many-to-many relationship without creating a third table if you add repeating groups to the tables, but that method violates first normal form; see the next section.
■ A one-to-many relationship also is called a parent–child or master–detail relationship.
■ A junction table also is called an associating, linking, pivot, connection, or intersection table.

titles
title_id   title_name
T01        1977!
T02        200 Years of Ger…
T03        Ask Your System…
T04        But I Did It Unco…
T05        Exchange of Plati…

title_authors
title_id   au_id
T01        A01
T02        A01
T03        A05
T04        A03
T04        A04
T05        A04

authors
au_id   au_fname   au_lname
A01     Sarah      Buchman
A02     Wendy      Heydemark
A03     Hallie     Hull
A04     Klee       Hull

Figure 2.14 A many-to-many relationship. The junction table title_authors splits the many-to-many relationship between titles and authors into two one-to-many relationships. Each row in titles can have many matching rows in title_authors, as can each row in authors. Here, title_id in title_authors is a foreign key that references the primary key of titles, and au_id in title_authors is a foreign key that references the primary key of authors.

[Diagram: titles (title_id, title_name) joined to title_authors (title_id, au_id), which is joined to authors (au_id, au_fname, au_lname)]

Figure 2.15 This diagram shows an alternative way to depict the many-to-many relationship in Figure 2.14.

Normalization

It’s possible to consolidate all information about books (or any entity type) into a single monolithic table, but that table would be loaded with duplicate data; each title (row) would contain redundant author, publisher, and royalty details. Redundancy is the enemy of database users and administrators: It causes databases to grow wildly large, it slows queries, and it’s a maintenance nightmare. (When someone moves, you want to change her address in one place, not thousands of places.) Redundancies lead to a variety of update anomalies, that is, difficulties with operations that insert, update, and delete rows.

Normalization is the process (a series of steps) of modifying tables to reduce redundancy and inconsistency. After each step, the database is in a particular normal form. The relational model defines three normal forms, named after famous ordinal numbers:

◆ First normal form (1NF)
◆ Second normal form (2NF)
◆ Third normal form (3NF)

Each normal form is stronger than its predecessors; a database in 3NF also is in 2NF and 1NF.
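To make the cost of redundancy concrete, here is a small sketch of an update anomaly in a monolithic table. Everything in it is an assumption made for illustration: the table books_flat, its columns, and the sample rows are hypothetical, and SQLite (through Python's sqlite3 module) is used only as a convenient engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A hypothetical single-table design: publisher details repeat on every title row.
conn.execute("""
    CREATE TABLE books_flat (
        title_id   TEXT PRIMARY KEY,
        title_name TEXT,
        pub_name   TEXT,
        pub_city   TEXT
    )
""")
conn.executemany(
    "INSERT INTO books_flat VALUES (?, ?, ?, ?)",
    [
        ("X01", "First Title", "Example Press", "Old City"),
        ("X02", "Second Title", "Example Press", "Old City"),
        ("X03", "Third Title", "Example Press", "Old City"),
    ],
)

# The publisher moves, but the update reaches only one of its three rows.
conn.execute("UPDATE books_flat SET pub_city = 'New City' WHERE title_id = 'X01'")

# The same publisher now appears to be in two cities at once.
cities = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT pub_city FROM books_flat "
        "WHERE pub_name = 'Example Press' ORDER BY pub_city"
    )
]
print(cities)
```

The table now holds two conflicting answers to “Where is Example Press?” Storing the city once, in a table of its own, removes the possibility of this kind of disagreement.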
Higher normalization levels tend to increase the number of tables relative to lower levels. Lossless decomposition ensures that table splitting doesn’t cause information loss, and dependency-preserving decomposition ensures that relationships aren’t lost. The matching primary- and foreign-key columns that appear when tables are split are not considered to be redundant data. Normalization is not systematic; it’s an iterative process that involves repeated table splitting and rejoining and refining until the database designer is (temporarily) happy with the result.

First normal form

A table in first normal form:

◆ Has columns that contain only atomic values
and
◆ Has no repeating groups

An atomic value, also called a scalar value, is a single value that can’t be subdivided (Figure 2.16). A repeating group is a set of two or more logically related columns (Figure 2.17). To fix these problems, store the data in two related tables (Figure 2.18).

A database that violates 1NF causes problems:

◆ Multiple values in a row–column intersection mean that the combination of table name, column name, and key value is insufficient to address every value in the database.
◆ It’s difficult to retrieve, insert, update, or delete a single value (among many) because you must rely on the order of the values.
◆ Queries are complex (a performance killer).
◆ The problems that further normalization solves become unsolvable.

title_id   title_name                         authors
T01        1977!                              A01
T04        But I Did It Unconsciously         A03, A04
T11        Perhaps It's a Glandular Problem   A03, A04, A06

Figure 2.16 In first normal form, each table’s row–column intersection must contain a single value that can’t be subdivided meaningfully. The column authors in this table lists multiple authors and so violates 1NF.

title_id   title_name                         author1   author2   author3
T01        1977!                              A01
T04        But I Did It Unconsciously         A03       A04
T11        Perhaps It's a Glandular Problem   A03       A04       A06

Figure 2.17 Redistributing the column authors into a repeating group also violates 1NF. Don’t represent multiple instances of an entity as multiple columns.

Second normal form

Before I give the constraints for second normal form, I’ll mention that a 1NF table automatically is in 2NF if:

◆ Its primary key is a single column (that is, the key isn’t composite)
or
◆ All the columns in the table are part of the primary key (simple or composite)

A table in second normal form:

◆ Is in first normal form
and
◆ Has no partial functional dependencies

A table contains a partial functional dependency if some (but not all) of a composite key’s values determine a nonkey column’s value. A 2NF table is fully functionally dependent, meaning that a nonkey column’s value might need to be updated if any column values in the composite key change.

The composite key in the table in Figure 2.19 is title_id and au_id. The nonkey columns are au_order (the order in which authors are listed on the cover of a book with multiple authors) and au_phone (the author’s phone number). For each nonkey column, ask, “Can I determine a nonkey column value if I know only part of the primary-key value?” A no answer means the nonkey column is fully functionally dependent (good); a yes answer means that it’s partially functionally dependent (bad).

Parent table
title_id   title_name
T01        1977!
T04        But I Did It Unco…
T11        Perhaps It's a Gla…

Child table
title_id   au_id
T01        A01
T04        A03
T04        A04
T11        A03
T11        A04
T11        A06

Figure 2.18 The correct design solution is to move the author information to a new child table that contains one row for each author of a title. The primary key in the parent table is title_id, and the composite key in the child table is title_id and au_id.
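The two-table design in Figure 2.18 can be sketched as follows. This is an illustration under stated assumptions, not the book's own code: it uses SQLite through Python's sqlite3 module, and it borrows the table names titles and title_authors from the other figures in this chapter because Figure 2.18 itself doesn't name its tables. The rows are the ones shown in Figures 2.16 and 2.18.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys

conn.executescript("""
    CREATE TABLE titles (
        title_id   TEXT PRIMARY KEY,
        title_name TEXT NOT NULL
    );
    CREATE TABLE title_authors (
        title_id TEXT NOT NULL REFERENCES titles (title_id),
        au_id    TEXT NOT NULL,
        PRIMARY KEY (title_id, au_id)  -- composite key: one row per author of a title
    );
""")

conn.executemany("INSERT INTO titles VALUES (?, ?)", [
    ("T01", "1977!"),
    ("T04", "But I Did It Unconsciously"),
    ("T11", "Perhaps It's a Glandular Problem"),
])
conn.executemany("INSERT INTO title_authors VALUES (?, ?)", [
    ("T01", "A01"),
    ("T04", "A03"), ("T04", "A04"),
    ("T11", "A03"), ("T11", "A04"), ("T11", "A06"),
])

# Every author is now a single value addressed by table, column, and key.
t11_authors = [
    row[0]
    for row in conn.execute(
        "SELECT au_id FROM title_authors WHERE title_id = ? ORDER BY au_id",
        ("T11",),
    )
]

# Removing one author of one title touches exactly one row,
# with no reliance on the order of values inside a cell.
removed = conn.execute(
    "DELETE FROM title_authors WHERE title_id = 'T11' AND au_id = 'A04'"
).rowcount

print(t11_authors, removed)
```

Compare this with Figure 2.16, where removing author A04 from title T11 would mean editing the string “A03, A04, A06” in place.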
Atomicity

Atomic values are perceived to be indivisible from the point of view of database users. A date, a telephone number, and a character string, for example, aren’t really intrinsically indivisible because you can decompose the date into a year, month, and day; the phone number into a country code, area code, and subscriber number; and the string into its individual characters. What’s important as far as you’re concerned is that the DBMS provide operators and functions that let you extract and manipulate the components of “atomic” values if necessary, such as a substring() function to extract a telephone number’s area code or a year() function to extract a date’s year.

[Diagram: title_authors (title_id, au_id, au_order, au_phone)]

Figure 2.19 au_phone depends on au_id but not title_id, so this table contains a partial functional dependency and isn’t in 2NF.

For the column au_order, the questions are:

◆ Can I determine au_order if I know only title_id? No, because there might be more than one author for the same title.
◆ Can I determine au_order if I know only au_id? No, because I need to know the particular title too.

Good: au_order is fully functionally dependent and can remain in the table. This dependency is written

{title_id, au_id} ➝ {au_order}

and is read “title_id and au_id determine au_order” or “au_order depends on title_id and au_id.” The determinant is the expression to the left of the arrow.

For the column au_phone, the questions are:

◆ Can I determine au_phone if I know only title_id? No, because there might be more than one author for the same title.
◆ Can I determine au_phone if I know only au_id? Yes! The author’s phone number doesn’t depend upon the title.

Bad: au_phone is partially functionally dependent and must be moved elsewhere (probably to an authors or phone_numbers table) to satisfy 2NF rules.
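The questions above can be asked mechanically of sample data. The sketch below is plain Python rather than SQL; the helper determines and all of the sample rows (including the phone numbers) are hypothetical, invented to match the shape of the table in Figure 2.19.

```python
def determines(rows, determinant, dependent):
    """Return True if, in these rows, equal determinant values always
    come with equal dependent values (determinant -> dependent)."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = tuple(row[col] for col in dependent)
        # Remember the first value seen for this key; any later mismatch
        # means the determinant does not fix the dependent column.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows shaped like the table in Figure 2.19.
title_authors = [
    {"title_id": "T04", "au_id": "A03", "au_order": 1, "au_phone": "555-0103"},
    {"title_id": "T04", "au_id": "A04", "au_order": 2, "au_phone": "555-0104"},
    {"title_id": "T11", "au_id": "A03", "au_order": 2, "au_phone": "555-0103"},
    {"title_id": "T11", "au_id": "A04", "au_order": 1, "au_phone": "555-0104"},
]

composite_key = ["title_id", "au_id"]

# For each nonkey column, list the parts of the key that determine it alone.
partial_dependencies = {
    column: [
        part
        for part in composite_key
        if determines(title_authors, [part], [column])
    ]
    for column in ["au_order", "au_phone"]
}

print(partial_dependencies)
```

An empty list means no single part of the key is enough (fully functionally dependent); a nonempty list names the offending part of the key. Keep in mind that sample rows can only refute a dependency, never prove one: whether au_id really determines au_phone is a fact about the real world, not about four rows of test data.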
Third normal form

A table in third normal form:

◆ Is in second normal form
and
◆ Has no transitive dependencies

A table contains a transitive dependency if a nonkey column’s value determines another nonkey column’s value. In 3NF tables, nonkey columns are mutually independent and dependent on only primary-key column(s). 3NF is the next logical step after 2NF.

The primary key in the table in Figure 2.20 is title_id. The nonkey columns are price (the book’s price), pub_city (the city where the book is published), and pub_id (the book’s publisher). For each nonkey column, ask, “Can I determine a nonkey column value if I know any other nonkey column value?” A no answer means that the column is not transitively dependent (good); a yes answer means that the column whose value you can determine is transitively dependent on the other column (bad).

[Diagram: titles (title_id, price, pub_city, pub_id)]

Figure 2.20 pub_city depends on pub_id, so this table contains a transitive dependency and isn’t in 3NF.

For the column price, the questions are:

◆ Can I determine pub_id if I know price? No.
◆ Can I determine pub_city if I know price? No.

For the column pub_city, the questions are:

◆ Can I determine price if I know pub_city? No.
◆ Can I determine pub_id if I know pub_city? No, because a city might have many publishers.

For the column pub_id, the questions are:

◆ Can I determine price if I know pub_id? No.
◆ Can I determine pub_city if I know pub_id? Yes! The city where the book is published depends on the publisher.

Bad: pub_city is transitively dependent on pub_id and must be moved elsewhere (probably to a publishers table) to satisfy 3NF rules.

As you can see, it’s not enough to ask, “Can I determine A if I know B?” to discover a transitive dependency; you also must ask, “Can I determine B if I know A?”
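Moving pub_city out of titles, as the 3NF discussion recommends, might look like the sketch below. It assumes SQLite through Python's sqlite3 module; the prices and city names are hypothetical placeholders, while the title-to-publisher pairings follow Figure 2.12.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys

conn.executescript("""
    -- pub_city now lives with the column that determines it.
    CREATE TABLE publishers (
        pub_id   TEXT PRIMARY KEY,
        pub_city TEXT NOT NULL
    );
    -- titles keeps only columns that depend on title_id alone.
    CREATE TABLE titles (
        title_id TEXT PRIMARY KEY,
        price    REAL,
        pub_id   TEXT REFERENCES publishers (pub_id)
    );
    INSERT INTO publishers VALUES ('P01', 'City A'), ('P04', 'City B');
    INSERT INTO titles VALUES
        ('T01', 10.0, 'P01'),
        ('T04', 20.0, 'P04'),
        ('T05', 30.0, 'P04');
""")

# The publisher moves: one row changes, however many titles it has.
changed = conn.execute(
    "UPDATE publishers SET pub_city = 'City C' WHERE pub_id = 'P04'"
).rowcount

# A join recovers the city for each title, so no information was lost.
rows = conn.execute("""
    SELECT t.title_id, p.pub_city
    FROM titles AS t
    JOIN publishers AS p ON p.pub_id = t.pub_id
    ORDER BY t.title_id
""").fetchall()

print(changed, rows)
```

The pub_id column that appears in both tables is the matching primary- and foreign-key pair created by the split, which, as noted earlier, isn’t considered redundant data.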