Module 9: Data Storage Considerations (PDF document)


Course: Designing Data Services and Data Models

Course module map: Module 1: Course Overview; Module 2: Solution Design Processes; Module 3: Using a Conceptual Design for Data Requirements; Module 4: Deriving a Logical Data Design; Module 5: Normalizing the Logical Data Design; Module 6: Deriving a Physical Data Design; Module 7: Implementing Data Integrity; Module 8: Designing Data Services; Module 9: Data Storage Considerations.

Module 9 topics: Choosing a Database Product; Data Storage Technologies; Data Storage Considerations; Activity 9.1: Data Quiz.

Overview

Slide Objective: To provide an overview of this module’s topics and objectives.
Lead-in: In this module, you will look at data storage considerations and data storage technologies.

Slide topics:
• Choosing a Database Product
• Data Storage Technologies
• Data Storage Considerations
• Activity 9.1: Data Quiz
• Review

At the end of this module, you will be able to:
• Identify different types of hardware and software for implementing a data store.
• Choose the appropriate hardware or software for implementing a data store.
Choosing a Database Product

Slide Objective: To provide an overview of this section.
Lead-in: This section discusses some of the factors to consider when choosing a database product.

Slide topics:
• User Community
• System Performance
• System Maintenance

In this section, you will learn about several factors that you should consider when selecting a database product.

User Community

Slide Objective: To explain how the number of users affects a database decision.
Lead-in: When selecting a database technology, you should consider how many users the system will have to support.

Slide topics:
• Total number of users
• Total number of concurrent users
• Scalability for current and future requirements
• Security

One important consideration when examining database technologies is the user community: the number of users who need to be able to use the database. Users are not only people but also other applications that interactively access the database. If a different application or system needs to access a database, its impact on the database must be taken into account.

The total pool of potential users directly affects a data store’s ability to support an application. The number of users concurrently accessing the system at any time is also extremely important. Most database systems provide concurrent, multiuser capabilities, and most multiuser data stores are tuned to support a specific range of concurrent users. How many concurrent users a system can support, and how performance degrades as that number increases, is typically referred to as the database system’s scalability.

When identifying concurrent usage patterns, you should identify the average and peak numbers of concurrent users for the application. These numbers also translate into the number of concurrent users for the data store itself. The ratio can be one-to-one, as with most client/server applications. However,
distributed application designs typically decrease the number of concurrent data users per concurrent application user because not all users access the data server every time they use the application.

It is also important to expect and plan for the total number of users to grow over time. The application design and the choice of data stores should reflect the expected growth in application use.

Finally, most systems provide some level of user security. Security can be categorized as authentication, access, encryption, or auditing, depending on whether the data store provides services to identified users, allows access to particular data elements, encrypts critical information, or keeps a record of transactions.

System Performance

Slide Objective: To discuss the performance considerations that can influence the selection of a database technology.
Lead-in: Performance is an important aspect to consider when selecting a database technology.

Slide topics:
• Access speed
  ◦ Retrieve
  ◦ Create, update, and delete
• Availability
  ◦ Maintaining system uptime
  ◦ Recovery

Because most applications rely heavily on data, the speed of a data store directly affects the perceived performance of the application. You can examine data storage performance in two separate speed-related contexts: retrieving data (performing queries) and maintaining data.

For data retrieval, the speed of queries is important if the application is required to present a large amount of information in a short period of time. For example, a database application that provides product listings on a public Web page must be able to perform a large number of queries quickly.

For data maintenance, some applications modify data frequently by creating, updating, and deleting records. For example, an application that primarily performs order entry for a catalog company requires high performance for creating data. Different data stores will provide various
mechanisms to allow you to tune your application for its primary speed area. As discussed previously, the underlying database design can also significantly affect these performance areas.

In addition to speed, a data store’s performance is related to its availability. Availability is the time that the data store is available to applications. It is a function of the time that the data store is actually running and the time it takes to repair the database and get it running again if it crashes. If an application relies heavily on a data store that is not functioning, the data store has an availability problem. Some data stores provide availability services such as clustering, transactions, and online backup and restore capabilities that increase their uptime and decrease the time it takes to recover from a crash.

System Maintenance

Slide Objective: To introduce the maintenance issues that can affect your decision about a database technology.
Lead-in: It is important to evaluate maintenance capabilities when selecting the data store for an application.

Slide topics:
• Administration
• Optimization
• Locking granularity
• Transactional support (ACID)
  ◦ Atomicity
  ◦ Consistency
  ◦ Isolation
  ◦ Durability

Every application requires some level of data store administration. Many data stores provide strong administrative support across slower remote connections, whereas other data stores require a LAN-speed connection to perform adequately. Administrative tasks might include modifying tables, queries, or fields, as well as applying optimization techniques such as indexing, archiving, and organizing data values within tables.

In particular, the optimization capabilities of a data store can be a strong selection criterion. For example, you will need to know what types of indexes can be created, how quickly indexes can be rebuilt, and whether the optimizations can occur while the data store is available. The data store’s ability to automate
optimization processes, as well as to provide statistics to determine good optimization techniques, can also be determining factors in your choice of a data store.

How an application maintains data values is an additional consideration for selecting a data store. A data store’s granularity of control when existing records are updated can increase or decrease the difficulty encountered with multiuser systems. These difficulties typically arise when many people try to access and update the same value, record, or neighboring records. The data store’s locking mechanism typically controls these simultaneous update attempts. Some databases do not provide locking mechanisms, some provide them on blocks of records known as pages, and others provide locking at the record or even field level. Often the data store’s locking capabilities will simplify the application development process, as well as improve the data store’s performance.

In addition to providing locking mechanisms to improve the data modification process, many databases have built-in support for transactions and, as a result, allow for easier development of transactional systems. A transactional database should support the following ACID principles for transactions:

• Atomicity. A transaction either commits or aborts. If a transaction commits, every operation in the transaction is completed. If a transaction aborts, every operation in the transaction is rolled back and undone.
• Consistency. Updates to a system are performed in such a way that data consistency is preserved.
• Isolation. Each transaction is isolated from every other transaction. An aborted transaction should not affect a separate transaction occurring at the same time.
• Durability. A committed transaction remains in the system, regardless of subsequent activity or even system failure.

In addition to these
maintenance issues, you should consider the database product’s ability to interact with transactions outside the database itself (for example, with standardized transaction systems such as Microsoft® Transaction Server [MTS]).

Data Storage Technologies

Slide Objective: To provide an overview of data storage technologies.
Lead-in: In this section, you will learn about some of the Microsoft products available for implementing databases.

Slide topics:
• Microsoft Excel
• Microsoft Jet
• Microsoft Visual FoxPro
• MSDE
• SQL Server

In this section, you will learn about the major features of some of the Microsoft products available for implementing databases.

Microsoft Excel

Slide Objective: To explain the purpose and characteristics of Excel for data storage.
Lead-in: Although Excel is widely used as a spreadsheet application, it also provides limited data storage.

Slide topics:
• Used for querying tables
• Basic database functionality
• Quick and simple to implement

Although not explicitly a database application, Microsoft Excel contains some potentially useful database functionality, such as its use as a querying tool for databases. Excel can also connect to other databases and import data from them. The imported data can then be used to create reports and graphs, to determine whether the data in the database is correct, and to conduct data analysis.

Excel spreadsheets can be used as intelligent database tables; they can store data and provide a visual analysis of the data. However, this database functionality is limited because only one user at a time can edit the spreadsheet, and the tables are not relational. Additionally, even for a single-user system, Excel provides limited data storage capacity.

Excel provides Microsoft Visual Basic® for Applications programming capabilities, but it does not provide transactional support or any significant data maintenance features such as crash recovery or
transaction logging.

• Transaction logs, which exist in case problems such as disk errors or network or power failures occur during a write to an MSDE database. MSDE can recover its last consistent state from its transaction log and revert to that state.

SQL Server

Slide Objective: To explain the characteristics of SQL Server to consider when deciding on a storage technology.
Lead-in: SQL Server is Microsoft’s most robust database application.

Slide topics:
• Is a highly scalable, 32-bit, client/server database
• Is recommended for extremely large data sets and mission-critical applications
• Provides connectivity to a wide variety of clients and data stores
• Has a rich development environment
• Supports ANSI SQL-92 standards

SQL Server 7.0 is Microsoft’s mission-critical client/server database engine. It can run on a laptop as well as on a server computer. SQL Server supports data queries, import and extraction of data from a wide variety of clients and third-party data stores, and data warehousing. In addition to its high-capacity database engine, SQL Server 7.0 offers a wide variety of features, including the following:

• Replication of database information to other SQL Servers, including MSDE, is supported, which helps improve performance and fault tolerance.
• An OLAP tool accompanying SQL Server aids in studying data trends and producing multidimensional data views. This tool is helpful for executive information systems, as well as for other systems that depend on multiple views of identical data.
• SQL Server supports a large number of concurrent users, limited primarily by the size of the servers and network bandwidth. This level of support is a default feature of SQL Server and does not have to be added or managed with separate multiuser logic.
• Databases in the multiple-terabyte range are supported and limited only by hardware restrictions.
• Built-in transactional support and transaction logging
capabilities provide strong data protection and recovery.

SQL Server is recommended when homogeneous connectivity and scalability to different types of hardware are necessary. Additionally, its server-based nature provides excellent support for remote connectivity and maintenance across slow network connections.

Data Storage Considerations

Slide Objective: To provide an overview of this section.
Lead-in: This section discusses some data storage considerations that impact database implementation.

Slide topics:
• Backup and Restore
• RAID Technologies
• Clustering
• Disaster Recovery

In this section, you will learn about factors to consider when implementing a database. These considerations are critical to the availability of both applications and the data that they manipulate.

Backup and Restore

Slide Objective: To explain the need for a good backup plan for databases.
Lead-in: A good backup and restore plan is important for any database system. These two processes are responsible for providing valid data in case of an emergency.

Slide topics:
• Cost of downtime
• Time to back up data
• Time to restore data
• Backup and restore testing

Backing up data is a key management function for databases or any other type of information system. Failures of one type or another are inevitable. They can be caused by hardware or software problems, malicious attacks, viruses, or natural disasters. Implementing a solid backup and restore strategy is the only way to ensure that data is available when needed.

When examining backup and restore solutions, you should first look at the amount of time each solution takes to back up and restore data, especially considering the potential size of many databases. For example, assume that the Ferguson and Bardell, Inc. database is 40 GB. Ferguson and Bardell, Inc. has a time window of six hours to back up its database. The backup for the entire database starts each night at
midnight. The database backup must be complete by 6:00 A.M., when the morning shift begins work. As a result of the technology that Ferguson and Bardell, Inc. has chosen, completing the backup takes longer than six hours. Obviously, Ferguson and Bardell, Inc.’s solution will not adequately accommodate the morning staff.

The time it takes to restore data can be even more critical than the time it takes to back it up. Backing up data can often occur at off-peak times because the data is still available for use even when it is being backed up. When data needs to be restored quickly, however, the data or database is often unavailable, and the time necessary to restore it depends directly on how long it takes to locate the missing data and restore it from the backup device. This situation can lead to downtime that can cost an organization time and money.

In addition to planning and implementing a backup and restore strategy, you should be sure to test it. Some organizations assume that they are backing up data properly, only to discover when they attempt to restore their data that the backup has failed. Periodic data-restoration testing ensures that your strategy works properly.

As mentioned with each technology, nontransactional systems, such as Excel, FoxPro, and Jet, provide support only for recovering the entire database, whereas transaction logging systems, such as MSDE and SQL Server, can allow granular recovery of individual transactions between backup intervals, depending upon the administrator’s preferences.

RAID Technologies

Slide Objective: To explain RAID technologies and how databases use them.
Lead-in: RAID technologies can offer advantages when implemented for a data store.

Slide topics:
• RAID 1
• RAID 5
• Hardware versus software implementation

A Redundant Array of Independent Disks (RAID) can protect data against hardware failure, as well as improve performance. Different
levels of RAID provide different types of protection and performance improvements.

RAID 1 mirrors data from one drive to a second drive. Because the original data drive is mirrored, you can continue to work if one of the drives fails; the other disk services all data requests. Because RAID 1 places a copy of the data on the second drive, the total usable space is equal to the size of the single drive being mirrored.

RAID 5 uses a more complex algorithm to spread data across three or more disks. If five disks are configured to use RAID 5, five virtual RAID stripes are created on each of the five disks. One stripe on each disk is reserved for parity information that is used to recover the data in the event of a disk failure; therefore, within each stripe, data is written to only four of the five drives. The parity information is used to reconstruct the missing data. Because the data is written in a stair-step fashion, the parity stripe is distributed across all of the disks in the stripe set. This process helps protect the data regardless of which drive in the stripe set fails. The effective size of a RAID 5 stripe set is approximately the total size of all drives in the set, minus the capacity of one drive for the parity information.

Another consideration is how the RAID technology will be implemented. Microsoft Windows NT® 4.0 and Windows 2000 both have support for most RAID software implementations. With software RAID, the operating system implements the mirror or generates the parity information. Software RAID, however, consumes valuable processor time when transferring data to and from the hard disks.

RAID technology might be most appropriately implemented in hardware. With a hardware RAID implementation, dedicated RAID controllers handle the processing necessary for mirroring and computing parity, so the system’s CPU processing power is not detrimentally affected.

Note: You should
keep in mind that RAID also provides performance enhancements over a single disk. In a RAID implementation, the data can be read from multiple disks at once, which greatly improves performance.

Windows NT Workstation 4.0 and Windows NT Server 4.0 provide RAID 0 with software-based support, but only Windows NT Server provides software support for RAID 1 and RAID 5. Both Windows NT Workstation and Windows NT Server provide support for hardware RAID controllers that can provide RAID 1, RAID 5, and additional RAID capabilities. Thus, any data store using Windows NT can take advantage of these RAID disk technologies.

Clustering

Slide Objective: To explain a solution that can provide increased uptime for systems that host data stores.
Lead-in: Availability of databases can be very important. Clustering technologies can increase a database’s availability.

Slide topics:
• Shared disk cluster
• Shared nothing cluster

A cluster consists of two or more independent computers that are connected and behave as a single system. Clusters extend the idea of multiple processors within a single computer to multiple computers that work together. Clustering can be used to enhance the performance of applications or to provide online redundancy if one system of the cluster fails.

In a shared disk cluster, all servers have direct access to all disks and data, but do not share processors or memory. Shared disk clusters allow a large database to be placed on a large shared disk array. If one of the systems fails, another system can provide real-time, online fail-over support, taking over the role of processing database requests. This means of protection brings a great deal of fault tolerance to the system.

In a shared nothing cluster, each server has its own memory, processor, and disks. If one of the systems fails, another system can take over its work and provide fault tolerance. The data must be automatically replicated to all storage disks in a shared nothing cluster.

The clustering
technology supplied by the Windows operating system can easily provide a shared disk cluster from which file-based data stores such as Excel and Jet can benefit. These clustering technologies are currently available with Windows NT Server 4.0 Enterprise Edition. To provide additional clustering capabilities, SQL Server 6.5 Enterprise Edition and SQL Server 7.0 are also cluster-aware, and thus provide significant clustering capabilities when paired with Windows NT Server 4.0 Enterprise Edition.

Disaster Recovery

Slide Objective: To explain disaster recovery and to differentiate it from backup and restore considerations.
Lead-in: Disaster recovery involves not only restoring data, but also recovering from a complete system failure.

Slide topics:
• Planning
• Assessment

Disaster recovery is more than simply backing up and restoring data. A disaster could be as minor as a single disk failing in a RAID disk array or as major as a fire or earthquake destroying an entire data center. Because disasters are not predictable, planning for future occurrences is critical.

Disaster recovery involves planning for disasters that affect more than simply the data or database, such as failed hardware or a collapsed infrastructure. For example, if a database that must be available for customers to order products over the Web is suddenly unavailable, then every minute of downtime results in lost revenue. If an entire server or group of servers becomes unavailable, you should expect to have to do more than simply restore the database. You should plan for replacing hardware and software and rebuilding the missing infrastructure.

You should ask the following questions when devising a disaster recovery plan:

• Where does replacement hardware originate? How fast can you get replacements?
• How are your systems configured? Do you have documentation of your current environment?
• Should system replacements be preconfigured and stored offsite?
• Should an organization that specializes in disaster recovery be involved? Do they have references?
• Who should be told about a disaster? Do you have a copy of an organizational chart showing possible contacts and how to reach them in case of an emergency?
• What are the steps for recovering from a disaster? Do you have the hardware, software, licensing, and written documentation offsite?

To ensure a good disaster recovery plan, you should periodically assess your plan. For practice, you should actually put the plan into action in a simulated disaster. Many large companies already conduct periodic tests to ensure that their disaster recovery plans work as expected. Testing the plan is the only assurance of a working disaster recovery plan.

Note: Resources on Microsoft’s Web site and Microsoft TechNet list items to consider specifically for SQL Server. These resources are a good starting place for your own disaster recovery plan.

Activity 9.1: Data Quiz

Slide Objective: To introduce this activity.
Lead-in: In this activity, you will discuss hardware and software considerations for the Ferguson and Bardell, Inc. case study.

In this activity, you will answer a series of questions pertaining to selection of the appropriate software and hardware combinations for the Ferguson and Bardell, Inc. system.

After completing this activity, you will be able to:
• Identify the data characteristics and technologies that are most appropriate for use in data-centric applications.
Review

Slide Objective: To explain the purpose of this section and what students have learned.
Lead-in: This section reviews what you have learned in this module.

Slide topics:
• Guidelines
• Review Questions
• Looking Forward

Guidelines

Slide Objective: To present some general guidelines related to the information in this module.
Lead-in: The following are some general guidelines to consider.

Slide topics:
• Look closely at your organization’s needs when choosing an appropriate technology.
• Periodically test backup and restore procedures as well as disaster recovery plans.
• Implement hardware RAID for performance improvements.

As with any technology decision, it is important to select the best database product for your specific situation. A small solution might be most effectively implemented by using Jet, whereas a larger solution might require SQL Server. You should remember to plan for growth when choosing a database product.

Backup and restore procedures, as well as disaster recovery plans, should be tested periodically to ensure that they perform as expected. Possible problems when backing up data should be anticipated and discovered during testing, not during an emergency.

Of the two types of RAID implementation, hardware RAID generally provides much better performance for database systems than software RAID. You should evaluate the types of transactions (such as simple reads, simple writes, or data mining) that the database must handle and base your RAID decision on these requirements. Different types of RAID provide various levels of performance improvements.

Review Questions

Slide Objective: To reinforce this module’s objectives by reviewing key points.

Slide topics:
• Identify different types of hardware and software for implementing a data store.
• Choose the appropriate hardware or software for implementing a data store.

Lead-in: These review questions cover some of the key concepts taught in this module.
What product is most appropriate for a database that will be used concurrently by 500 users and grow as large as 40 GB?
SQL Server is the only data store that meets these concurrent-use and storage requirements. The Jet and Visual FoxPro data engines do not adequately support more than 30 concurrent users. Jet limits the size of a database, and Visual FoxPro’s data engine limits the size of each table, to a few gigabytes at most, far below 40 GB. MSDE is optimized for five concurrent users.

Which is usually more important: time to back up or time to restore?
Time to restore is usually more important because restores typically occur when a system is unavailable. Backups can be performed while the system is in use.

What is the minimum number of disks required when implementing RAID 5?
Three.

When should you avoid software RAID?
If performance on the server is critical, you should avoid software RAID. Software RAID uses the system processor to compute parity bits and can hamper overall system performance. Hardware RAID controllers, on the other hand, provide their own processing and memory and are thus more appropriate for processor-intensive and memory-intensive applications.

Looking Forward

Slide Objective: To explain what students have accomplished in this course and what they should do next to implement a solution.
Lead-in: After you have finished the database design for your solution, you will continue the solution development process.

Slide topics:
• Database design completed
  ◦ Full database schema
  ◦ DBMS system selection
  ◦ Hardware selection
• Next steps
  ◦ Database implementation
  ◦ Application implementation
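The capacity rules from the RAID Technologies section (a mirror yields the space of one drive; a RAID 5 stripe set yields the total minus one drive for parity and needs at least three disks) can be sketched in Python. This is a minimal illustration and not part of the course material: the function name `raid_usable_capacity_gb` and the 9 GB drive size are assumptions chosen for the example, all drives are assumed to be the same size, and controller and file-system overhead are ignored.

```python
def raid_usable_capacity_gb(level: int, disk_count: int, disk_size_gb: float) -> float:
    """Approximate usable capacity for RAID 1 or RAID 5 with equal-size disks.

    Hypothetical helper for illustration only; it ignores controller and
    file-system overhead.
    """
    if disk_size_gb <= 0:
        raise ValueError("disk size must be positive")
    if level == 1:
        # A mirror keeps a full copy on the second drive, so only one
        # drive's worth of space is usable.
        if disk_count != 2:
            raise ValueError("this sketch models RAID 1 as a two-drive mirror")
        return disk_size_gb
    if level == 5:
        # One drive's worth of space holds parity, spread across all disks.
        if disk_count < 3:
            raise ValueError("RAID 5 needs at least three disks")
        return (disk_count - 1) * disk_size_gb
    raise ValueError("only RAID 1 and RAID 5 are modeled here")


if __name__ == "__main__":
    # Assumed drive size of 9 GB, chosen only for the example.
    print(raid_usable_capacity_gb(1, 2, 9.0))  # two-drive mirror
    print(raid_usable_capacity_gb(5, 5, 9.0))  # five-drive RAID 5 set
```

With five equal drives, for example, the RAID 5 set offers roughly four drives of data space, with the remaining drive's worth of space holding parity.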
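The Ferguson and Bardell, Inc. backup-window example from the Backup and Restore section can be checked with simple arithmetic. The sketch below is an illustration only: the 40 GB size and the six-hour window come from the text, while the throughput figures and the function names are assumptions, because the text does not say how fast the chosen backup technology is.

```python
def backup_hours(database_gb: float, throughput_gb_per_hour: float) -> float:
    """Hours needed to back up the whole database at a steady rate."""
    if throughput_gb_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return database_gb / throughput_gb_per_hour


def fits_window(database_gb: float, throughput_gb_per_hour: float,
                window_hours: float) -> bool:
    """True if a full backup finishes within the available window."""
    return backup_hours(database_gb, throughput_gb_per_hour) <= window_hours


def required_throughput(database_gb: float, window_hours: float) -> float:
    """Minimum steady rate (GB per hour) that completes within the window."""
    if window_hours <= 0:
        raise ValueError("window must be positive")
    return database_gb / window_hours


if __name__ == "__main__":
    DATABASE_GB = 40.0   # from the example in the text
    WINDOW_HOURS = 6.0   # midnight to 6:00 A.M.
    for rate in (5.0, 8.0):  # hypothetical device speeds in GB per hour
        hours = backup_hours(DATABASE_GB, rate)
        print(rate, hours, fits_window(DATABASE_GB, rate, WINDOW_HOURS))
    print(required_throughput(DATABASE_GB, WINDOW_HOURS))
```

The same calculation applies to restores, where the time budget is usually tighter because the system is unavailable while the data is being restored.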

Date posted: 21/12/2013, 06:18
