Web Data Management pdf


Document information

[...] ... subset of web tuples from a web table. Web project removes some of the nodes from the web tuples in a web table. The web distinct operator removes duplicate web tuples from a web bag. A user may wish to view web tuples in a different framework. In Chapter 9, we introduce a set of data visualization operators such as web nest, web unnest, web coalesce, web pack, web unpack, and web sort to add flexibility ...
... a web table as output. A set of simple web schemas and web tuples are produced each time a web operator is applied. The global web coupling operator extracts web tuples from the Web. In particular, portions of the World Wide Web (WWW) are extracted when it is applied to the WWW. The web union, web cartesian product, and the web join are binary operators on web tables. Web select extracts a subset of web ...
... retrieving relevant data from the Web. Therefore a different technique is required to populate a data warehouse for Web data without exploiting the capabilities of wrappers. Thus it is necessary to use different warehouse data modeling techniques that nullify the necessity of converting Web data formats. That is, the data model of the warehouse for Web data should support representation of Web data in their native ...
... heterogeneous structured data. We present the mechanism for generating a web schema in the context of a web warehouse. In Chapter 8, we focus on how web tables are generated and are further manipulated in the web warehouse by a set of web algebraic operators. The web algebra provides a formal foundation for data representation and manipulation for the web warehouse. Each web operator accepts one or two web tables as ...
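The unary operators named in these excerpts (web select, web project, web distinct) can be illustrated with a small sketch. This is only a toy model and not the book's formal definitions: the `WebTuple` class, its `nodes` and `links` fields, and the function signatures are all assumptions made for illustration, with a web tuple reduced to a set of node URLs and the directed links between them.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class WebTuple:
    """Hypothetical, simplified web tuple: node URLs plus directed links."""
    nodes: FrozenSet[str]
    links: FrozenSet[Tuple[str, str]]


def web_select(table: Iterable[WebTuple],
               predicate: Callable[[WebTuple], bool]) -> List[WebTuple]:
    """Keep the subset of web tuples that satisfy the predicate."""
    return [t for t in table if predicate(t)]


def web_project(table: Iterable[WebTuple],
                keep: FrozenSet[str]) -> List[WebTuple]:
    """Remove nodes outside `keep`, and any link touching a removed node."""
    result = []
    for t in table:
        nodes = t.nodes & keep
        links = frozenset((a, b) for a, b in t.links
                          if a in nodes and b in nodes)
        result.append(WebTuple(nodes, links))
    return result


def web_distinct(bag: Iterable[WebTuple]) -> List[WebTuple]:
    """Remove duplicate web tuples from a bag, keeping first occurrences."""
    seen = set()
    out = []
    for t in bag:
        if t not in seen:
            seen.add(t)
            out.append(t)
    return out
```

Projection can turn distinct tuples into identical ones, which is why a bag-to-set step such as `web_distinct` is useful afterwards in this toy model.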
... the limitations of Web data discussed in Section 1.1.1. Furthermore, similar to a conventional data warehouse, accessing data at a web warehouse does not incur costs that may be associated with accessing data from the Web. The web warehouse may also provide access to data when they are not available directly from the Web (Document not found error). Availability, speed of access, and data quality tend to ...
... relevant changes using web algebraic operators such as web join and outer web join. Web join is used to detect identical documents residing in two web tables, whereas outer web join, a derivative of web join, is used to identify dangling web tuples. We discuss how to represent these changes using delta web tables. We have designed and discussed formal algorithms for the generation of delta web tables. In Chapter ...
... of traditional data warehousing techniques. We believe that a special data warehouse design for Web data (a web warehouse) [116, 13, 140] is necessary to address the needs of Web users to support decision making. In this context, we briefly introduce the notion of a web warehousing system. In the next section, we discuss the architecture of a web warehouse. Similar to a data warehouse, a web warehouse is ...
... to improve Web information management. Specifically, we discuss techniques for storing and manipulating Web data in a warehousing environment. We begin motivating the need for new warehousing techniques by describing the limitations of Web data and how conventional data warehousing techniques are ill-equipped to manage heterogeneous, autonomous Web data. Within this context, we describe how a web warehouse ...
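The change-detection idea in the excerpt above (web join finds identical documents in two web tables, outer web join exposes dangling web tuples) can be sketched roughly as follows. This is an assumption-laden toy and not the book's algorithm: a web tuple is modelled as a frozen set of hypothetical `(url, content_digest)` pairs, two tuples are taken to join when they share at least one identical pair, and the names `web_join_identical` and `dangling_tuples` are invented for illustration.

```python
from typing import FrozenSet, List, Tuple

Doc = Tuple[str, str]        # assumed shape: (url, content digest)
WebTuple = FrozenSet[Doc]    # toy web tuple: a set of documents


def web_join_identical(old: List[WebTuple],
                       new: List[WebTuple]) -> List[Tuple[WebTuple, WebTuple]]:
    """Pair up tuples from two tables that share an identical document."""
    return [(o, n) for o in old for n in new if o & n]


def dangling_tuples(old: List[WebTuple],
                    new: List[WebTuple]) -> Tuple[List[WebTuple], List[WebTuple]]:
    """Return the tuples on each side that take part in no join pair."""
    pairs = web_join_identical(old, new)
    joined_old = {o for o, _ in pairs}
    joined_new = {n for _, n in pairs}
    return ([o for o in old if o not in joined_old],
            [n for n in new if n not in joined_new])
```

In this simplified reading, dangling tuples from the older table hint at content that disappeared or changed entirely, and dangling tuples from the newer table hint at new content; how the book actually records such changes in delta web tables is not shown in the excerpt.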
... for data warehouses in general, and web warehouses in particular. The last issue is a particularly hard problem. Web data quality is vital to properly managing a web warehouse environment. The quality of data will limit the ability of the end users to make informed decisions. Data quality problems usually occur in one of two places: when data are retrieved and loaded into the web warehouse, or when the Web ...
[Fig. 1.1: Architecture of Whoweda. Recoverable labels: ... Manager, Web Manipulator, Web Warehouse, Web Delta Manager, Web Miner; front-end tools: Analysis, Querying, Data Mining.]
Coupling Engine: The coupling engine is responsible for extracting relevant data from multiple Web sites and storing them in the web warehouse in the form of web tables. In essence, it translates information from the native format of the sources into the format and data ...
... Derivatives of Web Join; 8.8.1 σ-Web Join; 8.8.2 Outer Web Join; 8.9 Web Union; 8.10 Summary; 9 Web Data Visualization; 9.1 Web Data Visualization Operators ...

Date posted: 22/03/2014, 17:20

Table of contents

  • Contents

  • Preface

  • 1 Introduction

    • 1.1 Motivation

      • 1.1.1 Problems with Web Data

      • 1.1.2 Limitations of Search Engines

      • 1.1.3 Limitations of Traditional Data Warehouse

      • 1.1.4 Warehousing the Web

    • 1.2 Architecture and Functionalities

      • 1.2.1 Scope of This Book

    • 1.3 Research Issues

    • 1.4 Contributions of the Book

  • 2 A Survey of Web Data Management Systems

    • 2.1 Web Query Systems

      • 2.1.1 Search Engines

      • 2.1.2 Metasearch Engines

      • 2.1.3 W3QS

      • 2.1.4 WebSQL

      • 2.1.5 WebLog

      • 2.1.6 NetQL

      • 2.1.7 FLORID

      • 2.1.8 RAW

    • 2.2 Web Information Integration Systems

      • 2.2.1 Information Manifold

      • 2.2.2 TSIMMIS

      • 2.2.3 Ariadne
