Modeling the Internet and the Web pdf

Document information

... on the Web in Chapter 7. The chapter also deals with the basic principles of Web crawlers. Web crawling is essential to gather information about the Web and in this sense is a prerequisite for the study of the Web graph in Chapter 3. Chapter 3 studies the Internet and the Web as large graphs. It describes, models, and analyzes the power-law distribution of Web sizes, connectivity, PageRank, and the ‘small-world’ ...

... the ‘glue’ of this book. Chapter 2 provides an introduction to the Internet and the Web and the foundations of the WWW technologies that are necessary to understand the rest of the book, including the structure of Web documents, the basics of Internet protocols, Web server log files, and so forth. Server log files, for instance, are important to thoroughly understand the analysis of human behavior on the ...

... such as webs of scientific citations, social relations, or even protein interactions. In this sense, it is fair to say that a predominant fraction of our book is about the Web and the information aspects of the Internet. We use Web every time we refer to the World Wide Web and web when we refer to a broader class of networks or other kinds of networks, i.e. web of citations. As the Internet and the Web ...

... Mathematics, Economics and Business, and Social Sciences. The topic is quite broad. On the surface the Web could appear to be a limited subdiscipline of computer science, but in reality it is impossible for a single researcher to have an in-depth knowledge and understanding of all the areas of science and technology touched by the Internet and the Web. While we do not claim to cover all aspects of the Internet ...
... but they have also themselves become the objects of active scientific investigation. And not only for computer scientists and engineers, but also for mathematicians, economists, social scientists, and even biologists. There are many reasons why the Internet and the Web are exciting, albeit young, topics for scientific investigation. These reasons go beyond the need to improve the underlying technology and ...

... tried to respect. Internet, in particular, is the more general term and implicitly includes physical aspects of the underlying networks as well as mechanisms such as email and peer-to-peer activities that are not directly associated with the Web. The term Web, on the other hand, is associated with the information stored and available on the Internet. It is also a term that points to other complex networks ...

... technology and to harness the Web for commercial applications. Because the Internet and the Web can be viewed as dynamic constellations of interconnected processors and Web pages, respectively, they can be monitored in many ways and at many different levels of granularity, ranging from packet traffic, to user behavior, to the graphical structure of Web pages and their hyperlinks. These measurements provide ...

... was either greater than five billion or not. There is, however, considerable uncertainty about what this number was back in January 2003 since, as we will discuss later in Chapters 2 and 3, accurately estimating the size of the Web is a quite challenging problem. Consequently there is uncertainty about whether the proposition e is true or not.

Modeling the Internet and the Web. P Baldi, P Frasconi and P ...
... the basic axioms of probability. The ‘normalization’ constant in the denominator of Equation (1.1) can be calculated by noting that $P(D) = P(D \mid e)P(e) + P(D \mid \bar{e})P(\bar{e})$. It is easy to see that $P(e \mid D)$ depends both on the prior and the likelihood in terms of ‘competing’ with the alternative hypothesis $\bar{e}$: the larger they are relative to the prior for $\bar{e}$ and the likelihood for $\bar{e}$, then the ...

... 1 observations for the first of the two words. Since the prior is flat, the posterior Beta distribution has the same shape as the likelihood function. Figure 1.3 shows the same inference problem with the same data, but with a different prior; now the prior is ‘stronger’ and favors a parameter π that is around 0.5. In this case the likelihood and the posterior have different shapes and the posterior in effect ...

Modeling the Internet and the Web: Probabilistic Methods and Algorithms. Pierre Baldi, School of Information and Computer ...
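The two Bayesian ideas quoted in the excerpts above, normalizing a posterior over a proposition and its complement, and updating a Beta prior with binary observations, can be illustrated with a minimal Python sketch. The function names and every number below are hypothetical and chosen only for illustration; the book's actual data for the two-word example is elided in the preview and is not reproduced here.

```python
def posterior(prior_e, lik_e, lik_not_e):
    """Return P(e | D) for a binary proposition e using Bayes' rule.

    prior_e   -- P(e), the prior probability that e is true
    lik_e     -- P(D | e), likelihood of the data if e is true
    lik_not_e -- P(D | not e), likelihood of the data if e is false
    """
    prior_not_e = 1.0 - prior_e
    # Normalization constant: P(D) = P(D|e)P(e) + P(D|not e)P(not e)
    evidence = lik_e * prior_e + lik_not_e * prior_not_e
    return lik_e * prior_e / evidence


def beta_posterior(a, b, k, n):
    """Update a Beta(a, b) prior with k successes in n binary trials.

    The Beta prior is conjugate to the binomial likelihood, so the
    posterior is Beta(a + k, b + n - k).
    """
    return a + k, b + (n - k)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


if __name__ == "__main__":
    # Hypothetical inputs: an even prior and data four times likelier under e.
    print(posterior(prior_e=0.5, lik_e=0.8, lik_not_e=0.2))

    # Hypothetical data: 3 successes in 10 trials.
    flat = beta_posterior(1, 1, k=3, n=10)      # flat prior, Beta(1, 1)
    strong = beta_posterior(10, 10, k=3, n=10)  # prior concentrated near 0.5
    print(flat, beta_mean(*flat))
    print(strong, beta_mean(*strong))
```

With these made-up inputs the posterior for e is 0.8. The flat prior gives Beta(4, 8), whose density is proportional to the likelihood and whose mean is 1/3, while the stronger Beta(10, 10) prior gives Beta(13, 17) with mean of about 0.43, pulled toward 0.5 as the excerpt describes.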

Date posted: 31/03/2014, 22:20

Table of contents

  • Modeling the Internet and the Web: Probabilistic Methods and Algorithms

  • 1 Mathematical Background

    • 1.1 Probability and Learning from a Bayesian Perspective

    • 1.2.2 A simple die example

    • 1.3 Mixture Models and the Expectation Maximization Algorithm

    • 1.4.3 Learning directed graphical models from data

    • 1.7.3 Applications to Languages: Zipf's and Heaps' Laws

    • 1.7.4 Origin of power-law distributions and Fermi's model

    • 2.1.2 General structure of an HTML document

    • 2.2 Resource Identifiers: URI, URL, and URN

    • 2.3 Protocols

      • 2.3.1 Reference models and TCP/IP

      • 2.3.2 The domain name system

      • 2.3.3 The Hypertext Transfer Protocol

    • 3.1.4 Power law of PageRank

    • 3.2.2 Lattice perturbation models: between order and disorder

    • 3.2.3 Preferential attachment models, or the rich get richer

    • 3.3.2 Subgraph patterns and communities

    • 3.4 Notes and Additional Technical References

    • 4.2.2 Text conflation and vocabulary reduction

    • 4.3.3 Retrieval and evaluation measures

    • 4.5 Latent Semantic Analysis

      • 4.5.1 LSI and text documents
