information retrieval from the web

Scientific report: "Extraction and Approximation of Numerical Attributes from the Web" pdf

Scientific report

... each kind. These patterns are the only attribute-specific resource in our framework. Value extraction. The first pattern group, Pvalues, allows extraction of the attribute values from the Web. All ... width 1.695m]’). We then extract new patterns from the retrieved search engine snippets and re-query the Web with the new patterns to obtain more attribute values. We provided the framework with ... value for the given object. During the first stage it is possible that we directly extract from the text a set of values for the requested object. The bounds processing step rejects some of these...
Scientific report: "Automatic Collection of Related Terms from the Web" pptx

Scientific report

... query is a term, its hit is the number of pages that contain the term on the Web. We use the following notation. H(x) = the number of pages that contain the term “x”. The number H(x) can be used ... half (Evaluation II) in Table 2 shows the result. S: the target term was collected by the system. F: the target term was removed in the filtering step. A: the target term existed in the compiled corpus, but ... automatic term extraction. C: the target term existed in the collected web pages, but did not exist in the compiled corpus. R: the target term did not exist on the collected web pages. Only 43 terms...
Scientific report: "A DOM Tree Alignment Model for Mining Parallel Data from the Web" doc

Scientific report

... that, using the new web mining scheme, the web mining throughput is increased by 32%; (ii) The quality of the mined data is improved. By leveraging the web pages’ HTML structures, the sentence ... English-Chinese parallel data from the web. The mining procedure is initiated by acquiring Chinese website list. We have downloaded about 300,000 URLs of Chinese websites from the web directories at ... (1) Given a web site, the root page and web pages directly linked from the root page are downloaded. Then for each of the downloaded web page, all of its anchor texts (i.e. the hyperlinked...
Scientific report: "Automatic Acquisition of Ranked Qualia Structures from the Web" potx

Scientific report

... coefficient (Web-Jac), the Pointwise Mutual Information (Web-PMI) and the conditional probability (Web-P). We also present a version of the conditional probability which does not use the Web but merely ... (not calculated over the Web) as well as the conditional probability calculated over the Web (Web-P) delivered the best results, while the PMI-based ranking measure yielded the worst results. ... appropriate queries to the web search engine and choosing the article leading to the highest number of results. The corresponding patterns are then matched in the 50 snippets returned by the search engine...
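The snippet above names three Web-based ranking measures (Web-Jac, Web-PMI, Web-P) but the excerpt does not show their formulas. As a rough illustration only, the sketch below computes the textbook Jaccard coefficient, pointwise mutual information, and conditional probability from search-engine hit counts. The function names, the `n_pages` normalizer, and the formulas themselves are assumptions for illustration; the paper's exact definitions may differ.

```python
import math


def web_jaccard(h_x: int, h_y: int, h_xy: int) -> float:
    """Jaccard coefficient from hit counts (assumed textbook form).

    h_x, h_y: pages containing x, respectively y.
    h_xy: pages containing both x and y.
    """
    union = h_x + h_y - h_xy
    return h_xy / union if union > 0 else 0.0


def web_pmi(h_x: int, h_y: int, h_xy: int, n_pages: int) -> float:
    """Pointwise mutual information in bits (assumed textbook form).

    n_pages is a hypothetical estimate of the total number of indexed pages.
    """
    if min(h_x, h_y, h_xy) <= 0:
        return float("-inf")  # no co-occurrence evidence
    return math.log2((h_xy * n_pages) / (h_x * h_y))


def web_conditional(h_x: int, h_xy: int) -> float:
    """Conditional probability P(y | x) estimated from hit counts."""
    return h_xy / h_x if h_x > 0 else 0.0
```

With made-up counts such as `web_jaccard(100, 50, 25)`, the co-occurrence count is divided by the size of the union (100 + 50 - 25 = 125). Real hit counts from a search engine are noisy estimates, so any ranking built on them should be treated as approximate.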
Scientific report: "Paraphrase Generation and Information Retrieval from Stored Text" pdf

Scientific report

... (16) below: The dog bit the postman. (16a) The dog bit the postman on the hand. (16b) The dog with fangs bit the postman on the hand. (16c) The relationships which obtain between these sentences ... and if the second keyword falls in the ith sentence from the first keyword, then the third keyword must fall within n-i sentences of either of the previous keywords, or between them. In theory ... This is to say, the longer the segment, the more unelicited information will appear in the response; the shorter the segment, the more elicited information will not appear in the response. It...
Scientific report: "Mining Parenthetical Translations from the Web by Word Alignment" potx

Scientific report

... our modified version of the competitive linking algorithm, the link score of a pair of words is the sum of the φ² scores of the words themselves, their prefixes and their suffixes. In addition ... pairs, where the translation of the in-parenthesis terms is a suffix of the pre-parenthesis text. The lengths and frequency counts of the suffixes have been used to determine what is the translation ... C ≥ 2E + K, where C is the length of the Chinese text, E is the length of the English text in the parentheses and K is a constant (we used K=6 in our experiments). The lengths C and E are...
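The length constraint C ≥ 2E + K quoted in the snippet above can be sketched as a one-line check. The excerpt is cut off before it says how C and E are measured, so the function below simply takes both lengths as integers and leaves the unit to the caller. The function name and parameter names are hypothetical; only the inequality and the value K=6 come from the snippet.

```python
def passes_length_constraint(c_len: int, e_len: int, k: int = 6) -> bool:
    """Return True when C >= 2E + K holds.

    c_len: length of the Chinese text before the parentheses (C).
    e_len: length of the English text inside the parentheses (E).
    k: the constant K; the snippet reports K=6 in its experiments.
    The unit of the two lengths is not given in the excerpt and is
    left to the caller.
    """
    return c_len >= 2 * e_len + k


# Example with made-up lengths: 2 * 5 + 6 = 16, so 20 satisfies the bound.
print(passes_length_constraint(20, 5))
```

Keeping `k` as a parameter makes it easy to try other constants without touching the comparison itself.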
Scientific report: "Semantic Class Learning from the Web with Hyponym Pattern Linkage Graphs" pdf

Scientific report

... hyponym patterns to extract class instances from the web and then evaluates them further by computing mutual information scores based on web queries. The work by (Widdows and Dorow, 2002) on lexical ... to instantiate the pattern. On the first iteration, the pattern is given to Google as a web query, and new class members are extracted from the retrieved text snippets. We wanted the system to ... progresses. Initially, the seed is the only trusted class member and the only vertex in the graph. The bootstrapping process begins by instantiating the doubly-anchored pattern with the seed class...
Document: How to use the Web to look up information on hacking ppt

Security

... to the Web sites listed at the end of this Guide. Not only do they carry archives of these Guides, they carry a lot of other valuable information for the newbie hacker, as well as links to other ... some people take the shortcut into hacking. They get their phriends to give them a bunch of canned break-in programs. Then they try them on one computer after another until they stumble into ... other technical documents from the Web. Besides, the Web stuff is free! <Geek mode off> The most fantastic Web resource for the aspiring geek, er, hacker, is the RFCs. RFC stands for "Request...
Scientific report: "Bilingual Terminology Acquisition from Comparable Corpora and Phrasal Translation to Cross-Language Information Retrieval" pptx

Scientific report

... through the WWW The WWW can be considered as an exemplar linguistic resource for decision-making (Grefenstette, 1999). In the present study, the WWW is exploited in order to re-score the set ... a phrasal translation. We follow the same steps as the WWW-based technique, replacing the WWW by a test collection and a retrieval system to index documents of the test collection. NTCIR test ... solve the problem of phrasal translation. The interactive environment setting should optimize the phrasal translation, select best phrasal translation alternatives and facilitate the information...
Scientific report: "The Role of Information Retrieval in Answering Complex Questions" ppt

Scientific report

... one million news articles from the Associated Press, the New York Times, and the Xinhua News Agency. 3 Document Retrieval Since information retrieval systems supply the initial set of documents ... properties of the document containing the sentence and properties of the sentence itself. Regarding the former type, two features come into play: the relevance score of the document (from the IR engine) ... Introduction The field of question answering arose from the recognition that the document does not occupy a privileged position in the space of information objects as the most ideal unit of retrieval. ...
Scientific report: "Examining the Content Load of Part of Speech Blocks for Information Retrieval" pptx

Scientific report

... ad-hoc task of the 1999 Web track, for the WT2G test collection, and the queries 451-500 from the ad-hoc task of the 2000 Web track, for the WT10G test collection, with their respective relevance ... We test these hypotheses in the context of Information Retrieval, by syntactically representing queries, and removing from them content-poor blocks, in line with the aforementioned hypotheses. ... in the DFR weighting schemes, where BB2 and PL2 improved the most from our hypothesised noise reduction in the queries, while DLH improved the least, is no longer valid. The improvement in retrieval...
