Retrieving questions and answers in community based question answering services


RETRIEVING QUESTIONS AND ANSWERS IN COMMUNITY-BASED QUESTION ANSWERING SERVICES

KAI WANG
(B.Eng., Nanyang Technological University)

A THESIS SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
SCHOOL OF COMPUTING
NATIONAL UNIVERSITY OF SINGAPORE
2011

Acknowledgments

This dissertation would not have been possible without the support and guidance of many people who contributed and extended their valuable assistance in the preparation and completion of this study.

First and foremost, I would like to express my deepest gratitude to my advisor, Prof. Tat-Seng Chua, who led me through the four years of Ph.D. study and research. His perpetual enthusiasm, valuable insights, and unconventional vision in research consistently motivated me to explore my work in the area of information retrieval. He offered me not only invaluable academic guidance but also endless patience and care throughout my daily life. As an exemplary mentor, his influence has undoubtedly extended beyond the research aspect of my life.

I am also grateful to my thesis committee members, Min-Yan Kan and Wee-Sun Lee, and to the external examiners for their critical reading and constructive criticism, which helped make the thesis as sound as possible.

The members of the Lab for Media Search have contributed immensely to my personal and professional life during my Ph.D. pursuit. Many thanks also go to Hadi Amiri, Jianxing Yu, Zhaoyan Ming, Chao Zhang, Xia Hu, and Chao Zhou for their stimulating discussions and enlightening suggestions on my work.

Last but not least, I wish to thank my entire extended family, especially my wife Le Jin, for their unflagging love and unfailing support throughout my life. My gratitude towards them is truly beyond words.

Table of Contents

CHAPTER 1 INTRODUCTION
  1.1 Background
  1.2 Motivation
  1.3 Challenges
  1.4 Strategies
  1.5 Contributions
  1.6 Guide to This Thesis

CHAPTER 2 LITERATURE REVIEW
  2.1 Evolution of Question Answering
    2.1.1 TREC-based Question Answering
    2.1.2 Community-based Question Answering
  2.2 Question Retrieval Models
    2.2.1 FAQ Retrieval
    2.2.2 Social QA Retrieval
  2.3 Segmentation Models
    2.3.1 Lexical Cohesion
    2.3.2 Other Methods
  2.4 Related Work
    2.4.1 Previous Work on QA Retrieval
    2.4.2 Boundary Detection for Segmentation

CHAPTER 3 SYNTACTIC TREE MATCHING
  3.1 Overview
  3.2 Background on Tree Kernel
  3.3 Syntactic Tree Matching
    3.3.1 Weighting Scheme of Tree Fragments
    3.3.2 Measuring Node Matching Score
    3.3.3 Similarity Metrics
    3.3.4 Robustness
  3.4 Semantic-smoothed Matching
  3.5 Experiments
    3.5.1 Dataset
    3.5.2 Retrieval Model
    3.5.3 Performance Evaluation
    3.5.4 Performance Variations to Grammatical Errors
    3.5.5 Error Analysis
  3.6 Summary

CHAPTER 4 QUESTION SEGMENTATION
  4.1 Overview
  4.2 Question Sentence Detection
    4.2.1 Sequential Pattern Mining
    4.2.2 Syntactic Shallow Pattern Mining
    4.2.3 Learning the Classification Model
  4.3 Multi-Sentence Question Segmentation
    4.3.1 Building Graphs for Question Threads
    4.3.2 Propagating the Closeness Scores
    4.3.3 Segmentation-aided Retrieval
  4.4 Experiments
    4.4.1 Evaluation of Question Detection
    4.4.2 Question Segmentation Accuracy
    4.4.3 Direct Assessment via User Study
    4.4.4 Evaluation on Question Retrieval with Segmentation Model
  4.5 Summary

CHAPTER 5 ANSWER SEGMENTATION
  5.1 Overview
  5.2 Multi-Sentence Answer Segmentation
    5.2.1 Building Graphs for Question-Answer Pairs
    5.2.2 Score Propagation
    5.2.3 Question Retrieval with Answer Segmentation
  5.3 Experiments
    5.3.1 Answer Segmentation Accuracy
    5.3.2 Answer Segmentation Evaluation via User Studies
    5.3.3 Question Retrieval Performance with Answer Segmentation
  5.4 Summary

CHAPTER 6 CONCLUSION
  6.1 Contributions
    6.1.1 Syntactic Tree Matching
    6.1.2 Segmentation on Multi-sentence Questions and Answers
    6.1.3 Integrated Community-based Question Answering System
  6.2 Limitations of This Work
  6.3 Recommendation

BIBLIOGRAPHY

APPENDICES
  A Proof of Recursive Function M(r1,r2)
  B The Selected List of Web Short-form Text

PUBLICATIONS

List of Tables

Table 3.1: Statistics of dataset collected from Yahoo! Answers
Table 3.2: Example query questions from testing set
Table 3.3: MAP Performance on Different System Combinations and Top Precision Retrieval Results
Table 4.1: Number of lexical and syntactic patterns mined over different support and confidence values
Table 4.2: Question detection performance over different sets of lexical patterns and syntactic patterns
Table 4.3: Examples for sequential and syntactic patterns
Table 4.4: Performance comparisons for question detection on different system combinations
Table 4.5: Segmentation accuracy on different numbers of sub-questions
Table 4.6: Performance of different systems measured by MAP, MRR, and P@1 (%chg shows the improvement as compared to BoW or STM baselines. All measures achieve statistically significant improvement with t-test, p-value

Date posted: 10/09/2015, 15:50
