Recommendation systems based on statistical implicative measures (English summary)


UNIVERSITY OF DANANG
UNIVERSITY OF SCIENCE AND TECHNOLOGY

PHAN PHUONG LAN

RECOMMENDATION SYSTEMS BASED ON STATISTICAL IMPLICATIVE MEASURES

Specialization: Computer Science
Code: 9480101

DOCTORAL THESIS SUMMARY

Danang – 2019

The dissertation is completed at: UNIVERSITY OF SCIENCE AND TECHNOLOGY, UNIVERSITY OF DANANG

Academic Instructors:
Huynh Xuan Hiep, Assoc. Prof., PhD
Huynh Huu Hung, PhD

Opponent 1: ……
Opponent 2: ……
Opponent 3: ……

The dissertation will be defended before the Board of thesis review.
Meeting at: ……
At … hour, day … month … year …

The dissertation is available at:
- National Library
- Information and Learning Center, University of Da Nang

PREFACE

1 The urgency of the thesis

The recommendation system (RS) is considered one of the effective solutions to the information explosion problem because it can automatically analyze data to predict a user's ratings for products, services, etc., and then recommend to that user the list of items with the highest predicted ratings. The main techniques used to build an RS are content-based, collaborative filtering, knowledge-based, and hybrid methods. Of these, collaborative filtering is the most important and most commonly used technique. Proposing and improving recommendation models to adapt to the diversity of application areas, the differences in user requirements and the development of technology has always been the main research direction on RSs.

Applying the statistical implicative analysis (SIA) method to other research fields is currently one of the most interesting topics. Not much research links that method to RSs. The existing research still has some unresolved issues: it focuses only on building models on binary data and pays no attention to non-binary data; it focuses only on the accuracy of the recommended good items when evaluating RSs; it uses association rules to make the recommendation, so the recommendation time may be long and the computer may
be overloaded; and no attention is paid to combining the characteristics of statistical implicative measures to improve the recommendation accuracy. Therefore, the PhD thesis "Recommendation systems based on statistical implicative measures" is conducted to contribute a small part to the research field on RSs and SIA.

2 Objectives, objects and scope of research of the thesis

2.1 Research objectives

The objective of the thesis is to understand and apply the statistical implicative measures and the collaborative filtering technique in order to propose recommendation models and to improve the accuracy of the proposed models. In this way, the thesis contributes to linking the SIA method to the research on RSs.

2.2 Research objects

The two main objects of the study are: statistical implicative measures; and recommendation models based on statistical implicative measures and the collaborative filtering technique.

2.3 Research scopes

The scope of the study is: to understand the statistical implicative measures, the collaborative filtering technique, and the existing studies on RSs using the SIA method; and to propose new recommendation models that can be applied to both binary and non-binary data and that improve the accuracy of recommendation (both the list of good items and the predicted ratings).

3 Research methodology

Literature review and experiment are the two main research methods used by this thesis.

4 Contribution of the thesis

- Firstly, two new measures developed from statistical implicative measures: (1) the k nearest neighbors/users based implicative rating, KnnUIR; and (2) the k nearest neighbors/items based implicative rating, KnnIIR. These measures are used to predict the ratings given to items by a user.
- Secondly, three new recommendation models: (1) based on the statistical implicative measures and association rules; (2) based on KnnUIR; and (3) based on KnnIIR. The proposed models can be applied to both binary and non-binary data.
- Thirdly, the Interestingness software tool
including the utility functions and the proposed recommendation models. This tool is developed in the R language and is used for the experiments.
- Fourthly, the DKHP binary dataset, which stores course registration data, is collected and used for evaluating the accuracy of recommendation.

5 Thesis structure

The thesis is organized into four chapters and six appendices as follows.

Chapter 1: An overview of statistical implicative measures and recommendation systems
Chapter 2: Recommendation based on statistical implicative measures and association rules
Chapter 3: Recommendation based on users implicative rating measure
Chapter 4: Recommendation based on items implicative rating measure

Appendices include: (1) the Interestingness tool and the DKHP dataset; (2) algorithms used for developing and evaluating the proposed recommendation models; and (3) some additional experiment scenarios.

CHAPTER 1. AN OVERVIEW

1.1 Statistical implicative measures

1.1.1 Definition

Statistical implicative measures (SIMs) are measures proposed by the statistical implicative analysis method. SIMs are used to detect trends in a binary or non-binary attribute set. SIMs are asymmetric, probability-based and non-linear measures.

1.1.2 Statistical implicative measures for binary data

1.1.3 Statistical implicative measures for non-binary data

1.2 Statistical implicative ratings

Statistical implicative rating measures are proposed by the thesis using some existing SIMs. We can consider these measures as SIMs. Statistical implicative rating measures are used to predict the rating of a user for an item, thereby contributing to solving recommendation problems.

1.3 Recommendation based on statistical implicative analysis

1.3.1 Recommendation systems and research directions
1.3.2 Collaborative filtering technique
1.3.2.1 Memory based methods
1.3.2.2 Model based methods
1.3.3 Evaluating recommendation systems
1.3.3.1 K-fold cross validation method
1.3.3.2 Classification accuracy metrics
1.3.3.3 Predictive
accuracy metrics
1.3.3.4 Rank accuracy metrics
1.3.4 Statistical implicative analysis based recommendation
1.3.4.1 Existing recommendation methods
1.3.4.2 Recommendation based on statistical implicative measures

1.4 Conclusion

Chapter 1 focuses on understanding SIMs, RSs and the accuracy metrics used for evaluating RSs. The thesis summarizes SIMs (such as implicative intensity, entropic version of implicative intensity, cohesion, contribution) and identifies which measures should be used by RSs to improve the accuracy of the recommendation result. Chapter 1 also covers the collaborative filtering technique and the accuracy metrics used for building and evaluating recommendation models. Moreover, Chapter 1 presents the research directions on RSs as well as the existing research related to RSs based on statistical implicative analysis; it then identifies the scope of the study and sketches the proposal.

CHAPTER 2. RECOMMENDATION BASED ON STATISTICAL IMPLICATIVE MEASURES AND ASSOCIATION RULES

Differing from the existing recommendation models based on statistical implicative analysis (SIA) and association rules, the model proposed in this chapter: can be applied to both binary and non-binary data; provides more SIMs (such as implicative intensity, entropic version of implicative intensity, cohesion) to make the recommendation; and makes it possible to combine one of the above measures with the contribution measure to improve the accuracy of RSs.

2.1 Statistical implicative rules based model - SIR

The statistical implicative rules based model SIR is developed on SIMs and association rules. The proposed SIR model is shown in Figure 2.1. This model consists of:
- A finite set of users $U = \{u_1, u_2, \dots, u_n\}$.
- A finite set of items (e.g. products, movies, etc.)
$I = \{i_1, i_2, \dots, i_m\}$.
- A rating matrix $R = (r_{jk})_{n \times m}$, where $j = \overline{1, n}$ and $k = \overline{1, m}$, used for storing the feedback (ratings) of users on items. In binary form, $r_{jk} = 1$ if user $u_j$ likes the item $i_k$ and $r_{jk} = 0$ (or $NA$) if $u_j$ does not like/know $i_k$. In non-binary form, $r_{jk} \in [0, 1]$ if $u_j$ rates $i_k$ and $r_{jk} = NA$ if $u_j$ does not rate/know $i_k$.
- A vector $R_{u_a}$ storing the known ratings of the user $u_a$ who needs the recommendation: $R_{u_a} = \{r_{u_a k}\}$ where $k = \overline{1, m}$, in which $r_{u_a k} = NA$ if $u_a$ does not rate $i_k$.

[Figure 2.1: The statistical implicative rules based model. Inputs: $(U, I, R)$, $(u_a, I, R_{u_a})$, support threshold $s$, confidence threshold $c$, maximum length of a rule $l$. Rules $\{a \to b \mid a \in I^k, b \in I, k = \overline{1, l-1}\}$ are generated; the ruleset is presented by the statistical implicative analysis method as $\{a \to b\} = \{n, n_a, n_b, n_{a\bar{b}}\}$; the implicative intensity, the entropic version of implicative intensity or the cohesion measure gives $\{a \to b\} = \{v_{a,b}\}$ (improved model: these steps are combined simultaneously); the contribution measure is applied. Output: the list of good items to be recommended to $u_a$.]

To reduce the recommendation time, the SIR model in Figure 2.1 is improved by performing the following steps simultaneously (directly): generating association rules, presenting those rules by the set of four values $\{n, n_a, n_b, n_{a\bar{b}}\}$, and calculating the implicative value of those rules according to a specific SIM. We can solve this problem by using and modifying the rchic package.

2.2 Operation of the statistical implicative rules based model

The operation of the SIR model includes two stages: building the filtered ruleset presented according to the SIA method; and performing the recommendation, as shown in Figure 2.2. To reduce the recommendation time, we can pre-build the learning model (offline).

[Figure 2.2: The operational diagram of the SIR model. Inputs: the rating matrix and the ratings of the user $u_a$ who requires the recommendation. Building model (online/offline): pre-processing data, generating rules, presenting rules according to SIA, filtering rules. Making recommendation (online): recommending items with the highest implicative values. Output: the list of top N items for $u_a$.]

2.3 Experiment

2.3.1 Data and tool

Three datasets are used for the experiment: MSWeb, MovieLens and DKHP (course registration). Of these, MSWeb …

2.3.2 Evaluating the SIR model on non-binary data

- The accuracy of the SIR model is the highest when (1) the entropic version of implicative intensity and the contribution measure are combined and the user requires many recommended items. In practice, a long list of recommended items confuses the user.
- The accuracy of the SIR model is higher than that of POPULAR, a recommendation model based on the most popular items.

2.4 Conclusion

Chapter 2 proposes the statistical implicative rules based model SIR, which can be applied to both binary and non-binary data, and improves the proposed model to reduce the recommendation time. The ruleset, represented by a set of four values, can be pre-built offline and used online when someone needs a recommendation. The SIR model provides many SIMs and can be expanded with other objective interestingness measures. The SIR model is coded and integrated in the Interestingnesslab tool.

The accuracy of the SIR model is evaluated: by classification accuracy metrics such as the ROC curve, the Precision-Recall curve and the F1 measure; on two types of data: binary (MSWeb, DKHP) and non-binary (MovieLens); and according to two groups of scenarios: internal comparison (the same SIR model with different SIMs) and external comparison (the SIR model against some existing recommendation models: AR, POPULAR and IBCF). The experimental results show that the SIR model should: (1) combine the entropic version of implicative intensity with the contribution measure to make the recommendation; and (2) be used to build RSs, because the accuracy of the SIR model is higher than that of the compared models.

CHAPTER 3. RECOMMENDATION BASED ON USERS IMPLICATIVE
RATING MEASURE

The SIR model of Chapter 2 uses association rules and SIMs to recommend the list of good items to users. When the number of rules is too large, the SIR model and the existing models (also based on SIA and association rules) face some disadvantages: the recommendation time may be long if the learning stage is performed online, and the computer may be overloaded. Therefore, the thesis focuses on rules of length 2 to overcome those disadvantages. Besides, the rating given by $u_a$ (a user who requires the recommendation) to the item $i$ may be similar to the ratings given to $i$ by the nearest users (neighbors) $u_j$ of $u_a$. Moreover, each item makes its own contribution to the relationship between $u_a$ and his/her nearest user $u_j$. As a result, the thesis combines the above characteristics to improve the accuracy of recommendation.

3.1 KnnUIR

Definition

The k nearest neighbors (i.e. users) based implicative rating measure $KnnUIR$ is proposed to predict the rating given by a user $u_a$ to an item $i \in I$. The purpose of this proposal is to increase the recommendation accuracy. $KnnUIR$, defined by (3.1), is based on: (1) the number of nearest users of $u_a$, $knn$ (the nearest neighbors $u_j$ are identified by the implicative intensities of $u_a$ and $u_j$); (2) the ratings given to item $i$ by those neighbors, $r_{u_j i}$; and (3) the typicality of $i$ contributing to the relationship of $u_a$ and $u_j$, $\gamma(i, u_a \to u_j)$. The value of $KnnUIR(u_a, i)$ has to be transformed to the range $[0, 1]$, the same scale as the elements of the rating matrix.

$$KnnUIR(u_a, i) = \sum_{j=1}^{knn} r_{u_j i} \cdot \gamma(i, u_a \to u_j) \tag{3.1}$$

3.2 Users implicative rating based model - UIR

The users implicative rating based model UIR is developed by using the proposed KnnUIR measure and the user based collaborative filtering method. The UIR model shown in Figure 3.1 has the same components as the SIR model. However, the UIR model not only recommends the list of top items to a user but also predicts the rating given by a user to an item.

[Figure 3.1: The users implicative rating based model. Inputs: $(U, I, R)$ and $(u_a, I, R_{u_a})$. Implicative intensity: $u_a \times U \to \{\varphi(u_a, u_j), j = \overline{1, knn}\}$. K nearest neighbors/users based implicative rating measure (KnnUIR): $u_a \times I \to R'_{u_a}$. Output: $Reclist = \{i \mid i \in I, r'_{u_a i} \in TopN\}$.]

3.3 Operation of the users implicative rating based model

The operational diagram of the UIR model is presented in Figure 3.2.

[Figure 3.2: The operational diagram of the UIR model. Inputs: the rating matrix and the ratings of the user $u_a$ who requires the recommendation. Steps: pre-processing data; presenting the relationship of $u_a$ and $u_j$ (where $u_j \in U$) according to SIA and calculating the implicative intensity of $(u_a, u_j)$; preparing for calculating the KnnUIR value by finding the k nearest neighbors of $u_a$ and calculating the typicality of $i$ contributing to the relationship $(u_a, u_j)$; predicting the rating given by $u_a$ to $i \in I$ using KnnUIR; if a recommendation is required, recommending the items with the highest predicted ratings to $u_a$. Outputs: the predicted ratings and the list of top N items for $u_a$.]

3.4 Experiment

3.4.1 Data and tool

The experiment of this chapter also uses: the Interestingnesslab tool with the proposed UIR model; the MSWeb, DKHP and MovieLens datasets; the recommenderlab package with existing models (POPULAR, IBCF, AR, UBCF, ALS_Implicit and SVD); and the computers described in Section 2.3.1.

3.4.2 Evaluating the UIR model using the classification accuracy metrics

- The accuracy of the proposed UIR model (via the Precision-Recall curve, the ROC curve and the F1 measure) is higher than that of the AR, IBCF and POPULAR models, but not much higher than that of the UBCF model.
- The accuracy of the UIR model is lower than that of the SIR model (Chapter 2) if the user requiring the recommendation is a new user (given = 1) and the number of nearest users and the number of good items to be recommended are low.
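The classification accuracy metrics named in these results can be illustrated with a minimal Python sketch. It is not the thesis's implementation (the experiments use the R-based Interestingnesslab tool and the recommenderlab package); the function name `precision_recall_f1`, the item labels and the held-out set are hypothetical and serve only to show how precision, recall and F1 are computed for a single top-N list.

```python
def precision_recall_f1(recommended, relevant):
    """Precision, recall and F1 of one top-N list against the held-out good items.

    recommended: items proposed to the user (hypothetical top-N list)
    relevant:    items the user actually liked in the held-out data
    """
    recommended = list(dict.fromkeys(recommended))  # drop duplicates, keep order
    relevant = set(relevant)
    hits = sum(1 for item in recommended if item in relevant)

    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative data only: 4 recommended items, 3 held-out good items, 2 hits.
top_n = ["i1", "i13", "i7", "i2"]
held_out = {"i1", "i2", "i5"}
p, r, f1 = precision_recall_f1(top_n, held_out)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

A Precision-Recall curve, as used in the experiments, would be obtained by repeating this computation for increasing N and averaging over users; that averaging step is left out of the sketch.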
3.4.3 Evaluating the UIR model using the predictive accuracy metrics

- The contribution of an item to the relationship of two users increases the recommendation accuracy.
- The accuracy of the proposed UIR model is higher than that of the UBCF model (i.e. its mean absolute error MAE and root mean squared error RMSE are the lowest) if the user requiring the recommendation is not a new user. In the opposite case, the accuracy of the UIR model is still higher than that of the UBCF model if the number of nearest neighbors used for predicting ratings is high.

3.4.4 Evaluating the UIR model using the rank accuracy metrics

The experiment is conducted for the case where the active user has rated only a few items and requires only a few recommended items. The experimental result shows that the accuracy of the proposed UIR model (via the nDCG metric) is higher than that of the UBCF, ALS_Implicit and SVD models if knn ≥ 30.

3.5 Conclusion

Chapter 3 proposes a new measure, called KnnUIR, that predicts a user's rating for an item. KnnUIR is developed from two SIMs: the typicality and the implicative intensity. KnnUIR incorporates several factors affecting the predicted ratings, such as the nearest neighbors, the ratings given by those neighbors, and the contribution of an item to the relationship between the user requiring the recommendation and his/her nearest neighbors. Besides, Chapter 3 proposes a new recommendation model named UIR, which uses KnnUIR and the user based collaborative filtering method.

The accuracy of the proposed UIR model is evaluated by: the classification accuracy metrics (for binary data), the predictive accuracy metrics (for non-binary data) and the rank accuracy metrics (for both binary and non-binary data); and by the group of internal comparison scenarios (UIR and SIR) and the group of external comparison scenarios (UIR and the existing models: AR, IBCF, POPULAR, ALS_Implicit, UBCF, SVD). Experimental results show that the accuracy of the UIR model: (1) is higher when
considering the contribution of items to the relationship between a user and his/her neighbor; and (2) is higher than that of the compared existing models when the number of known ratings of the user who needs the recommendation is not too low (i.e. that user is not a new user). Moreover, the experimental results also show that the accuracy of the UIR model is lower than that of the proposed SIR model in the case of new users.

CHAPTER 4. RECOMMENDATION BASED ON ITEMS IMPLICATIVE RATING MEASURE

When predicting the rating given by the user $u_a$ to the item $i$, we consider the items that were rated by $u_a$ as the potential nearest neighbors of $i$. Each nearest neighbor $i_j$ has a different effect on $i$. This effect can be measured by the interestingness of the relationship $(i_j, i)$. The confidence measure is used to calculate the strength of the relationship using the examples $n_{i_j i}$, whereas the implicative intensity is used to calculate the surprisingness of the relationship using the counter-examples $n_{i_j \bar{i}}$. If two relationships $(i_{j1}, i)$ and $(i_{j2}, i)$ have the same confidence value, we use the surprisingness value to distinguish them, and vice versa. Therefore, these two measures can be combined to clearly distinguish the effect of each neighbor $i_j$ on $i$.

Like Chapter 3, Chapter 4 uses nearest neighbors, but its neighbors are items; like Chapter 2, it is based on items, but it considers only the relationship between two items instead of the relationship between a set of items and one item.

4.1 KnnIIR

Definition

The k nearest neighbors (i.e. items) based implicative rating measure $KnnIIR$ is proposed to predict the rating given by a user $u_a$ to an item $i \in I$, thereby increasing the recommendation accuracy. $KnnIIR$ is built from the ratings of $u_a$ for the items $i_j$ ($i_j$ can be seen as one of the potential nearest neighbors of $i$) and the strength of the relationship between each neighbor $i_j$ and the item $i$, using the confidence value $c(i_j, i)$ and one of the SIM values: the implicative intensity $\varphi(i_j, i)$, the cohesion value $coh(i_j, i)$, or the entropic version of
implicative intensity $\phi(i_j, i)$. As a result, $KnnIIR$ considers not only the examples $n_{i_j i}$ of the relationship $(i_j, i)$ but also the counter-examples $n_{i_j \bar{i}}$ of this relationship.

$$KnnIIR(u_a, i) = \sum_{j=1}^{knn} r_{u_a i_j} \cdot v_{i_j i} \tag{4.1}$$

$$v_{i_j i} = \begin{cases} \varphi(i_j, i) \cdot c(i_j, i) & \text{(implicative intensity)} \\ coh(i_j, i) \cdot c(i_j, i) & \text{(cohesion)} \\ \phi(i_j, i) \cdot c(i_j, i) & \text{(entropic version of implicative intensity)} \end{cases} \tag{4.2}$$

4.2 Items implicative rating based model - IIR

The items implicative rating based model IIR is shown in Figure 4.1.

[Figure 4.1: The items implicative rating based model. Inputs: $(U, I, R)$ and $(u_a, I, R_{u_a})$. Confidence measure with implicative intensity, entropic version of implicative intensity or cohesion measure: $I \times I \to V = \{v_{jk} \mid j, k = \overline{1, knn}\}$. K nearest neighbors/items based implicative rating measure (KnnIIR): $u_a \times I \to R'_{u_a}$. Output: $Reclist = \{i \mid i \in I, r'_{u_a i} \in TopN\}$.]

Similar to the models of Chapter 2 and Chapter 3, the proposed IIR model has a finite user set, a finite item set, a rating matrix, a vector with the ratings already given by the user requiring the recommendation, and a vector with the predicted ratings. Differing from the models of the previous chapters, the IIR model uses the item matrix $V$ to store the values $v_{jk}$ used to carry out the recommendation.

Matrix $V$ can be built directly or indirectly. In the indirect form, we generate a set of rules (similar to Chapter 2) but only consider rules of length 2, with the support and confidence thresholds set to 0; we then convert this ruleset to the item matrix. However, compared to the direct method, this approach can increase the recommendation time and depends on the tools used for generating rules. Besides, the $V$ matrix can be built online or offline. When the number of items and the size of the dataset are large, the recommendation time can be shortened if we pre-build the $V$ matrix (offline) and store it in a file.

4.3 Operation of the items implicative rating based model

The operational diagram of the IIR model is depicted in Figure 4.2.

4.4 Experiment

4.4.1 Data and tool

Chapter 4 also uses the datasets and tool used
by the SIR and UIR models.

4.4.2 Evaluating the IIR model using the classification accuracy metrics

- Building the item matrix directly can reduce the recommendation time and does not depend on the tools used for generating rules.
- The accuracy of the IIR model (via the Precision-Recall curve, the ROC curve and the F1 measure) is the highest when the implicative intensity is used for building the item matrix and knn is the number of items of the dataset.
- The accuracy of the IIR model is higher than that of the compared recommendation models (AR, POPULAR, IBCF, SIR) when the user requiring the recommendation is not a new user.

[Figure 4.2: The operational diagram of the IIR model. Inputs: the rating matrix and the ratings of the user $u_a$ who requires the recommendation. Building the item matrix with knn neighbors: pre-processing data, building the item matrix, filtering the matrix to obtain knn neighbors. Making the recommendation: predicting ratings using KnnIIR; if a recommendation is required, recommending the items with the highest predicted ratings. Outputs: the predicted ratings and the list of top N items for $u_a$.]

4.4.3 Evaluating the IIR model using the predictive accuracy metrics

- The accuracy of the IIR model (via MAE and RMSE) is the highest when knn is the number of items of the dataset, and when the item matrix is built with the entropic version of implicative intensity if a user has rated only a few items, or with the cohesion measure otherwise.
- The accuracy of the IIR model is higher than that of the IBCF model if the user requiring the recommendation has already rated many items.

4.4.4 Evaluating the IIR model using the rank accuracy metrics

The accuracy of the IIR model (via nDCG) is higher than that of the IBCF and ALS_Implicit models if the active user has rated only a few items and requires only a few recommended items.

4.5 Comparing the proposed models

If the dataset is in binary form, the SIR model is suitable for the case in which the active user has rated only a few items, whereas the IIR model fits the other cases. Besides, if the recommendation time is taken into account, the UIR model can be used instead of the SIR model. If the data is in non-binary form, the accuracy of the UIR model is higher than that of the IIR model.

4.6 Conclusion

Chapter 4 proposes a new measure (named KnnIIR), developed from the relationship of two items, to predict ratings; and the IIR model, which uses the proposed measure to recommend a list of good items to a user or to predict the rating given by a user to an item. The proposed IIR model is improved by building the item matrix directly. This reduces the recommendation time and avoids the reliance on the tool used for generating rules. The accuracy of the IIR model is also evaluated: on both binary and non-binary data; and according to the classification accuracy metrics, the predictive accuracy metrics and the rank accuracy metric. The experimental results show that the IIR model should: (1) use the implicative intensity if data in
binary form, or the combination of the entropic version and the cohesion measure if data in non-binary form, to build the item matrix; and (2) be used to build RSs because of its high accuracy. In addition, the experimental results also show that: (1) combining the confidence value and the implicative value of two items improves the recommendation result; and (2) the accuracy of the IIR model is lower than that of the SIR model in the case of new users.

CONCLUSION AND FUTURE WORKS

Results of the study

- Identifying the statistical implicative measures to be used for RSs; then proposing and improving the recommendation model based on SIMs and association rules to recommend good items to users.
- Proposing a new measure, KnnUIR, based on the nearest users and some SIMs, and then proposing a new recommendation model, UIR, using this measure. The proposed model can predict the ratings given by a user to items and recommend good items to users.
- Proposing a new measure, KnnIIR, based on the nearest items and some SIMs, and then proposing a new recommendation model, IIR, using the proposed measure.
- Developing the Interestingness tool in the R language, used for the experiment.
- Collecting a binary dataset, DKHP, which stores course registration information and is used for evaluating the accuracy of recommendation.

Future works

- Developing a hybrid recommendation model to obtain the advantages of each proposed model.
- Evaluating the proposed models using other methods to obtain a full evaluation, and then modifying those models to achieve higher accuracy.
- Combining the proposed models with deep learning and reinforcement learning methods to improve their accuracy.

PUBLISHED ARTICLES

1. Lan Phuong Phan, Nghia Quoc Phan, Vinh Cong Phan, Hung Huu Huynh, Hiep Xuan Huynh, and Fabrice Guillet, “Classification of objective interestingness measures”, EAI Endorsed Transactions on Context-Aware Systems and Applications, Vol. 3, No. 10, pp. 1-13, 2016.
2. Lan Phuong Phan, Nghia Quoc Phan, Ky Minh
Nguyen, Hung Huu Huynh, Hiep Xuan Huynh, and Fabrice Guillet, “Interestingnesslab: A Framework for Developing and Using Objective Interestingness Measures”, In Proceeding of The International Conference on Advances in Information and Communication Technology, Thai Nguyen, Vietnam, December 12-13, 2016, Springer, pp. 302-311, 2017.
3. Lan Phuong Phan, Ky Minh Nguyen, Hiep Xuan Huynh and Huu Hung Huynh, “Association-Based Recommender System using Statistical Implicative Cohesion Measure”, In Proceedings of the Eighth International Conference on Knowledge and Systems Engineering (KSE 2016), Ha Noi, Vietnam, October 6-8, 2016, IEEE, pp. 144-149, 2016.
4. Lan Phuong Phan, Huu Hung Huynh, Hiep Xuan Huynh, Régis Gras, “Systeme de recommandation basé sur des mesures implicatives fortes”, Dans Actes du 9ème colloque d'Analyse Statistique Implicative (A.S.I.9), Belfort, France, Octobre 4-7, 2017, Université Bourgogne Franche-Comté Besançon, pp. 508-532, 2017.
5. Phan Phương Lan, Huỳnh Hữu Hưng, Huỳnh Xuân Hiệp, “Hệ tư vấn dựa độ đo cường độ hàm ý trách nhiệm”, Kỷ yếu Hội nghị Quốc gia lần thứ X Nghiên cứu ứng dụng Công nghệ Thông tin năm 2017 (FAIR 2017), Đà Nẵng, Việt Nam, ngày 17-18 tháng năm 2017, Nhà xuất Khoa học tự nhiên Công nghệ, trang 256-274, 2017.
6. Phan Phương Lan, Huỳnh Hữu Hưng, Huỳnh Xuân Hiệp, “Hệ tư vấn lọc cộng tác dựa độ đo hàm ý thống kê”, Trong Kỷ yếu Hội nghị Quốc gia lần thứ XX Điện tử, Truyền thông Công nghệ Thông tin (REV-ECIT 2017), Tp Hồ Chí Minh, Việt Nam, ngày 14-15 tháng 12 năm 2017, Nhà xuất Khoa học Kỹ thuật, trang 200-205, 2017.
7. Lan Phuong Phan, Hung Huu Huynh, and Hiep Xuan Huynh, “User based Recommender Systems using Implicative Rating Measure”, International Journal of Advanced Computer Science and Applications, Vol. 8, Iss. 11, pp. 37-43, 2017.
8. Phan Phương Lan, Huỳnh Hữu Hưng, Huỳnh Xuân Hiệp, “Hệ tư vấn lai ghép dựa độ đo hàm ý thống kê”, Tạp chí Khoa học Trường Đại học Cần Thơ, Số Chuyên đề Công nghệ Thông tin, trang 25-33, 2017.
9. Lan Phuong Phan, Hung Huu Huynh,
and Hiep Xuan Huynh, “Recommendation using Rule based Implicative Rating Measure”, International Journal of Advanced Computer Science and Applications, Vol. 9, Iss. 4, pp. 176-181, 2018.
10. Lan Phuong Phan, Hung Huu Huynh, and Hiep Xuan Huynh, “Hybrid Recommendation based on Implicative Rating Measures”, In Proceedings of International Conference on Machine Learning and Soft Computing, Phu Quoc, Viet Nam, February 2-4, 2018, ACM, pp. 50-56, 2018.
11. Lan Phuong Phan, Hung Huu Huynh, and Hiep Xuan Huynh, “Implicative Rating-Based Hybrid Recommendation Systems”, International Journal of Machine Learning and Computing, Vol. 8, No. 3, pp. 223-228, June 2018.
12. Phan Phương Lan, Huỳnh Hữu Hưng, Huỳnh Xuân Hiệp, “Hệ tư vấn dựa mục tiếp cận hàm ý thống kê”, Kỷ yếu Hội thảo quốc gia lần thứ XXI: Một số vấn đề chọn lọc Công nghệ thông tin truyền thông, Thanh Hóa, Việt Nam, ngày 27-28 tháng năm 2018, Nhà xuất Khoa học Kỹ thuật, trang 131-136, 2018.
13. Hoang Tan Nguyen, Lan Phuong Phan, Hung Huu Huynh, and Hiep Xuan Huynh, “Improved collaborative filtering recommendations using quantitative implication rules mining in implication field”, In Proceedings of International Conference on Machine Learning and Soft Computing, Da Lat, Viet Nam, 2019, ACM, 2019.
14. Phan Phương Lan, Huỳnh Hữu Hưng, Huỳnh Xuân Hiệp, “Tư vấn xếp hạng hàm ý thống kê liệu nhị phân”, Tạp chí Khoa học Công nghệ - Đại học Đà Nẵng, Vol. 17, No. 1.1.2019, pp. 99-103, 2019.

Posted: 05/12/2019, 06:22
