infocom16 slide viral marketing


IEEE International Conference on Computer Communications
10-15 April 2016, San Francisco, CA, USA

Targeted Viral Marketing in Billion-scale Networks
Hung T. Nguyen (1), My T. Thai (2), and Thang N. Dinh (1)
(1) CS Dept., Virginia Commonwealth University, Richmond, VA 23284
(2) CISE Dept., University of Florida, Gainesville, FL 32611
Thang N. Dinh, tndinh@vcu.edu

Introduction: Viral Marketing
• Marketing via the “word-of-mouth” effect
• Influence Maximization: find a small set of users (seeds) to influence most of the network

Intro.: Viral Marketing Examples
• ALS Ice Bucket Challenge
  o 2.4 M videos uploaded on Facebook
  o $98.2 M donated to the ALS Association
• ToyRUs #PlayItForward
  o $35.5 donation
• Always #LikeAGirl (YouTube): ~60 million views

Intro.: Targeted Viral Marketing
What’s wrong with choosing Mr. President to advertise shampoo?

Intro.: Targeted Viral Marketing
• Targeted Marketing: focus on customers with certain traits
  o Age: 18-30, Like: Music
  o Tech hobbyists, Age: 25-50
• Targeted Viral Marketing: seeding strategies to influence customers with certain traits

Targeted Viral Marketing Problem
• Real-world data: social networks (Twitter, Stackexchange, etc.)
  o User relationships: who follows whom?
  o User attributes: geo-location, ...
  o User-generated content: tweets, posts, etc.
• Targeted Viral Marketing:
  o A company has a budget B to incentivize users
  o It hopes to trigger a large cascade of adoption
  o Whom to target for “3d printing”, “android”, etc.?

Targeted Viral Marketing (TVM)
• Input: a graph $G = (V, E, w)$, a budget $B$, and a propagation model
• Each node $u$ has a cost $c(u)$ and a relevance score $b(u)$
• Output: a seed set of total cost at most $B$ that maximizes the expected relevance of the influenced users (influence spread)

Related Work: Influence Maximization
All methods below give a $(1 - 1/e - \epsilon)$-approximation with probability $1 - n^{-1}$.

Method | Time complexity | Note
Greedy (KDD’03) | $O(kmn\epsilon^{-3})$ | Original greedy
CELF (KDD’07) | $O(kmn\epsilon^{-3})$ | Lazy-forward, up to 700 times faster than Greedy
TIM/TIM+ (SIGMOD’14) | $O((m + n)(\ln\binom{n}{k} + \ln 2n)\,\epsilon^{-2})$ | Up to 1000 times faster than CELF
IMM (SIGMOD’15) | $O((m + n)(\ln\binom{n}{k} + \ln 2n)\,\epsilon^{-2})$ | Up to 100 times faster than TIM/TIM+
SSA/D-SSA (to appear, ACM SIGMOD’16) | Near-linear time; sub-linear time for dense graphs | Up to 1000 times faster than IMM; guarantees minimum samples for InfMax

Related Work
• Nguyen et al., JSAC’13: budgeted influence maximization
  o Not scalable; does not consider users’ relevance
• Topic-aware influence: no theoretical guarantees on the quality (Barbieri et al., KAIS 2013; Barbieri et al., EDBT 2014; Chen et al., VLDB 2015)

Cascading Models
• Describe the cascading processes
• Popular models:
  o Linear Threshold
  o Independent Cascades (or Bayesian Network)
  o SI/SIS, SIR, SIRS, SEIRS, …
  o Load shedding, DC/AC Power Flow Models

General Framework
• Goal: solve $\max_{S \in \Omega} f(S)$ and return an $(\alpha - \epsilon)$-approximate solution $S_{\mathcal{A}}$, i.e. $f(S_{\mathcal{A}}) \ge (\alpha - \epsilon)\, OPT_f$
• Sample generator $\mathcal{T}$ (RIS sampling) of size $T = \theta(\epsilon, \delta)$, chosen by bounding techniques so that $\hat{f}_T(S) \sim f(S)$ w.h.p.
• Solve $\max_{S \in \Omega} \hat{f}_T(S)$ with an $\alpha$-approximation algorithm $\mathcal{A}$ (Max-coverage, $(1 - 1/e)$-approx), which returns $S_{\mathcal{A}} \in \Omega$ with $\hat{f}_T(S_{\mathcal{A}}) \ge \alpha \cdot OPT_{\hat{f}_T}$
• The guarantee holds with probability $(1 - \delta)$
• Difficult to get an $(\alpha - \epsilon)\,OPT$ multiplicative error
• How many samples? $\theta(\epsilon, \delta) = ?$ How to achieve the minimum number of samples?
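
The TVM objective defined above, the expected relevance of the users influenced by a seed set, can be estimated by forward simulation under one of the listed cascade models. The following is a minimal sketch assuming the Independent Cascade model; the adjacency-list layout, the function names (`simulate_ic`, `estimate_benefit`, `within_budget`), and the toy numbers are illustrative assumptions, not taken from the slides or from the authors' implementation.

```python
import random


def simulate_ic(graph, seeds, rng):
    """Run one Independent Cascade simulation; return the set of activated nodes.

    graph maps a node u to a list of (v, p) pairs: u activates v with probability p.
    (This data layout is an assumption made for the sketch.)
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # Each edge gets one activation attempt, made when u becomes active.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def estimate_benefit(graph, benefit, seeds, runs=2000, seed=0):
    """Monte Carlo estimate of the expected total relevance b(u) of activated nodes."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        activated = simulate_ic(graph, seeds, rng)
        total += sum(benefit.get(v, 0.0) for v in activated)
    return total / runs


def within_budget(cost, seeds, budget):
    """Check the TVM constraint: total seed cost must not exceed the budget B."""
    return sum(cost[u] for u in seeds) <= budget


# Hypothetical toy instance (edge probabilities and scores are made up).
toy_graph = {"a": [("b", 0.6)], "b": [("c", 0.3)], "c": [("a", 0.2)]}
toy_benefit = {"a": 1.0, "b": 2.0, "c": 0.5}
toy_cost = {"a": 1.0, "b": 1.0, "c": 1.0}
if within_budget(toy_cost, ["a"], budget=1.0):
    print(estimate_benefit(toy_graph, toy_benefit, ["a"]))
```

Forward simulation of this kind needs many runs per candidate seed set, which is the cost that the sampling framework above is designed to avoid.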

RIS Sampling (Borgs et al. ’14)
• Generate a hypergraph $\mathcal{H}$ with hyperedges:
  o Select a random $u \in V$ and a random graph sample $g$
  o Hyperedge $\mathcal{E}$ = {nodes that can reach $u$ in $g$}
  o Note: instead of generating $g$, we can use reverse BFS
• Example, assuming the Independent Cascade model:
  [Figure: graph on nodes a, b, c with edge probabilities 0.6, 0.2, 0.3; sampled roots u = a, u = b, u = c]
  $\mathcal{E}_1 = \{a, b\}$, $\mathcal{E}_2 = \{b, a, c\}$, $\mathcal{E}_3 = \{c, a\}$, $\mathcal{H} = (V, \{\mathcal{E}_1, \mathcal{E}_2, \mathcal{E}_3\})$

RIS Sampling (cont.)
• Observation:
  o Influential nodes appear more often in the hyperedges
  o Influential seed set = one that covers the most hyperedges
• RIS framework (Borgs et al., Tang et al. 2014):
  o Generate multiple hyperedges
  o Find the seed set that covers the most hyperedges using the greedy algorithm for Max-Coverage

Number of Samples (Threshold)
• Time complexity (expected) = #hyperedges [$m_{\mathcal{H}}$] × time to generate a hyperedge [EPT]; the number of hyperedges decides the running time
• A. How many hyperedges are sufficient?
  $\theta \ge (8 + \epsilon)\, n \cdot \dfrac{\ln\binom{n}{k} + \ln 2n}{OPT_k \cdot \epsilon^2}$
  gives a $(1 - 1/e - \epsilon)$-approximation with probability $1 - n^{-1}$ (Tang et al. ’14); $OPT_k$ is unknown in advance
• B. Can we generate just a little more than $\theta$ hyperedges?
  o TIM: lower-bound OPT by KPT ≤ OPT
  o TIM+: lower-bound OPT by KPT+ ∈ [KPT, OPT]
• Highly sophisticated estimation
• No guarantees on the number of samples

BCT Algorithm
• Effective stopping conditions to generate “just enough” samples
• Importance sampling to guarantee an almost linear number of samples
• Provable bounded errors and high confidence

Provable Guarantees
[Slide content not preserved]

Experiments
• Datasets [table not preserved]

Results: Benefit comparison
BCT results in the best benefit with the same budget!
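
The RIS procedure described in the slides above (sample a root, collect the nodes that reach it by a reverse traversal, then greedily cover the hyperedges) can be sketched as follows. This is a simplified illustration, not the authors' BCT implementation. It assumes the Independent Cascade model, samples roots with probability proportional to a per-node weight (one plausible reading of the "importance sampling" bullet), and uses a plain cost-ratio greedy that carries no approximation guarantee by itself. All function names and the data layout are hypothetical.

```python
import random
from collections import deque


def reverse_graph(graph):
    """Build reverse adjacency lists: rev[v] holds (u, p) for every edge u -> v."""
    rev = {}
    for u, edges in graph.items():
        for v, p in edges:
            rev.setdefault(v, []).append((u, p))
    return rev


def sample_hyperedge(rev, nodes, weights, rng):
    """Sample one hyperedge: the nodes that reach a random root in a random graph sample.

    The root is drawn with probability proportional to its weight (assumed here to
    be the relevance score b(u)); each incoming edge is kept with its probability p,
    so no full graph sample has to be generated.
    """
    root = rng.choices(nodes, weights=weights, k=1)[0]
    reached = {root}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for u, p in rev.get(v, []):
            if u not in reached and rng.random() < p:
                reached.add(u)
                queue.append(u)
    return reached


def greedy_cover(hyperedges, cost, budget):
    """Cost-ratio greedy for budgeted max-coverage (heuristic; costs must be positive).

    Returns the chosen seeds and the fraction of hyperedges they cover.
    """
    index = {}
    for i, edge in enumerate(hyperedges):
        for u in edge:
            index.setdefault(u, []).append(i)
    covered = [False] * len(hyperedges)
    seeds, remaining = [], budget
    while True:
        best, best_ratio = None, 0.0
        for u, ids in index.items():
            if u in seeds or cost[u] > remaining:
                continue
            gain = sum(1 for i in ids if not covered[i])
            if gain / cost[u] > best_ratio:
                best, best_ratio = u, gain / cost[u]
        if best is None:
            break
        seeds.append(best)
        remaining -= cost[best]
        for i in index[best]:
            covered[i] = True
    fraction = sum(covered) / len(hyperedges) if hyperedges else 0.0
    return seeds, fraction
```

Applied to the three hyperedges of the slide example with unit costs and a budget of 1, `greedy_cover` selects node `a`, which appears in every hyperedge. When roots are drawn in proportion to relevance, the covered fraction multiplied by the total relevance of all nodes serves as an estimate of the expected benefit of the seed set.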

Results: Quality & Running time
[Figure not preserved]

Results: Running time on Twitter
[Figure not preserved]

Seeding Quality
• Twitter: 40 million nodes, 1.5 billion edges, 106 million tweets

Experiment (cont.)
• 300 times faster than the TIM+-based method
• More practical solutions

Summary
• Investigate the Targeted Viral Marketing problem on real-world data
• Scalable algorithm to handle billion-scale networks
• Provable performance guarantee with high confidence, matching theoretically derived thresholds on the number of samples
• Future work:
  o Dynamic/correlated probabilistic networks
  o Distributed/parallel and/or GPU-based implementation

THANK YOU FOR LISTENING!
Question & Answer

Date posted: 23/06/2018, 07:40
