Data Analysis Machine Learning and Applications Episode 2 Part 10 docx

Conjoint Analysis for Complex Services Using Clusterwise HB Procedures

Table 3. Validity values for the total sample and for the clusters for HB estimation ("in total sample": HB estimation at the individual total sample level; "in segment": separate HB estimation at the individual cluster 1 resp. 2 level)

                                                 Total sample   Cluster 1 (n=79)*        Cluster 2 (n=82)
                                                 (n=161)*       In total     In          In total     In
                                                                sample       segment     sample       segment
First-choice-hit-rate (using draws, n=10,000)    62.57 %        72.38 %      72.39 %     53.12 %      53.14 %
Mean Spearman (using draws, n=10,000)            0.727          0.780        0.778       0.677        0.671
First-choice-hit-rate (using mean draws)         65.22 %        75.95 %      74.68 %     54.88 %      57.32 %
Mean Spearman (using mean draws)                 0.748          0.802        0.797       0.696        0.700

* One respondent had missing holdout data and could not be considered.

Furthermore, we were interested in whether clusterwise estimation can optimize the "results" of HB estimation. A clear answer is not possible up to now. In our empirical investigation we found improvements with respect to the validity values in some cases (cluster 2) and not in others (cluster 1). This means that our proposition in the paper can help to reduce the problems that occur when service preference measurement via conjoint analysis is the research focus. HB estimation seems to improve validity even in the case of complex services with immaterial attributes and levels that cause perceptual uncertainty and preference heterogeneity. However, going further with the more complicated clusterwise HB estimation does not automatically provide better results. Nevertheless, further comparisons with larger sample sizes and other research objects are necessary. Furthermore, other validity criteria could be used to arrive at clearer statements.

Heterogeneity in the Satisfaction-Retention Relationship: A Finite-mixture Approach

Dorian Quint and Marcel Paulssen

Humboldt-Universität zu Berlin, Institut für Industrielles Marketing-Management, Spandauer Str. 1, 10178 Berlin, Germany
dorian.quint@inbox.com, paulssen@wiwi.hu-berlin.de

Abstract. Despite the claim that satisfaction ratings are linked to actual repurchase behavior, the number of studies that actually relate satisfaction ratings to actual repurchase behavior is limited (Mittal and Kamakura 2001). Furthermore, in those studies that investigate the satisfaction-retention link, customers have repeatedly been shown to defect even though they state that they are highly satisfied. In a dramatic illustration of the problem, Reichheld (1996) reports that while around 90% of industry customers report being satisfied or even very satisfied, only between 30% and 40% actually repurchase. In this contribution, the relationship between satisfaction and retention was examined using a sample of 1493 business clients in the light transporter segment of a major European market. To examine heterogeneity in the satisfaction-retention relationship, a finite-mixture approach was chosen to model a mixed logistic regression.
The subgroups found by the algorithm differ with respect to the relationship between satisfaction and loyalty, as well as with respect to the exogenous variables. The resulting model allows us to shed more light on the role of the numerous moderating and interacting variables on the satisfaction-loyalty link in a business-to-business context.

1 Introduction

It has been one of the fundamental assumptions of relationship marketing theory that customer satisfaction has a positive impact on retention¹. Satisfaction was supposed to be the only necessary and sufficient condition for attitudinal loyalty (stated repurchase behavior) and the more manifest retention (actual repurchase behavior), and it has been used as an indicator of future profits (Reichheld 1996, Bolton 1998). However, this seemingly undisputed relationship could not be fully confirmed by empirical studies (Gremler and Brown 1996). Further research points out that there can be a large gap between one-time satisfaction and repurchase behavior. An intention to repurchase (i.e. the statement in a questionnaire) does not always lead to an actual repurchase, and continuous repurchasing might exist without satisfaction because of mere price settings (see Söderlund and Vilgon 1999, Morwitz 1997). What is more, only a small number of studies have actually examined repurchase behavior instead of the more easily obtained repurchase intentions (Bolton 1998, Mittal and Kamakura 2001, Rust and Zahorik 1993). The tenor of these studies is that the link between satisfaction and retention is clearly weaker than the link between satisfaction and loyalty. Many other factors were discovered to have an influence on retention.

¹ E.g. Anderson et al. (2004), Bolton (1998), Söderlund and Vilgon (1999).
Also, more technical issues like common method variance, mere measurement effects or simply unclear definitions raised doubts about the importance and the exact magnitude of the contribution of satisfaction (Reichheld 1996, Söderlund and Vilgon 1999, Giese and Cote 2000). Another reason for the weak relationship between satisfaction and retention is that it may not be a simple linear one, but one moderated by several different variables. Several studies have already examined the effect of moderating variables on the satisfaction-loyalty link (e.g. Homburg and Giering 2001). However, the great majority of empirical studies in this field measured repurchase intentions instead of objective repurchase behavior (Seiders et al. 2005). Thus, the conclusion from prior work is that considerable heterogeneity is present that might explain the often surprisingly weak overall relationship.

An important contribution has been put forth by Mittal and Kamakura (2001). They combined the concepts of response biases and different thresholds² in their model to capture individual differences between respondents. Based on their results they created a customer group where repurchase behavior was completely unrelated to levels of stated satisfaction. However, their approach fails to identify actually existing groups that have a distinctive relationship between satisfaction and retention. For example, if model results show that older people have a lower threshold and thus repurchase with a higher probability given a certain level of satisfaction, this is not the full story. Other factors, measured or unmeasured, might offset the age effect.

In order to find groups with distinctive relationships between satisfaction and retention, we have explicitly chosen a finite-mixture³ approach, which results in a mixed logistic regression setup. This model type basically consists of G logistic regressions, one for each latent group.
This way, each case i is assigned to a group with a unique relation between the two constructs of interest. However, in a Bernoulli case like this (see McLachlan and Peel 2000, p. 163ff), identifiability is not given. The necessary and sufficient condition for identifiability is $G_{\max} \le \tfrac{1}{2}(m + 1)$, where m is the number of Bernoulli trials. For m = 1 no mixed logistic (ML) regression can be estimated. But Follmann and Lambert (1991) prove theoretical identifiability of a special case of binary ML-regressions: only the thresholds $\lambda_g$ are allowed to vary over the groups, while all remaining regression parameters are equal for all groups. According to Theorem 2 of Follmann and Lambert (1991), theoretical identifiability then depends only on the maximal number of different values of one covariate, $N_{\max}$, given that the values of all other covariates are held constant. The maximal number of components is then given by $G_{\max} = \sqrt{N_{\max} + 2} - 1$. Thus, the theorem restricts the choice of the variables, but ultimately helps in building a suitable model for the relationship under investigation.

² In our model thresholds are tolerance levels and can be conceived as the probability of repurchase given all other covariates are zero.
³ For an overview on finite-mixture models, see McLachlan and Peel (2000) and the references therein.

In our final model we also included so-called concomitant variables, which help to understand latent class membership and enhance the interpretability of each group or class. This is achieved by using a multinomial regression of the latent class variable c on these variables x:

$$P(c_{gi} = 1 \mid x_i) = \frac{e^{\alpha_g + \gamma_g' x_i}}{\sum_{l=1}^{G} e^{\alpha_l + \gamma_l' x_i}} = \frac{e^{\alpha_g + \gamma_g' x_i}}{1 + \sum_{l=1}^{G-1} e^{\alpha_l + \gamma_l' x_i}}. \qquad (1)$$

Here $\alpha$ is a (G − 1)-dimensional vector of logit constants and $\Gamma$ a (G − 1) × Q matrix of logit coefficients. The last group G serves as a standardizing reference group with $\alpha_G = 0$ and $\gamma_G = 0$.
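The class-membership probabilities in equation (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the argument layout and the numeric values are hypothetical and not taken from the paper or from Mplus.

```python
import math

def class_membership_probs(alpha, gamma, x):
    """Return P(c_g = 1 | x) for g = 1..G as in equation (1).

    alpha: G-1 logit constants (the reference class G has constant 0)
    gamma: G-1 coefficient rows, each of length Q
    x:     Q concomitant-variable values for one respondent
    """
    # Linear predictors of the G-1 non-reference classes.
    eta = [a + sum(c * v for c, v in zip(row, x)) for a, row in zip(alpha, gamma)]
    eta.append(0.0)  # reference class G: alpha_G = 0 and gamma_G = 0
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(eta)
    weights = [math.exp(e - top) for e in eta]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical values: G = 3 classes, Q = 2 concomitant variables.
probs = class_membership_probs(
    alpha=[0.5, -0.2],
    gamma=[[1.0, 0.0], [0.3, -0.4]],
    x=[0.0, 1.0],
)
```

Fixing the constant and coefficients of the last class at zero is what turns the general ratio in (1) into the form with "1 +" in the denominator.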
This results in a model of a mixed logistic regression with concomitant variables:

$$P(y_i = 1 \mid x_i) = \sum_{g=1}^{G} P(c_{gi} = 1 \mid x_i)\, P(y_i = 1 \mid c_{gi} = 1, x_i), \quad \text{with} \quad P(y_i = 1 \mid c_{gi} = 1, x_i) = \frac{e^{-\lambda_g + \beta_g x_i}}{1 + e^{-\lambda_g + \beta_g x_i}}. \qquad (2)$$

2 The Model

To analyze the relationship between satisfaction and retention with an ML-regression, we use data from a major European light truck market in a B2B environment. The data cover all major brands, which makes it possible to identify brand switchers and loyal customers. All respondents bought at least one light truck between two and four months before filling in the questionnaire. Out of all respondents who replied to all relevant questions, only those were retained who bought the new truck as a replacement for their old one, resulting in 1493 observations. The satisfaction-retention link is operationalized in Mplus 4.0 using the response-bias effect introduced by Mittal and Kamakura (2001), which enables us to use Theorem 2 of Follmann and Lambert (1991). Following Paulssen and Birk (2006), only demographic response-bias effects and demographic response-bias effects moderated by brand are estimated in our model. The resulting equation for the latent satisfaction in logit is then:

$$sat_i^* = \beta_1\, sat_i + \beta_2\, sat_i \cdot cons_i + \beta_3\, sat_i \cdot age_i + \beta_4\, sat_i \cdot brand_i + \beta_5\, sat_i \cdot cons_i \cdot brand_i + \beta_6\, sat_i \cdot age_i \cdot brand_i + \varepsilon_i.$$

The satisfaction-retention link for a latent class g can then be written as⁴:

$$P(\text{Retention} = y_i \mid c_{gi} = 1, sat, cons, age, brand) = P(sat_i^* > \lambda_g) = \frac{e^{-\lambda_g + sat^*}}{1 + e^{-\lambda_g + sat^*}}.$$

⁴ Here age stands for the standardized stated age, cons for consideration set and brand indicates a specific brand.

The latent class variable c is regressed on the concomitant variables using a multinomial regression.
As concomitant variables we used: Length of ownership of the replaced van (standardized), Ownership (self-employed 0, company 1), Brand of replaced van (other brands 0, specific "brand 1" 1), Consideration Set of other brands than the owned one (empty 0, at least one other brand 1) and Dealer (not involved in talks 0, involved 1). The model was estimated for several numbers of latent classes, with the theoretical maximum of classes being five. The fit indices for this model series can be found in table 1. All four ML-models possess a better fit than a simple logistic regression, but show a mixed picture. The AIC favors a model with four classes and the BIC favors only one. To decide on the number of classes, the adjusted BIC was used, which favors three classes⁵. This model was estimated using 500 random starting values and 500 iterations as recommended by Muthen and Muthen (2006, p. 327). The log-likelihood of the chosen model fails to be reproduced in only nine out of 100 sequences, which, according to Muthen and Muthen (2006, p. 325), points clearly toward a global maximum.

Table 1. Model fit

Criterion         Simple LR    G = 1       G = 2       G = 3       G = 4
Log-Likelihood    -971.280     -928.104    -902.727    -888.346    -873.472
AIC               1946.559     1870.209    1833.454    1814.692    1802.945
BIC               1957.176     1907.369    1907.774    1915.554    1951.584
Adjusted BIC      1950.823     1885.132    1863.300    1855.196    1862.636
Entropy           –            –           0.531       0.563       0.885

Entropy for the chosen model is 0.563, which indicates modest separation of the classes. As can be seen in table 2, the discriminatory power is mixed, with class 2 being well separated (0.821), while classes 1 and 3 are not perfectly separable.

Table 2. Misclassification matrix

       1        2        3
1      0.762    0.063    0.175
2      0.179    0.821    0.000
3      0.328    0.000    0.672

⁵ See Nylund et al. 2006.

The results of this model are shown in table 3.
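The fit indices in table 1 can be reproduced from the reported log-likelihoods with a short Python sketch. Two points are assumptions rather than statements from the text: the parameter counts k are not reported and are inferred here from AIC = −2 logL + 2k, and the adjusted BIC is taken to be the sample-size adjusted variant that replaces n by (n + 2)/24. With these choices the values agree with table 1 up to rounding.

```python
import math

N_OBS = 1493  # sample size reported in the text

# Log-likelihoods from table 1. The parameter counts k are NOT reported in
# the source; they are inferred from AIC = -2 * logL + 2 * k.
MODELS = {
    "Simple LR": (-971.280, 2),
    "G = 1": (-928.104, 7),
    "G = 2": (-902.727, 14),
    "G = 3": (-888.346, 19),
    "G = 4": (-873.472, 28),
}

def information_criteria(log_lik, k, n=N_OBS):
    """Return (AIC, BIC, sample-size adjusted BIC) for one model."""
    deviance = -2.0 * log_lik
    aic = deviance + 2 * k
    bic = deviance + k * math.log(n)
    # Assumed adjustment: n is replaced by (n + 2) / 24.
    adj_bic = deviance + k * math.log((n + 2) / 24)
    return aic, bic, adj_bic

criteria = {name: information_criteria(ll, k) for name, (ll, k) in MODELS.items()}

# Model with the smallest value of each criterion.
best = {
    label: min(criteria, key=lambda name: criteria[name][idx])
    for idx, label in enumerate(("AIC", "BIC", "Adjusted BIC"))
}
```

Under these assumptions `best` selects four classes by AIC, one class by BIC and three classes by the adjusted BIC, which mirrors the mixed picture described in the text.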
The thresholds of latent classes 2 and 3 were fixed after the first models we estimated showed extreme values for them, resulting in a probability of repurchase of 0% and 100%, respectively. This means that for both groups repurchase probability is independent of the values of the covariates. In this way the algorithm eventually works as a filter and puts those respondents who repurchase or do not repurchase independent of their satisfaction into separate groups. Thus, the only unfixed threshold is 3.174 for latent class 1. This class has a weight of 49.4%, while class 2 has 27.7% and class 3 represents 29.9% of the respondents.

The estimated value for $\beta_1$ is 0.944 and is, like all other coefficients, significant at the 5% level. The value for $\beta_1$ represents the main effect of satisfaction with the previous van in case all other covariates are zero. In this case a one-unit increase in satisfaction multiplies the odds of repurchasing the same brand by $e^{0.944} = 2.57$, which means satisfaction has a positive effect on the odds of staying with the same brand versus buying another brand. The estimates $\beta_2$ and $\beta_3$ correspond to response-bias effects in case the brand is not the specific brand 1. Both estimates are significant, meaning that response bias is present. The interpretation of the beta-coefficients is similar to before in that all other covariates are assumed to be zero. When considering only respondents who previously had a van of brand 1, that is brand = 1, things change. The effect for age, given that the consideration set is empty, becomes 0.147 − 0.131 = 0.016, almost completely wiping out the influence of response bias. For the covariate consideration set the results are analogous: given a sample-average age, the response-bias effect for respondents who replaced a van of brand 1 collapses to −0.244 + 0.254 = 0.01.

As to the multinomial logistic regression of the latent class variable c on the concomitant variables, class 3 has been chosen to be the reference class.
The constants $\alpha_g$ can be used to compute the probabilities of class membership for each respondent who has an average length of ownership, is self-employed, had not replaced a van of brand 1, did not consider another brand and was not involved in talks with the dealer. For this group the probability of membership in class g is $e^{\alpha_g} / (1 + \sum_{l=1}^{2} e^{\alpha_l})$⁶. The probability of class membership in class 1 increases with increasing length of ownership. For low lengths of about one year, the probability of membership is highest for class 3. However, the probability of membership in class 2 is hardly influenced by the length of ownership. Self-employed respondents have a probability of belonging to class 1 of more than 80% despite the non-significance of the owner variable. The influences on class membership for the other concomitant variables can be explained analogously.

⁶ The probability of belonging to class 1 is 67.94%, for class 2 17.94% and for class 3 14.12%. If the values of all concomitant variables are 1, the corresponding probabilities become 65.83%, 27% and 7.17%. If all other values of the concomitant variables are 0, a change from 0 to 1 in the brand variable multiplies the odds of belonging to class 1 rather than class 3 by $e^{1.455} = 4.28$.

This model with three latent classes fits the data better than a simple logistic regression of retention on satisfaction. The latter results in a marginal Nagelkerke-R² of only 0.063. Now, if we look again at table 2, we might make a hard allocation of respondents to class 1, despite the fact that separation of the classes is not perfect. For class 1 we then arrive at a very good Nagelkerke-R² value of 0.509. This means that the estimated model basically works as a filter, leaving one group of respondents with a very strong relation between satisfaction and retention and two smaller groups with no relation at all.

Table 3. ML-regression results

Variable                                    Value      Std. error   Z-statistic
Response bias for all classes
  Satisfaction                              0.944      0.164        5.749*
  Age × Satisfaction                        0.147      0.046        3.230*
  Consideration × Satisfaction              -0.244     0.113        -2.157*
  Brand 1 × Satisfaction                    -0.367     0.100        -3.673*
  Age × Brand 1 × Satisfaction              -0.131     0.056        -2.349*
  Consideration × Brand 1 × Satisfaction    0.254      0.123        2.075*
Thresholds
  λ1 Threshold                              3.174      0.963        3.297*
  λ2 Threshold                              15.000     –            –
  λ3 Threshold                              -15.000    –            –
Class 1: Concomitant variables
  α1 Constant                               1.573      1.272        1.237
  Length                                    1.131      0.432        2.620*
  Owner                                     -1.995     1.130        -1.765
  Brand 1                                   1.455      0.635        2.292*
  Consideration                             0.141      0.445        0.316
  Dealer                                    0.912      0.746        1.223
Class 2: Concomitant variables
  α2 Constant                               0.243      1.085        0.224
  Length                                    1.049      0.443        2.369*
  Owner                                     -0.440     1.009        -0.436
  Brand 1                                   0.275      0.500        0.549
  Consideration                             1.199      0.299        4.004*
  Dealer                                    -0.213     0.370        -0.577

* significant at the 5% level

At this point the classes of the final model shall be interpreted. While average satisfaction ratings are essentially the same (6.77, 6.82 and 6.60 for classes 1 to 3), the relation between satisfaction and retention is very different. As indicated above, class 1 describes a filtered link between satisfaction with the replaced van and retention. This class contains predominantly respondents who are self-employed, who were involved in talks with the dealer, who had owned their previous van for a long time and who drove a van of brand 1. In this class increasing satisfaction corresponds to a higher retention rate. This means in turn that marketing measures to increase retention via satisfaction campaigns are feasible for this group.
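The filter behavior of the three classes can be illustrated by plugging the estimates from table 3 into equations (1) and (2). The following Python sketch is an illustration only: it assumes a baseline respondent for whom all covariates other than satisfaction are zero (so that sat* reduces to 0.944 · sat), it ignores the brand and response-bias terms, and the function names are hypothetical.

```python
import math

BETA_SAT = 0.944                   # main effect of satisfaction (table 3)
THRESHOLDS = [3.174, 15.0, -15.0]  # class 1 estimated, classes 2 and 3 fixed
ALPHA = [1.573, 0.243, 0.0]        # class constants; class 3 is the reference

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def class_weights(alpha=ALPHA):
    """Class-membership probabilities when all concomitant variables are zero."""
    weights = [math.exp(a) for a in alpha]
    total = sum(weights)
    return [w / total for w in weights]

def repurchase_by_class(sat):
    """P(repurchase | class g) for a baseline respondent with rating `sat`."""
    latent = BETA_SAT * sat  # sat* with every other covariate set to zero
    return [logistic(latent - lam) for lam in THRESHOLDS]

def repurchase_marginal(sat):
    """Mixture probability: class weights times class-specific probabilities."""
    return sum(w * p for w, p in zip(class_weights(), repurchase_by_class(sat)))
```

For any plausible rating, the fixed thresholds of ±15 push the class 2 probability to practically zero and the class 3 probability to practically one, so only class 1 responds to satisfaction. The class weights come out at roughly 68%, 18% and 14%, close to the values given in footnote 6.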
Respondents of class 2 considered brands other than the brand of their replaced van prior to their purchase decision, which increased the number of choices they had for making the purchase decision. However, this class can also be considered as being influenced by factors other than those observed in our study. These factors might further explain why the retention rate is zero, although some members were in fact satisfied with their replaced van. It is easy to imagine that a large number of reasons, including pure coincidence, can lead to such behavior. The third class, where respondents repurchase independent of their satisfaction, has at least one distinctive feature. This class is dominated by very short lengths of ownership, which might be explained by the presence of leasing contracts.

3 Discussion

Previous studies have examined customer characteristics as moderators of the satisfaction-retention link. In order to investigate this further, we built on a model developed by Mittal and Kamakura (2001) that we expanded by including manufacturer and company characteristics as additional moderating variables. Previous research did not fully investigate the moderating role of manufacturer/brand and company characteristics on the satisfaction-retention link. Furthermore, by applying a concomitant logit mixture approach we applied a new research method to this problem. Our results imply that, similar to the findings of Mittal and Kamakura (2001), customer groups exist where repurchase behavior is completely invariant to rated satisfaction. In the largest customer group a strong relationship between satisfaction and repurchase was present. Respondents in this group were self-employed, participated in dealer talks and kept their commercial vehicles longer than members of the other classes.
It is notable that for respondents who stated they were self-employed and participated in dealer talks the satisfaction-retention relationship is strong, indicating that those respondents had substantial leverage on decision making. That is, these respondents immediately punished bad performance of the incumbent brand and switched to other brands. For respondents who worked for companies, factors other than their stated satisfaction (purchasing policies of the company, satisfaction of other members of the buying center) may play a role. It also seems to be necessary that the respondent had a significant involvement in the buying process, as indicated by participation in dealer talks. This result also points to a limitation of the often applied key informant approach: key informants have to be carefully screened. It does not suffice to ask whether they participate in certain business decisions.

References

ANDERSON, E. W., FORNELL, C., MAZVANCHERYL, S. K. (2004): Customer Satisfaction and Shareholder Value. Journal of Marketing, 68, 172–185.
BOLTON, R. N. (1998): A Dynamic Model of the Duration of the Customer's Relationship with a Continuous Service Provider: The Role of Satisfaction. Marketing Science, 17, 45–65.
FOLLMANN, D. A., LAMBERT, D. (1991): Identifiability of finite mixtures of logistic regression models. Journal of Statistical Planning and Inference, 27, 375–381.
GIESE, J. L., COTE, J. A. (2000): Defining Consumer Satisfaction. Academy of Marketing Science Review, 2000, 1–24.
GREMLER, D. D., BROWN, S. W. (1996): Service Loyalty: Its Nature, Importance, and Implications. Advancing Service Quality: A Global Perspective. International Service Quality Association, 171–180.
HOMBURG, C., GIERING, A. (2001): Personal Characteristics as Moderators of the Relationship Between Customer Satisfaction and Loyalty: An Empirical Analysis. Psychology & Marketing, 18, 43–66.
MCLACHLAN, G., PEEL, D. (2000): Finite Mixture Models. Wiley, New York.
MITTAL, V., KAMAKURA, W. A. (2001): Satisfaction, Repurchase Intent, and Repurchase Behavior: Investigating the Moderating Effect of Customer Characteristics. Journal of Marketing Research, 38, 131–142.
MORWITZ, V. G. (1997): Why Consumers Don't Always Accurately Predict Their Own Future Behavior. Marketing Letters, 8, 57–70.
MUTHEN, L. K., MUTHEN, B. O. (2006): Mplus User's Guide. Fourth issue, Los Angeles.
NYLUND, K. L., ASPAROUHOV, T., MUTHEN, B. (2006): Deciding on the number of classes in latent class analysis and growth mixture modeling. A Monte Carlo simulation study. Accepted by Structural Equation Modeling.
PAULSSEN, M., BIRK, M. (2006): It's not demographics alone! How demographic, company characteristics and manufacturer moderate the satisfaction retention link. Humboldt-Universität zu Berlin, Wirtschaftswissenschaftliche Fakultät. Working Paper.
REICHHELD, F. F. (1996): Learning from Customer Defections. Harvard Business Review, 74, 56–69.
RUST, R. T., ZAHORIK, A. J. (1993): Customer Satisfaction, Customer Retention, and Market Share. Journal of Retailing, 69, 193–215.
SEIDERS, K., VOSS, G. B., GREWAL, D., GODFREY, A. L. (2005): Do Satisfied Customers Buy More? Examining Moderating Influences in a Retailing Context. Journal of Marketing, 69, 26–43.
SÖDERLUND, M., VILGON, M. (1999): Customer Satisfaction and Links to Customer Profitability: An Empirical Examination of the Association Between Attitudes and Behavior. Stockholm School of Economics, Working Paper Series in Business Administration, Nr. 1999:1.

[...]
Predicting Stock Returns with Bayesian Vector Autoregressive Models
Wolfgang Bessler and Peter Lückoff

[Fig. 2. Squared forecasting errors for 1-step ahead forecasts over time. Six panels: Random Walk (# 14), Linear Regression (# 2), AR(1)-Model (# 13), VAR(4)-Model (# 17), Box Jenkins (# 10), BVAR(18)-Model (# 23). Vertical axis: SE (in '000); horizontal axis: 2000 to 2004.]

[Fig. 3. Comparison ... Upper panel: BVAR(18)-Model (# 23) vs AR(1)-Model (# 13), SE (in '000) for AR(1), BVAR(18) and Diff; lower panel: return of portfolio (return in %). Horizontal axis: 2000 to 2004.]

... forecasting horizon of 12 months the MSE of the BVAR is about 3 percentage points smaller than the MSE of a naive forecast. The superior results ...

[Fig. 4. Squared forecasting errors for 1- to 15-step ... Six panels: Random Walk (# 14), Linear Regression (# 2), AR(1)-Model (# 13), VAR(4)-Model (# 17), Box Jenkins (# 10), BVAR(18)-Model (# 23). Vertical axis: MSE (in '000); horizontal axis: forecasting horizon.]

Performance of rMEWMA ... with λ = 0.3 and h = −0.551 based on reference samples of size m

                      Shift magnitude
m        Statistic    0.0       0.5       1.0       1.5       2.0       2.5       3.0
10       ARL          342.18    341.42    339.42    334.52    326.63    316.80    306.92
         SDRL         338.74    338.62    338.77    338.89    338.54    337.80    337.35
         Q(.10)       38        37        35        30        22        12        5
         Q(.50)       238       237       236       230       222       212       201
         Q(.90)       786       785       784       779       771       759       749
28       ARL          199.77    196.56    183.44    151.96    105.25    59.10     28.04
         SDRL         193.98    193.91    193.13    187.73    169.44    133.97    93.43
         Q(.10)       25        21        9         5         3         3         3
         Q(.50)       140       137       124       86        12        5         4
         Q(.90)       456       452       438       399       325       205       44
100      ARL          185.15    170.90    118.21    43.15     9.17      4.32      3.47
         SDRL         176.11    175.05    162.56    104.09    31.47     4.67      0.91
         Q(.10)       24        15        6         4         3         3         3
         Q(.50)       133       118       40        10        5         4         3
         Q(.90)       414       398       329       124       12        6         5
200      ARL          188.05    160.85    76.68     15.85     5.88      4.02      3.36
         SDRL         177.44    173.60    131.29    40.39     4.87      1.49      0.75
         Q(.10)       23        14        6         4         3         3         3
         Q(.50)       138       99        25        8         5         3         3
         Q(.90)       420       389       234       26        10        6         4
500      ARL          196.22    138.36    38.11     10.40     5.47      3.92      3.32
         SDRL         185.11    163.28    63.83     8.37      2.85      1.34      0.69
         Q(.10)       24        14        6         4         3         3         3
         Q(.50)       141       79        21        8         5         3         3
         Q(.90)       445       350       78        20        9         6         4
1000     ARL          199.35    119.63    29.83     9.91      5.38      3.88      3.31
         SDRL         192.86    140.93    32.33     7.28      2.73      1.31      0.67
         Q(.10)       24        13        6         4         3         3         3
         Q(.50)       141       73        20        8         5         3         3
         Q(.90)       455       280       65        19        9         6         4
10000    ARL          201.00    99.02     26.16     9.58      5.29      3.85      3.29
         SDRL         197.71    98.23     23.12     6.66      2.61      1.26      0.65
         Q(.10)       24        13        6         4         3         3         3
         Q(.50)       141       68        19        8         5         3         3
         Q(.90)       459       223       56        18        9         5         4

NOTE: ARL = average run length; SDRL = standard deviation of run length distribution; Q(q) = qth percentile of run length distribution.

... samples of size m = 10, 28, 100, 200, 500, 1000 and 10000 (m ≈ ∞). Note that the desired in-control (shift magnitude 0) ARL performance is obtained using m = 28. This motivates this choice. SDRL is the standard deviation of the run length. Q(.10), Q(.50), and Q(.90) are respectively the 10th, 50th, and 90th percentiles of the in-control and out-of-control RL distributions. In the following, ARL0 and ARL1 are used to ...
... assumed that tied data depth measures are not observed. Thus, $Q_t^*$ is uniformly distributed on the m points {1, 2, ..., m}. The standardized sequential rank $Q_t^m$ is given by

$$Q_t^m = \frac{2}{m}\left(Q_t^* - \frac{m + 1}{2}\right). \qquad (2)$$

It is uniformly distributed on the m points {1/m − 1, 3/m − 1, ..., 1 − 1/m} with mean $\mu_{Q_t^m} = 0$ and variance $\sigma^2_{Q_t^m} = \frac{m^2 - 1}{3m^2}$, see Hackl and Ledolter (1992). The control statistic $T_t$ is the EWMA of standardized sequential ...
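The stated support, mean and variance of the standardized sequential rank can be checked with exact rational arithmetic. The Python sketch below is a minimal check under assumptions: the function name and the reference-sample size m = 10 are chosen for illustration only.

```python
from fractions import Fraction

def standardized_sequential_rank(rank, m):
    """Map a sequential rank in {1, ..., m} to (2 / m) * (rank - (m + 1) / 2)."""
    if not 1 <= rank <= m:
        raise ValueError("rank must lie in 1..m")
    return Fraction(2, m) * (rank - Fraction(m + 1, 2))

m = 10  # illustrative reference-sample size
support = [standardized_sequential_rank(r, m) for r in range(1, m + 1)]

# Moments of a uniform distribution on the m support points.
mean = sum(support) / m
variance = sum(q * q for q in support) / m - mean ** 2
```

Because a sequential rank without ties is uniform on {1, ..., m}, with mean (m + 1)/2 and variance (m² − 1)/12, rescaling by 2/m gives mean 0 and variance (m² − 1)/(3m²), which is what the exact computation returns.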