Design and analysis of computer experiments for stochastic systems


DESIGN AND ANALYSIS OF COMPUTER EXPERIMENTS FOR STOCHASTIC SYSTEMS

YIN JUN
(B.Eng., University of Science and Technology of China)

A THESIS SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
DEPARTMENT OF INDUSTRIAL & SYSTEMS ENGINEERING
NATIONAL UNIVERSITY OF SINGAPORE
2012

DECLARATION

I hereby declare that this thesis is my original work and has been written by me in its entirety. I have duly acknowledged all the sources of information which have been used in the thesis. This thesis has also not been submitted for any degree in any university previously.

YIN JUN
June 2012

ACKNOWLEDGEMENTS

First and foremost, I offer my sincerest gratitude to my supervisor, A/Prof. NG Szu Hui, who has supported me throughout my Ph.D. study with her patience and encouragement. I am grateful for her suggestions and comments on all of my research work; none of this would have been possible without her efforts. I would also like to thank my co-supervisor, A/Prof. NG Kien Ming, for his kind guidance and valuable suggestions during the writing of this thesis. My parents supported me throughout my entire study in China and Singapore; I was away from home for a long time and could not take good care of my family, and I offer my sincere gratitude and love to them. To all of my classmates and friends in Singapore, CHEN Ruifeng, HAN Dongling, LV Yang, LIU Xiangjun, LIU Jin, MU Aoran, XIONG Chengjie, YU Jinfeng, SHENG Xiaoming and Dr. Lim Yee Nah from NUH: I could not have got through without your encouragement and help. Last but not least, I want to say thank you to my wife, ZHONG Ying, for all her understanding and support of my work.

ABSTRACT

This thesis studies the design and analysis of computer experiments for stochastic simulations. Stochastic simulation models play an important role in modern industrial and managerial applications; however, their stochastic responses increase the difficulty of conducting analysis and experiments.
This thesis proposes the kriging metamodel with modified nugget effect as a solution to the more general stochastic simulation scenario with heterogeneous variances. The results suggest that the proposed model performs better than the existing models in terms of model prediction and parameter estimation, by more appropriately accounting for the influence of random noise. The parameter estimation uncertainty problem for kriging metamodels in stochastic simulation is further investigated. Based on the proposed model, a two-stage optimization algorithm is also developed as a solution to stochastic simulation optimization in the heteroscedastic case; the numerical results suggest that the proposed model can effectively reduce the erratic behavior of the predictor by more appropriately accounting for the influence of the stochastic responses. Last, a Bayesian metamodeling and two-stage sequential design approach is developed to overcome the parameter estimation uncertainty issue and to use the limited computing budget in practice efficiently.

Keywords: simulation, metamodels, optimization, design of experiments, stochastic systems, discrete event simulation

Contents

1 INTRODUCTION . . . 1
  1.1 Computer Simulation Model and Computer Experiments . . . 1
  1.2 Deterministic Simulation Model and Computer Experiments . . . 3
  1.3 Stochastic Simulation Model and Computer Experiments
  1.4 Objective and Scope
  1.5 Organization
2 LITERATURE REVIEW . . . 12
  2.1 Review of Metamodels . . . 12
    2.1.1 Polynomial Regression Model . . . 12
    2.1.2 Spatial Correlation Model . . . 13
    2.1.3 Multivariate Adaptive Regression Splines Model . . . 14
    2.1.4 Radial Basis Function Model . . . 15
    2.1.5 Artificial Neural Network Model . . . 16
  2.2 Review of Kriging Metamodel in Computer Experiments . . . 17
    2.2.1 Kriging Metamodel in Homoscedastic case . . . 18
    2.2.2 Kriging Model in Heteroscedastic case . . . 20
  2.3 Review of Design of Experiment for Computer Simulation . . . 22
    2.3.1 Space-filling Designs . . . 23
      2.3.1.1 Latin hypercube design . . . 23
      2.3.1.2 Uniform design . . . 24
      2.3.1.3 Distance dependent design . . . 25
  2.4 Designs Based on Optimization Criterion . . . 25
    2.4.1 Response surface methodology . . . 25
    2.4.2 Trust region method . . . 26
    2.4.3 Efficient global optimization . . . 27
3 KRIGING METAMODEL WITH MODIFIED NUGGET EFFECT . . . 29
  3.1 Introduction . . . 29
    3.1.1 Differences from the stochastic kriging model . . . 34
    3.1.2 Organization . . . 35
  3.2 Kriging Model with Modified Nugget Effect . . . 35
    3.2.1 Classic kriging (deterministic and nugget effect model) . . . 35
    3.2.2 The development of kriging metamodel with modified nugget effect . . . 38
    3.2.3 Parameter estimation and characteristics of likelihood function with noisy data . . . 42
    3.2.4 Error measurement . . . 47
  3.3 Prediction Performance of the Kriging Model with Modified Nugget-effect . . . 48
    3.3.1 Comparison through MSES . . . 48
    3.3.2 Estimating predictor's variance . . . 49
  3.4 Examples . . . 52
    3.4.1 Test Function . . . 52
    3.4.2 M/M/1 queueing system . . . 57
    3.4.3 PAD system . . . 59
4 PARAMETER ESTIMATION FOR KRIGING METAMODEL IN STOCHASTIC SIMULATION . . . 67
  4.1 Introduction . . . 67
  4.2 Decomposition of the Overall Prediction Error for Stochastic Case . . . 69
  4.3 Maximum Likelihood Estimation with Stochastic Response . . . 71
    4.3.1 A simple two-point problem . . . 72
    4.3.2 Analytical Results . . . 73
    4.3.3 Influence of Parameter Estimation on Overall Prediction Error . . . 75
  4.4 Numerical Experiments . . . 76
    4.4.1 One Dimension Quadratic Test Function . . . 76
    4.4.2 Two Dimension Linear Function . . . 78
    4.4.3 Two Dimension Sinusoidal Function . . . 79
5 OPTIMIZATION OF STOCHASTIC SIMULATIONS WITH KRIGING METAMODEL . . . 84
  5.1 Introduction . . . 84
  5.2 The expected improvement function . . . 86
  5.3 Limitations of EGO and SKO in Noisy Heteroscedastic Situations . . . 87
    5.3.1 Characteristics of Good Algorithms and Criteria . . . 91
  5.4 Development of Methodology . . . 92
    5.4.1 The search stage . . . 93
    5.4.2 The allocation stage . . . 93
    5.4.3 An algorithm overview . . . 95
  5.5 Numerical Examples . . . 100
    5.5.1 Single dimension test function (Comparative study) . . . 100
    5.5.2 Two Dimension Keys and Reese (2004) Function (Comparative Study) . . . 102
  5.6 Ocean Liner Example . . . 106
  5.7 Conclusion . . . 111
6 BAYESIAN METAMODELING AND DESIGN APPROACH FOR STOCHASTIC SIMULATIONS . . . 115
  6.1 Introduction . . . 115
  6.2 Model Formulation . . . 118
    6.2.1 Modeling Uncertainty . . . 119
    6.2.2 Observed Data . . . 120
    6.2.3 Bayesian Prediction and Predictive Distribution . . . 120
      6.2.3.1 Derivation of the Predictive Distribution (assuming φZ is known) . . . 121
      6.2.3.2 Modeling of σξ² . . . 121
      6.2.3.3 A further simplification of Equation (6.6) . . . 124
      6.2.3.4 A General Approach to Deriving the Predictive Distribution (when all parameters are unknown) . . . 127
  6.3 Numerical Examples . . . 128
    6.3.1 The Simple Quadratic Function . . . 128
    6.3.2 The M/M/1 System . . . 132
  6.4 Sequential Experimental Design Approach . . . 135
    6.4.1 The two stage design framework . . . 135
    6.4.2 A follow-up design criterion . . . 136
    6.4.3 Simplification and decomposition of the IMSPE . . . 137
      6.4.3.1 A simplified Stage design for the two point example . . . 139
      6.4.3.2 A numerical study on the EIMSPE for different design options . . . 142
    6.4.4 Improved two-stage design approaches . . . 146
      6.4.4.1 One-Point-at-A-Time (OPAT) sequential design approach . . . 146
      6.4.4.2 Simple two-stage design approach . . . 148
  6.5 Comments and Conclusions . . . 150
7 CONCLUSION . . . 152
  7.1 Main findings . . . 152
  7.2 Future research . . . 154
References . . . 169
A Kriging predictor and kriging variance for heteroscedastic model . . . 170
B MSE for the modified nugget-effect and nugget-effect model . . . 172
C Proof for the two-stage algorithm . . . 175
D Details of the two-point example . . . 178
E Estimating Predictor Variance by Delta Method . . . 180
F Proof for Proposition . . . 182
G Posterior distribution of the parameters . . . 184
H Posterior distribution of σZ² . . . 188

Appendix B  MSE for the modified nugget-effect and nugget-effect model

The MSE_S at the $i$th observation point for the kriging model with modified nugget effect is

$$\mathrm{MSE}_S(x_i) = \iota_1\left[1 - \Big(c + F\,\frac{1 - F^T R^{-1} c}{F^T R^{-1} F}\Big)^{T} R^{-1} c + \frac{1 - F^T R^{-1} c}{F^T R^{-1} F}\right] - \iota^*_{x_i}.$$

Expanding through the noise matrix $\eta$ produces terms of the form $F^T R^{-1}(R^{-1} + \eta^{-1})^{-1} R^{-1} c$, which at the $i$th design point reduce to $F^T R^{-1}\eta e_i$, with $e_i$ the $i$th unit vector. Let $\Delta_m = F^T R^{-1} F$ indicate the summation of all the elements in the inverse correlation matrix, and let $\Delta_{mi}$ represent the summation of its $i$th column (or row), so that $F^T R^{-1}\eta e_i = \eta_i\,\Delta_{mi}$.
Then we have

$$\begin{aligned}
\mathrm{MSE}_S(x_i) &= \iota_1\left[1 - \Big(c + F\,\frac{\eta_i\Delta_{mi}}{\Delta_m}\Big)^{T} R^{-1} c + \frac{\eta_i\Delta_{mi}}{\Delta_m}\right] - \iota^*_{x_i}\\
&= \iota_1\left[-\eta_i\Big(\frac{\Delta_{mi}}{\Delta_m} - 1\Big)\big(1 - \eta_i\Delta_{mi}\big) + \frac{\eta_i\Delta_{mi}}{\Delta_m}\right] - \iota^*_{x_i}\\
&= \iota_1\left[\eta_i + \eta_i^2\,\frac{\Delta_{mi}^2}{\Delta_m} - \eta_i^2\,\Delta_{mi}\right] - \iota^*_{x_i}.
\end{aligned}$$

For the modified nugget-effect model, $\eta_i = \iota^*_{x_i}/\iota_1$, so

$$\mathrm{MSE}_S(x_i) = \iota_1\Big(\eta_i^2\,\frac{\Delta_{mi}^2}{\Delta_m} - \eta_i^2\,\Delta_{mi}\Big).$$

Appendix C  Proof for the two-stage algorithm

Assume a single-dimension optimization problem of finding

$$Z^*(x) = \min_{x\in[x_a,\,x_b]} Z(x),$$

where $x$ is the input variable, $Z(x)$ is the response, and $x_a$ and $x_b$ are the lower and upper bounds of the design space. Based on the algorithm given in Section 5.4.3, consider the case where the total computing budget $T$ is large enough and the stopping criterion for the algorithm is $\mathrm{mEI}(x^*) < l$, with $x^*$ the current best location that maximizes the mEI function. Now suppose we have $n$ observations $Z(x_1), Z(x_2), \dots, Z(x_n)$ at locations $x_1, x_2, \dots, x_n$. At these observed locations,

$$\mathrm{mEI}(x_i) = \big(Z_{\min} - \hat Z_m(x_i)\big)\,\Phi\!\left(\frac{Z_{\min} - \hat Z_m(x_i)}{s_Z(x_i)}\right) + s_Z(x_i)\,\phi\!\left(\frac{Z_{\min} - \hat Z_m(x_i)}{s_Z(x_i)}\right), \qquad i = 1,\dots,n,$$

where $\hat Z_m(\cdot)$ is the MNEK predictor and $s_Z(\cdot)$ is the standard deviation of the spatial uncertainty. Since the spatial uncertainty at an observed location is zero, $s_Z(x_i) = 0$ and the second term vanishes; and since $Z_{\min}$ is the current best value, $Z_{\min} < \hat Z_m(x_i)$, so the $\Phi$ term tends to zero as well. Hence $\mathrm{mEI}(x_i) = 0$ for $i = 1,\dots,n$: the mEI function equals zero at the design points $x_1, x_2, \dots, x_n$.
According to the algorithm in Section 5.4.3, the stopping criterion is $\mathrm{mEI}(x^*) \le l$, which means the algorithm will not sample in regions where the mEI value is lower than $l$. Given that

$$\frac{\partial\,\mathrm{mEI}(x)}{\partial s_Z(x)} = \phi\!\left(\frac{Z_{\min} - \hat Z_m(x)}{s_Z(x)}\right) > 0,$$

the mEI function is monotonically increasing in $s_Z(x)$. Moreover, the spatial uncertainty $s_Z^2(x_0)$ equals zero when $x_0 = x_i$, $i = 1,\dots,n$, and increases as the minimum distance $d_{0i} = |x_0 - x_i|$ between the prediction point $x_0$ and the existing design points increases. Hence we can identify a small region $D_i$ around each existing design point $x_i$ such that $\mathrm{mEI}(x_0) < l$ for all $x_0 \in D_i$. The length of this region is $\tilde d_i = |x_i^u - x_i^l|$, where $x_i^u$ and $x_i^l$ satisfy

$$\frac{s_Z(x_i^l)}{\sqrt{2\pi}} = \frac{s_Z(x_i^u)}{\sqrt{2\pi}} = l, \qquad s_Z^2(x) = \sigma_Z^2\left(1 - c(x)^T R_Z^{-1} c(x) + \frac{\big(1 - F^T R_Z^{-1} c(x)\big)^2}{F^T R_Z^{-1} F}\right).$$

Since, writing $u_0 = \big(Z_{\min} - \hat Z_m(x_0)\big)/s_Z(x_0)$,

$$\mathrm{mEI}(x_0) = \big(Z_{\min} - \hat Z_m(x_0)\big)\,\Phi(u_0) + s_Z(x_0)\,\phi(u_0) \le s_Z(x_0)\,\phi(u_0) \le s_Z(x_0)\,\phi(0) = \frac{s_Z(x_0)}{\sqrt{2\pi}},$$

we have $\mathrm{mEI}(x_0) \le l$ for all $x_0 \in D_i$ bounded by $x_i^l$ and $x_i^u$. So, given stopping criterion $l$, the maximum number of design points is the finite value

$$n_{\max} = \frac{b - a}{\min_{i}\, \tilde d_i(l)}.$$

When $l \to 0$, $x_i^l = x_i^u \to x_i$, hence $\tilde d_i \to 0$ and $n_{\max} \to \infty$. Given that the mEI function can visit any possible unobserved location in $[a, b]$, we have

$$\lim_{n\to\infty}\ \max_{i=1,\dots,n}\,(x_i - x_{i-1}) = 0,$$

so the observed set of design points is dense in $[a, b]$, and it follows that $\lim_{n\to\infty} Z_n^* = Z^*$. According to the algorithm, once $\mathrm{mEI}(x^*) \le l$, the algorithm stops observing new design points and distributes all the remaining computing budget to the existing design points. Since we adopt OCBA for the Allocation Stage, the algorithm is guaranteed to converge to the optimum as $n \to \infty$; the convergence property of OCBA can be found in Chen et al. (2000).
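The mEI criterion and the $\sqrt{2\pi}$ bound used in this proof are simple to evaluate numerically. The following is a minimal sketch, not the thesis's implementation; the predictor mean `z_hat` and spatial standard deviation `s_z` are assumed to come from a fitted kriging model:

```python
import math

def normal_pdf(u):
    """Standard normal density phi(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def normal_cdf(u):
    """Standard normal CDF Phi(u)."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def m_ei(z_min, z_hat, s_z):
    """EI-style criterion used in the proof: it vanishes when the spatial
    uncertainty s_z is zero, e.g. at an already-observed design point."""
    if s_z <= 0.0:
        return 0.0
    u = (z_min - z_hat) / s_z
    return (z_min - z_hat) * normal_cdf(u) + s_z * normal_pdf(u)

def upper_bound(s_z):
    """The proof's bound: mEI(x) <= s_z * phi(0) = s_z / sqrt(2*pi),
    valid when z_min <= z_hat (no expected improvement in the mean)."""
    return s_z / math.sqrt(2.0 * math.pi)
```

By the monotonicity argument above, shrinking `s_z` near existing design points drives `m_ei` under any threshold `l`, which is exactly what bounds the number of distinct design points.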
Appendix D  Details of the two-point example

For the two-point example in Figure 6.3, with the linear correlation function and the constant underlying mean function for the $Z$ process, we have

$$\Sigma = \sigma_Z^2\begin{pmatrix}1 + \tau/r_0 & \rho_0\\ \rho_0 & 1 + \tau/r_0\end{pmatrix}, \qquad v^T = \sigma_Z^2\Big(\frac{1+\rho_0}{2},\ \frac{1+\rho_0}{2}\Big), \qquad F^T = (1, 1), \qquad F(x_0) = 1.$$

Suppose an equal number of replications $r_0$ is taken at both design points, and $D_n^T = [Y_n^T, S_n^{2T}]$ with $Y_n^T = (y_1, y_2)$ and $S_n^{2T} = (s_1^2, s_2^2)$. With homogeneous variance and approximately equal sample variances, $s_1^2 = s_2^2 = s_0^2$, so $S_n^{2T} = (s_0^2, s_0^2)$. Assuming the following non-informative (Jeffreys) priors,

$$p(\beta) = 1, \qquad p(\sigma_Z^2) = 1/\sigma_Z^2, \qquad p(\tau) = \frac{1}{2(1+\tau)^2},$$

the posterior predictive distribution of $Z(x_0)$ can be obtained following Equation (6.6):

$$\begin{aligned}
f(Z(x_0)\mid D_n) &= \int f(Z(x_0)\mid \beta,\tau,\sigma_Z^2, D_n)\,f(\beta\mid \sigma_Z^2, D_n)\,f(\tau\mid \sigma_Z^2, D_n)\,f(\sigma_Z^2\mid D_n)\,d\beta\,d\sigma_Z^2\,d\tau\\
&\propto \int f(Z(x_0)\mid \beta,\tau,\sigma_Z^2, D_n)\,f(Y_n\mid \beta,\tau,\sigma_Z^2)\,f(s_0^2\mid \beta,\tau,\sigma_Z^2)\,f(\beta\mid \sigma_Z^2)\,f(\tau\mid \sigma_Z^2)\,f(\sigma_Z^2)\,d\beta\,d\sigma_Z^2\,d\tau,
\end{aligned}$$

where the likelihood $f(Y_n\mid \beta,\tau,\sigma_Z^2)$ follows a normal distribution and $f(s_0^2\mid \beta,\tau,\sigma_Z^2)$ follows the Pearson type III distribution; see Chapter 11 of Stuart & Ord (1994).
$$f(s_0^2) \propto \Big(\frac{r_0}{2\tau\sigma_Z^2}\Big)^{(r_0-1)/2}\,(s_0^2)^{(r_0-3)/2}\,\exp\Big(-\frac{r_0 s_0^2}{2\tau\sigma_Z^2}\Big).$$

With $f(\beta\mid\sigma_Z^2) \propto 1$ and $f(\tau\mid\sigma_Z^2) \propto \frac{1}{2(1+\tau)^2}$,

$$\begin{aligned}
f(Z(x_0)\mid D_n) \propto \int &(\sigma_Z^2)^{-1/2}\exp\left(-\frac{\big(Z(x_0) - F(x_0)\beta - c^T(x_0)R^{-1}(Y_n - F\beta)\big)^2}{2\sigma_Z^2\big(1 - c^T(x_0)R^{-1}c(x_0)\big)}\right)\\
&\cdot (\sigma_Z^2)^{-1}\exp\Big(-\frac{(Y_n - F\beta)^T\Sigma^{-1}(Y_n - F\beta)}{2}\Big)\cdot (\sigma_Z^2)^{-\frac{r_0+1}{2}}\exp\Big(-\frac{r_0 s_0^2}{2\tau\sigma_Z^2}\Big)\,\frac{1}{2(1+\tau)^2}\ d\beta\,d\sigma_Z^2\,d\tau.
\end{aligned}$$

Following the simplification steps in Appendix F, with

$$\begin{aligned}
p &= Z(x_0) - c^T(x_0)R^{-1}Y_n = \frac{-r_0\,(y_1 + y_2 - 2Z(x_0))(1+\rho_0) + 2Z(x_0)\,\tau}{2\,(r_0 + r_0\rho_0 + \tau)},\\
k &= c^T(x_0)R^{-1}F - F^T(x_0) = -\frac{\tau}{r_0 + r_0\rho_0 + \tau},\\
A &= \frac{k^T k}{1 - c^T(x_0)R^{-1}c(x_0)} + F^T R^{-1} F = \frac{2\,(r_0 - r_0\rho_0 + \tau)}{r_0 - r_0\rho_0^2 + 2\tau},\\
B &= \frac{p^T k}{1 - c^T(x_0)R^{-1}c(x_0)} + Y_n^T R^{-1} F = \frac{r_0\,(y_1+y_2)(1-\rho_0) + 2Z(x_0)\,\tau}{r_0\,(1-\rho_0^2) + 2\tau},
\end{aligned}$$

we get the posterior predictive distribution as a Student-t kernel centered at the sample average,

$$f(Z(x_0)\mid D_n) \propto \left[1 + \frac{\big(Z(x_0) - \frac{y_1+y_2}{2}\big)^2}{\nu}\right]^{-\frac{1+r_0}{2}},$$

where the scale $\nu$ combines $\frac{s_0^2}{r_0(1-\rho_0)}$, $\frac{s_0^2}{r_0}$ and a term in $(y_1 - y_2)^2$.

Appendix E  Estimating Predictor Variance by Delta Method

Chapter 6 provides a Bayesian perspective on kriging metamodeling for stochastic simulation, in order to better account for the parameter estimation uncertainties. Alternative approaches, such as the delta method, can be used to estimate the predictor's uncertainty with unknown parameters. As stated in Section 6.2.1, the typical unknown parameters for kriging metamodeling are $\theta = (\beta, \sigma_Z^2, \phi_Z, \sigma_\xi^2)$. If MLE is used to estimate all these parameters, the estimators $\hat\beta, \hat\sigma_Z^2, \hat\phi_Z, \hat\sigma_\xi^2$ are asymptotically normal. The delta method can then be used to derive an approximation of the probability distribution of the predictor, considering it as a function of all these estimated parameters; see Agresti (2002) for details on the delta method. Based on the predictor form in Eq. (3.17), we can write the predictor as

$$\hat Z\big(x_0, \hat\beta, \hat\sigma_Z^2, \hat\phi_Z, \hat\sigma_\xi^2\big) = F(x_0)\hat\beta + \hat\sigma_Z^2\, c(x_0, \hat\phi_Z)^T \big(\hat\sigma_Z^2 R_Z(\hat\phi_Z) + \hat\sigma_\xi^2 R_\varepsilon\big)^{-1}\big(\bar Y - F\hat\beta\big). \tag{E.1}$$

Given that the ML estimators are asymptotically normal, $\hat\theta = (\hat\beta, \hat\sigma_Z^2, \hat\phi_Z, \hat\sigma_\xi^2) \sim N(\theta^*, \Sigma_\theta)$, where $\theta^*$ is the true value of the unknown parameters and $\Sigma_\theta$ is the covariance matrix, the predictor variance can be approximated by

$$\widehat{\mathrm{Var}}\big(\hat Z(x_0, \hat\theta)\big) = \nabla\hat Z(\hat\theta)^T\, \Sigma_\theta\, \nabla\hat Z(\hat\theta). \tag{E.2}$$

$\Sigma_\theta$ can be given as the inverse of the Fisher information matrix of $\theta$ when the usual regularity conditions are satisfied; see Miura (2011) for details. For the simple two-point problem in Section 6.2.3.2, we assume the parameters $\phi_Z$ are known, so that $\theta = (\beta, \sigma_Z^2, \sigma_\xi^2)$ and

$$I(\theta) = -E\left[\frac{\partial^2 l}{\partial\theta\,\partial\theta^T}\right].$$

The predictor can be given as

$$\hat Z(x_0) = \hat\beta + \hat\sigma_Z^2\Big(\frac{1+\rho_0}{2},\ \frac{1+\rho_0}{2}\Big)\begin{pmatrix}\hat\sigma_Z^2 + \hat\sigma_\xi^2/r_0 & \hat\sigma_Z^2\rho_0\\ \hat\sigma_Z^2\rho_0 & \hat\sigma_Z^2 + \hat\sigma_\xi^2/r_0\end{pmatrix}^{-1}\begin{pmatrix}y_1 - \hat\beta\\ y_2 - \hat\beta\end{pmatrix},$$

and the delta-method variance $\widehat{\mathrm{Var}}\big(\hat Z(x_0)\big) = \nabla\hat Z(\hat\theta, x_0)^T\, I^{-1}\, \nabla\hat Z(\hat\theta, x_0)$ evaluates to

$$\widehat{\mathrm{Var}}\big(\hat Z(x_0)\big) = \frac{(1-\rho_0)\,\Delta_y}{2r_0\,(1+\rho_0)} + \frac{s_0^2}{r_0} + m_{\mathrm{mix}}, \qquad \Delta_y = (y_1 - y_2)^2,$$

where $m_{\mathrm{mix}}$ denotes a mixed-effect component involving $r_0^2(1-\rho_0^2)(r_0\Delta_y - 4s_0^2)$ and $s_0^2(1+\rho_0)\big(4s_0^2 + r_0\Delta_y(1-\rho_0)\big)$. Compared with the results in Equation (6.7), the predictor variance given by the delta method still consists of three components, representing the uncertainties from the observations $y_1$ and $y_2$, the random noise $s_0^2$, and the mixed effect of estimating $\sigma_Z^2$ and $\sigma_\xi^2$. However, the mixed-effect component is slightly different from the result in Section 6.2.3.2; this is probably due to the prior setting used there.
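The approximation in (E.2) can also be evaluated numerically when analytic gradients are inconvenient. A hedged sketch (illustrative only; `predictor` stands for any smooth function of the estimated parameters, and the gradient is taken by central finite differences):

```python
import numpy as np

def delta_method_variance(predictor, theta_hat, cov_theta, h=1e-6):
    """First-order delta-method variance of predictor(theta) at theta_hat:
    Var ~= grad^T Sigma_theta grad, with the gradient approximated by
    central finite differences (step h per coordinate)."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    grad = np.zeros_like(theta_hat)
    for j in range(theta_hat.size):
        step = np.zeros_like(theta_hat)
        step[j] = h
        grad[j] = (predictor(theta_hat + step) - predictor(theta_hat - step)) / (2.0 * h)
    return float(grad @ np.asarray(cov_theta) @ grad)

# Illustration on a linear "predictor" Z(theta) = a^T theta, for which the
# delta method is exact: Var = a^T Sigma_theta a.
a = np.array([1.0, 2.0])
cov_theta = np.array([[0.5, 0.1],
                      [0.1, 0.2]])
var_hat = delta_method_variance(lambda t: float(a @ t), np.array([0.3, -0.4]), cov_theta)
```

For the linear test case the delta method is exact, which makes it a convenient sanity check before applying the same routine to the kriging predictor in (E.1).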
Appendix F  Proof for Proposition

If $\tau$ is known, the posterior distribution in (6.6) can be given as

$$f(Z(x_0)\mid Y_n) = \int f(Z(x_0)\mid Y_n, \beta, \sigma_Z^2)\,f(Y_n\mid \beta, \sigma_Z^2)\,f(\beta\mid \sigma_Z^2)\,f(\sigma_Z^2)\,d\beta\,d\sigma_Z^2,$$

where

$$f(Z(x_0)\mid Y_n,\beta,\sigma_Z^2) \propto (\sigma_Z^2)^{-1/2}\exp\left(-\frac{\big(Z(x_0) - F^T(x_0)\beta - v^T(x_0)\Sigma^{-1}(Y_n - F\beta)\big)^2}{2\big(\sigma_Z^2 - v^T(x_0)\Sigma^{-1}v(x_0)\big)}\right),$$

and the remaining factors are the normal likelihood of $Y_n$, the $N_p(w_0, \sigma_Z^2 Q_0)$ prior on $\beta$, and the $IG(\alpha_z, \gamma_z)$ prior on $\sigma_Z^2$. Collecting the terms in $\beta$ into the quadratic form $\exp\big(-\frac{1}{2\sigma_Z^2}(\beta^T A\beta - 2B^T\beta)\big)$ and integrating over $\beta$ gives

$$f(Z(x_0)\mid Y_n) \propto \int (\sigma_Z^2)^{-\alpha_z - n/2 - 3/2}\,\sqrt{\det 2\pi A^{-1}}\,\exp\Big(-\frac{\Omega}{\sigma_Z^2}\Big)\,d\sigma_Z^2,$$

where

$$\begin{aligned}
A &= \frac{k^T k}{1 - c^T(x_0)R^{-1}c(x_0)} + F^T R^{-1} F + Q_0^{-1},\\
B &= \frac{p\,k}{1 - c^T(x_0)R^{-1}c(x_0)} + F^T R^{-1}Y_n + Q_0^{-1}w_0,\\
p &= Z(x_0) - c^T(x_0)R^{-1}Y_n, \qquad k = c^T(x_0)R^{-1}F - F^T(x_0),\\
\Sigma &= \sigma_Z^2 R = \sigma_Z^2(R_Z + R_\xi), \qquad v(x_0) = \sigma_Z^2\,c(x_0).
\end{aligned}$$

Collecting all terms involving $\sigma_Z^2$ and integrating, we have

$$f(Z(x_0)\mid Y_n) \propto \Omega^{-\alpha_z - n/2 - 1/2} \propto \left[1 + \frac{\big(Z(x_0) - \mu_p(x_0)\big)^2}{(2\alpha_z + n)\,\sigma_p^2(x_0)}\right]^{-\alpha_z - n/2 - 1/2},$$

where

$$\begin{aligned}
\Omega &= \frac{1}{2}\left(\frac{p^2}{1 - c^T(x_0)R^{-1}c(x_0)} + Y_n^T R^{-1}Y_n + w_0^T Q_0^{-1}w_0 + 2\gamma_z - B^T A^{-1}B\right),\\
\mu_p(x_0) &= F(x_0)\,\lambda/M + c^T(x_0)R^{-1}\big(Y_n - F\lambda/M\big),\\
\sigma_p^2(x_0) &= \frac{2\gamma_z + Y_n^T R^{-1}Y_n + w_0^T Q_0^{-1}w_0 - \lambda^T M^{-1}\lambda}{2\alpha_z + n}\,\Big(1 - c^T(x_0)R^{-1}c(x_0) + k^T k/M\Big),\\
\lambda &= F^T R^{-1}Y_n + Q_0^{-1}w_0, \qquad M = F^T R^{-1}F + Q_0^{-1}.
\end{aligned}$$

Appendix G  Posterior distribution of the parameters

To derive the distribution $f(\sigma_\xi^2(x_0)\mid S_n^2, \phi_V)$, we first consider the distribution of the log of the noise level. As we place an independent Gaussian process prior over the log of the noise level, if the observations $S_n^2$ are assumed sufficiently accurate to ignore the sampling noise and an exact interpolation model is adequate, then from the results in Santner et al. (2003), p. 95, the conditional distribution $f(V_{x_0}\mid S_n^2, \phi_V)$ is a non-central t distribution with mean $\mu_V$ and covariance $\Sigma_V$.
Assume $V_n$ follows a multivariate normal distribution with mean $F\beta_V$ and covariance matrix $\sigma_V^2 R_V$, let $c_V(x_0)$ and $R_V$ represent the correlation vector and correlation matrix (both functions of $\phi_V$), and further assume the prior distributions

$$f(\beta_V, \sigma_V^2, \phi_V) = f(\beta_V\mid\sigma_V^2)\,f(\sigma_V^2)\,f(\phi_V), \quad f(\beta_V\mid\sigma_V^2) \sim N_p(w_V, \sigma_V^2 Q_V), \quad f(\sigma_V^2) \sim IG(\alpha_V, \gamma_V), \quad f(\phi_V) \propto \exp(-b_V\phi_V)\,\phi_V^{a_V-1}. \tag{G.1}$$

Then the conditional can be given as

$$f(V_{x_0}\mid S_n^2, \phi_V) \propto T_1(n + 2\alpha_V,\ \mu_V,\ \Sigma_V),$$

where

$$\mu_V = F(x_0)^T\hat\beta_V + c_V^T(x_0)\,R_V^{-1}\big(S_n^2 - F\hat\beta_V\big), \qquad \hat\beta_V = \frac{F^T R_V^{-1}S_n^2 + Q_V^{-1}w_V}{F^T R_V^{-1}F + Q_V^{-1}},$$

$$\Sigma_V = \frac{2\gamma_V + S_n^{2T}\big(R_V^{-1} - R_V^{-1}F(F^T R_V^{-1}F)^{-1}F^T R_V^{-1}\big)S_n^2 + \kappa^T\big((F^T R_V^{-1}F)^{-1} + Q_V\big)^{-1}\kappa}{n + 2\alpha_V}\left(1 - c_V^T R_V^{-1}c_V + \frac{\big(c_V^T R_V^{-1}F - F(x_0)\big)^T\big(c_V^T R_V^{-1}F - F(x_0)\big)}{F^T R_V^{-1}F}\right),$$

with $\kappa = w_V - (F^T R_V^{-1}F)^{-1}F^T R_V^{-1}S_n^2$. Since $V_{x_0} = \ln\big(\sigma_\xi^2(x_0)\big)$, the conditional distribution of the noise variance follows by a change of variables:

$$f\big(\sigma_\xi^2(x_0)\mid S_n^2, \phi_V\big) = \frac{f\big(V_{x_0}\mid S_n^2, \phi_V\big)}{\sigma_\xi^2(x_0)}.$$

The conditional distributions of the parameters $\sigma_V^2$ and $\phi_V$ can be given as

$$f(\sigma_V^2\mid S_n^2, \phi_V) \propto \int f(S_n^2\mid \beta_V, \sigma_V^2, \phi_V)\,f(\beta_V\mid\sigma_V^2)\,f(\sigma_V^2)\,d\beta_V, \qquad f(\phi_V\mid S_n^2, \sigma_V^2) \propto \int f(S_n^2\mid \beta_V, \sigma_V^2, \phi_V)\,f(\beta_V\mid\sigma_V^2)\,f(\phi_V)\,d\beta_V,$$

and, given the prior distributions in (G.1), these integrals have closed-form kernels in

$$\lambda_V(\phi_V) = F^T R_V^{-1}(\phi_V)\,S_n^2 + Q_V^{-1}w_V, \qquad M_V(\phi_V) = F^T R_V^{-1}(\phi_V)\,F + Q_V^{-1}.$$

The conditional distributions of $\sigma_Z^2$ and $\phi_Z$ are likewise

$$f(\sigma_Z^2\mid Y_n, \phi_Z) \propto \int f(Y_n\mid \beta, \sigma_Z^2, \phi_Z)\,f(\beta\mid\sigma_Z^2)\,f(\sigma_Z^2)\,d\beta, \qquad f(\phi_Z\mid Y_n, \sigma_Z^2) \propto \int f(Y_n\mid \beta, \sigma_Z^2, \phi_Z)\,f(\beta\mid\sigma_Z^2)\,f(\phi_Z)\,d\beta.$$

With the prior distributions in (6.8), the integrands can be simplified using

$$f(Y_n\mid\beta,\sigma_Z^2,\phi_Z) \propto (\det\Sigma)^{-1/2}\exp\Big(-\tfrac{1}{2}(Y_n - F\beta)^T\Sigma^{-1}(Y_n - F\beta)\Big), \qquad f(\beta\mid\sigma_Z^2) \propto (\sigma_Z^2)^{-1/2}\exp\Big(-\tfrac{1}{2\sigma_Z^2}(\beta - w_0)^T Q_0^{-1}(\beta - w_0)\Big),$$

$$f(\sigma_Z^2) \propto (\sigma_Z^2)^{-\alpha_z - 1}\exp\big(-\gamma_z/\sigma_Z^2\big), \qquad f(\phi_Z) \propto \exp(-b_z\phi_Z)\,\phi_Z^{a_z - 1}.$$

Therefore, at Gibbs iteration $i$ the conditionals can be obtained as

$$f\big(\sigma_Z^2\mid Y_n, (\phi_Z)^i\big) \propto \big(\det(\sigma_Z^2 M R)\big)^{-1/2}\,(\sigma_Z^2)^{1/2-\alpha_z}\,\exp\left(\frac{\lambda^T M^{-1}\lambda - Y_n^T R^{-1}Y_n - w_0^T Q_0^{-1}w_0 - 2\gamma_z}{2\sigma_Z^2}\right),$$

$$f\big(\phi_Z\mid Y_n, (\sigma_Z^2)^{i-1}\big) \propto \big(\det((\sigma_Z^2)^{i-1} M R)\big)^{-1/2}\,\exp\left(\frac{\lambda^T M^{-1}\lambda - Y_n^T R^{-1}Y_n - w_0^T Q_0^{-1}w_0 - 2\gamma_z}{2(\sigma_Z^2)^{i-1}}\right)\exp(-b_z\phi_Z)\,\phi_Z^{a_z - 1},$$

where

$$\lambda(\phi_Z) = F^T R^{-1}(\phi_Z)\,Y_n + Q_0^{-1}w_0, \qquad M(\phi_Z) = F^T R^{-1}(\phi_Z)\,F + Q_0^{-1},$$

and the conditionals for $\sigma_V^2$ and $\phi_V$ take the same form with $(S_n^2, R_V, Q_V, w_V, \gamma_V, b_V, a_V)$ in place of $(Y_n, R, Q_0, w_0, \gamma_z, b_z, a_z)$. Since direct sampling from these posterior distributions can be difficult, we adopt the acceptance-rejection method for posterior sampling within the Gibbs loop of the MCMC implementation.

Appendix H  Posterior distribution of σZ²

The posterior distribution of $\sigma_Z^2$ can be given as

$$f(\sigma_Z^2\mid y_D) = \int f(y_D\mid\beta,\sigma_Z^2)\,f(\beta\mid\sigma_Z^2)\,f(\sigma_Z^2)\,d\beta \propto (\sigma_Z^2)^{-\alpha_z - n/2 - 3/2}\exp\Big(-\frac{B_G}{\sigma_Z^2}\Big),$$

with

$$B_G = \frac{2\gamma_z + w_0^T Q_0^{-1}w_0 + y_D^T R_D^{-1}y_D - \lambda_D^T M_D^{-1}\lambda_D}{2}, \qquad \lambda_D = F_D^T R_D^{-1}y_D + Q_0^{-1}w_0, \qquad M_D = F_D^T R_D^{-1}F_D + Q_0^{-1}.$$

Therefore, the posterior distribution is an inverse gamma distribution with mean

$$\frac{2\gamma_z + y_D^T R_D^{-1}y_D + w_0^T Q_0^{-1}w_0 - \lambda_D^T M_D^{-1}\lambda_D}{2\alpha_z + n}.$$
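The posterior mean stated above is a handful of quadratic forms. A hedged NumPy sketch (names such as `Q0_inv` mirror the appendix's notation; the worked check assumes an identity correlation matrix and a flat prior on β, i.e. Q0⁻¹ = 0, which are illustrative choices rather than settings from the thesis):

```python
import numpy as np

def sigma_z2_posterior_mean(y, R, F, w0, Q0_inv, alpha_z, gamma_z):
    """Posterior mean of sigma_Z^2 for the inverse-gamma posterior stated in
    the appendix:
        (2*gamma_z + y'R^{-1}y + w0'Q0^{-1}w0 - lam' M^{-1} lam) / (2*alpha_z + n),
    with lam = F'R^{-1}y + Q0^{-1}w0 and M = F'R^{-1}F + Q0^{-1}."""
    n = y.size
    R_inv = np.linalg.inv(R)
    lam = F.T @ R_inv @ y + Q0_inv @ w0
    M = F.T @ R_inv @ F + Q0_inv
    quad = y @ R_inv @ y + w0 @ Q0_inv @ w0 - lam @ np.linalg.solve(M, lam)
    return (2.0 * gamma_z + quad) / (2.0 * alpha_z + n)

# Worked check: R = I, constant mean, flat prior on beta (Q0^{-1} = 0),
# y = (1, 3): the residual quadratic form is (y1 - y2)^2 / 2 = 2.
post_mean = sigma_z2_posterior_mean(
    y=np.array([1.0, 3.0]),
    R=np.eye(2),
    F=np.ones((2, 1)),
    w0=np.zeros(1),
    Q0_inv=np.zeros((1, 1)),
    alpha_z=2.0,
    gamma_z=1.0,
)
```

With these illustrative hyperparameters the mean is (2 + 2)/(4 + 2), i.e. the prior and the residual variation are pooled exactly as the closed form dictates.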
[...] the design and metamodeling methods for the Design and Analysis of Computer Experiments (DACE) for stochastic systems. In this chapter, we first briefly introduce the background and development of the computer simulation model and computer experiments in Section 1.1. Following this, Section 1.2 will trace the development of metamodels and DACE for deterministic systems, and Section 1.3 will review the development and [...] manner for the characteristics of the computer program itself.

• The metamodel is always involved in the design of computer experiments to provide a simplified version of the simulation model, for higher computational efficiency and lower running cost.

2.3 Review of Design of Experiment for Computer Simulation

Considering the design of experiments for computer simulation with a metamodel, the main objective of [...] and the stochastic kriging model provides a promising approach to handle stochastic inputs with heterogeneous variance. However, other issues like parameter estimation and experimental design still need further investigation for the heteroscedastic case.

Experimenters use the design
simple random sampling method by reducing variation of the sample data and it is relatively easy to realize with computer simulation model Hence LHD is one of the most popular design methods in computer experiment 2.3.1.2 Uniform design Fang (1980) and Wang & Fang (1981) proposed the uniform design as an alternative choice for the space-filling design Different from the randomly generated LHD, uniform design. .. simulation model and the related design of experiment issues Computer simulation model is also ap- 1 1.1 Computer Simulation Model and Computer Experiments plied in meteorological and environmental research, see Watson & Johnson (2004) and Chin & Melone (1999) Computer simulation softwares based on the Finite Element Method(FEM) are popular in Computer Aided Design( CAD) for many engineering design problem,... deterministic and stochastic simulation models leads to different design and analysis approaches for computer experiments In the next section, we will first look into the development of deterministic simulation model and computer experiments 1.2 Deterministic Simulation Model and Computer Experiments Deterministic simulation model are commonly used in the cases where underlying mechanism or averaged behavior of. .. budget and prior information about the random noise For the more general stochastic computer experiments with heterogeneous variance, a suitable model has yet to be found For the computer experiments with the stochastic simulation model, the basic idea is close to conducting experiment on the real physical systems due to the existence of randomness Techniques like replication, blocking and randomization... 
[...] review of the different approaches for simulation optimization. Huang et al. (2006) adapted the EGO scheme for stochastic simulation models and proposed the Sequential Kriging Optimization (SKO) method for optimizing stochastic systems. With the nugget-effect kriging model and augmented EI function, the SKO algorithm accounts for the influence of random [...]

[...] system with a satisfactory accuracy level. Hence, the validation and calibration of computer simulation models are essential in actual practice. How to reduce the differences between the findings of computer experiments and the true mechanism of the real-world systems becomes the key problem for the research of computer experiments. Computer simulation models can be categorized in different ways [...]
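The M/M/1 queue that appears among the thesis's test systems illustrates why such responses are stochastic and heteroscedastic: the waiting-time variance grows rapidly as the utilization ρ = λ/μ approaches 1. A minimal sketch using the Lindley recursion (parameter values are illustrative, not taken from the thesis):

```python
import random

def mm1_mean_wait(arrival_rate, service_rate, n_customers=5000, seed=0):
    """Estimate the mean waiting time in queue for an M/M/1 system via the
    Lindley recursion W_{k+1} = max(0, W_k + S_k - A_{k+1})."""
    rng = random.Random(seed)
    w = 0.0          # waiting time of the current customer
    total = 0.0
    for _ in range(n_customers):
        total += w
        service = rng.expovariate(service_rate)   # S_k
        gap = rng.expovariate(arrival_rate)       # A_{k+1}
        w = max(0.0, w + service - gap)
    return total / n_customers

# Steady-state value for comparison: Wq = rho / (mu - lambda); at
# lambda = 0.5, mu = 1.0 this is 1.0, and finite runs scatter around it.
wait_estimate = mm1_mean_wait(arrival_rate=0.5, service_rate=1.0)
```

Replications of this estimator at different utilizations exhibit exactly the heterogeneous response variance that the modified nugget-effect kriging model is designed to handle.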
