SIMULATION AND THE MONTE CARLO METHOD (Episode 3, excerpt)

2. Solve, for fixed λ and β, the inner problem min_p L(p, λ, β) by solving the first-order conditions

    \nabla_p L(p, \lambda, \beta) = 0.    (1.83)

Denote the optimal solution and the optimal function value obtained from (1.83) as p(λ, β) and L*(λ, β), respectively. The latter is the Lagrange dual function. We have

    p_k(\lambda, \beta) = q_k \exp\Bigl(-\beta - 1 + \sum_{i=1}^{m} \lambda_i S_i(x_k)\Bigr), \quad k = 1, \ldots, n.    (1.84)

Since the {p_k} must sum to 1, we obtain

    e^{\beta + 1} = \sum_{k=1}^{n} q_k \exp\Bigl(\sum_{i=1}^{m} \lambda_i S_i(x_k)\Bigr).    (1.85)

Substituting p(λ, β) back into the Lagrangian gives

    L^*(\lambda, \beta) = -1 + \sum_{i=1}^{m} \lambda_i \gamma_i - \beta.    (1.86)

3. Solve the dual program

    \max_{\lambda, \beta} L^*(\lambda, \beta).    (1.87)

Since β and λ are related via (1.85), solving (1.87) can be done by substituting the corresponding β(λ) into (1.86) and optimizing the resulting function:

    D(\lambda) = -1 + \sum_{i=1}^{m} \lambda_i \gamma_i - \ln \sum_{k=1}^{n} q_k \exp\Bigl\{-1 + \sum_{i=1}^{m} \lambda_i S_i(x_k)\Bigr\}.    (1.88)

Since D(λ) is continuously differentiable and concave with respect to λ, we can derive the optimal solution λ* by solving

    \nabla_\lambda D(\lambda) = 0,    (1.89)

which can be written componentwise in the following explicit form:

    \gamma_j = \frac{\sum_{k=1}^{n} q_k\, S_j(x_k) \exp\{\sum_{i=1}^{m} \lambda_i S_i(x_k)\}}{\sum_{k=1}^{n} q_k \exp\{\sum_{i=1}^{m} \lambda_i S_i(x_k)\}},    (1.90)

for j = 1, ..., m. The optimal vector λ* = (λ*_1, ..., λ*_m) can be found by solving (1.90) numerically. Note that if the primal program has a nonempty interior optimal solution, then the dual program has an optimal solution λ*.

4. Finally, substitute λ = λ* and β = β(λ*) back into (1.84) to obtain the solution to the original MinxEnt program.

It is important to note that we do not need to impose the conditions p_i ≥ 0, i = 1, ..., n, explicitly, because the quantities {p_i} in (1.84) are automatically strictly positive. This is a crucial property of the CE distance; see also [2]. It is instructive (see Problem 1.37) to verify how adding the nonnegativity constraints affects the above procedure.

When inequality constraints E_p[S_i(X)] ≥ γ_i are used in (1.80) instead of equality constraints, the solution procedure remains almost the same. The only difference is that the Lagrange multiplier vector λ must now be nonnegative. It follows that the dual program becomes

    \max_\lambda D(\lambda) \quad \text{subject to:} \quad \lambda \geq 0,

with D(λ) given in (1.88). A further generalization is to replace the above discrete optimization problem with a functional optimization problem. This topic is discussed in Chapter 9. In particular, Section 9.5 deals with the MinxEnt method, which involves a functional MinxEnt problem.
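As an illustration of Steps 3 and 4, the following Matlab sketch solves a small MinxEnt instance numerically. It is not part of the original text: the prior q, the single constraint function S_1(x_k) = k, and the target expectation γ_1 = 3 are made-up illustrative data, and the generic optimizer fminsearch merely stands in for any routine that maximizes the concave dual D(λ).

    % Hedged sketch (illustrative data, not from the text): maximize the dual
    % D(lambda) of (1.88) for n = 4 points and a single constraint (m = 1).
    q   = [0.25 0.25 0.25 0.25];      % prior pdf q on {x_1,...,x_4}
    S   = [1 2 3 4];                  % constraint function values S_1(x_k) = k
    gam = 3;                          % target expectation E_p[S_1(X)]
    % In (1.88) the constants -1 and exp(-1) cancel, leaving
    % D(lambda) = lambda*gam - ln sum_k q_k exp(lambda*S_1(x_k)).
    negD    = @(lam) -(lam*gam - log(q * exp(lam*S)'));
    lamStar = fminsearch(negD, 0);    % Step 3: solve the dual program numerically
    p = q .* exp(lamStar*S);          % Step 4: unnormalized solution from (1.84)
    p = p / sum(p);                   % normalizing fixes beta via (1.85)
    disp(p)                           % strictly positive, and sum(p.*S) is gam

Because gam = 3 exceeds the prior mean 2.5, the optimizer returns a positive λ and the solution p tilts mass toward the larger points x_k, exactly as (1.84) prescribes.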
PROBLEMS

Probability Theory

1.1 Prove the following results, using the properties of the probability measure in Definition 1.1.1 (here A and B are events):
a) P(A^c) = 1 − P(A).
b) P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

1.2 Prove the product rule (1.4) for the case of three events.

1.3 We draw three balls consecutively from a bowl containing exactly five white and five black balls, without putting them back. What is the probability that all drawn balls will be black?

1.4 Consider the random experiment where we toss a biased coin until heads comes up. Suppose the probability of heads on any one toss is p. Let X be the number of tosses required. Show that X ∼ G(p).

1.5 In a room with many people, we ask each person his/her birthday, for example May 5. Let N be the number of people queried until we get a "duplicate" birthday.
a) Calculate P(N > n), n = 0, 1, 2, ...
b) For which n do we have P(N ≤ n) ≥ 1/2?
c) Use a computer to calculate E[N].

1.6 Let X and Y be independent standard normal random variables, and let U and V be random variables that are derived from X and Y via the linear transformation

    \begin{pmatrix} U \\ V \end{pmatrix} = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix}.

a) Derive the joint pdf of U and V.
b) Show that U and V are independent and standard normally distributed.

1.7 Let X ∼ Exp(λ). Show that the memoryless property holds: for all s, t ≥ 0,
P(X > t + s | X > t) = P(X > s).

1.8 Let X1, X2, X3 be independent Bernoulli random variables with success probabilities 1/2, 1/3, and 1/4, respectively. Give their conditional joint pdf, given that X1 + X2 + X3 = 2.

1.9 Verify the expectations and variances in Table 1.3.

1.10 Let X and Y have joint density f given by f(x, y) = c x y for 0 ≤ y ≤ x, 0 ≤ x ≤ 1.
a) Determine the normalization constant c.
b) Determine P(X + 2Y ≤ 1).

1.11 Let X ∼ Exp(λ) and Y ∼ Exp(μ) be independent. Show that
a) min(X, Y) ∼ Exp(λ + μ),
b) P(X < Y | min(X, Y)) = λ / (λ + μ).

1.12 Verify the properties of variance and covariance in Table 1.4.

1.13 Show that the correlation coefficient always lies between −1 and 1. [Hint: use the fact that the variance of aX + Y is always non-negative, for any a.]

1.14 Consider Examples 1.1 and 1.2. Define X as the function that assigns the number x1 + ... + xn to each outcome ω = (x1, ..., xn). The event that there are exactly k heads in n throws can be written as {ω ∈ Ω : X(ω) = k}. If we abbreviate this to {X = k}, and further abbreviate P({X = k}) to P(X = k), then we obtain exactly (1.7). Verify that one can always view random variables in this way, that is, as real-valued functions on Ω, and that probabilities such as P(X ≤ x) should be interpreted as P({ω ∈ Ω : X(ω) ≤ x}).

1.15 Show that

    \mathrm{Var}\Bigl(\sum_{i=1}^{n} X_i\Bigr) = \sum_{i=1}^{n} \mathrm{Var}(X_i) + 2 \sum_{i<j} \mathrm{Cov}(X_i, X_j).

1.16 Let Σ be the covariance matrix of a random column vector X. Write Y = X − μ, where μ is the expectation vector of X. Hence, Σ = E[Y Y^T]. Show that Σ is positive semidefinite. That is, for any vector u, we have u^T Σ u ≥ 0.

1.17 Suppose Y ∼ Gamma(n; λ). Show that for all x ≥ 0

    P(Y \leq x) = 1 - \sum_{k=0}^{n-1} e^{-\lambda x} \frac{(\lambda x)^k}{k!}.    (1.91)

1.18 Consider the random experiment where we draw uniformly and independently n numbers, X1, ..., Xn, from the interval [0, 1].
a) Let M be the smallest of the n numbers. Express M in terms of X1, ..., Xn.
b) Determine the pdf of M.

1.19 Let Y = e^X, where X ∼ N(0, 1).
a) Determine the pdf of Y.
b) Determine the expected value of Y.

1.20 We select a point (X, Y) from the triangle with vertices (0,0), (1,0), and (1,1) in such a way that X has a uniform distribution on (0,1) and the conditional distribution of Y given X = x is uniform on (0, x).
a) Determine the joint pdf of X and Y.
b) Determine the pdf of Y.
c) Determine the conditional pdf of X given Y = y for all y ∈ (0,1).
d) Calculate E[X | Y = y] for all y ∈ (0,1).
e) Determine the expectations of X and Y.

Poisson Processes

1.21 Let {N_t, t ≥ 0} be a Poisson process with rate λ = 2. Find
a) P(N_2 = 1, N_3 = 4, N_5 = 5),
b) P(N_4 = 3 | N_2 = 1, N_3 = 2),
c) E[N_4 | N_2 = 2],
d) P(N[2,7] = 4, N[3,8] = 6),
e) E[N[4,6] | N[1,5] = 3].

1.22 Show that for any fixed k ∈ ℕ, t > 0 and λ > 0,

    \lim_{n \to \infty} \binom{n}{k} \Bigl(\frac{\lambda t}{n}\Bigr)^{k} \Bigl(1 - \frac{\lambda t}{n}\Bigr)^{n-k} = e^{-\lambda t} \frac{(\lambda t)^k}{k!}.

(Hint: write out the binomial coefficient and use the fact that lim_{n→∞} (1 − λt/n)^n = e^{−λt}.)

1.23 Consider the Bernoulli approximation in Section 1.11. Let U1, U2, ... denote the times of success for the Bernoulli process X.
a) Verify that the "intersuccess" times U1, U2 − U1, ... are independent and have a geometric distribution with parameter p = λh.
b) For small h and n = ⌊t/h⌋, show that the relationship P(A1 > t) ≈ P(U1 > n) leads in the limit, as n → ∞, to P(A1 > t) = e^{−λt}.

1.24 If {N_t, t ≥ 0} is a Poisson process with rate λ, show that for 0 ≤ u ≤ t and j = 0, 1, ..., n,

    P(N_u = j \mid N_t = n) = \binom{n}{j} \Bigl(\frac{u}{t}\Bigr)^{j} \Bigl(1 - \frac{u}{t}\Bigr)^{n-j},

that is, the conditional distribution of N_u given N_t = n is binomial with parameters n and u/t.

Markov Processes

1.25 Determine the (discrete) pdf of each X_n, n = 0, 1, 2, ..., for the random walk in Example 1.10. Also, calculate E[X_n] and the variance of X_n for each n.
1.26 Let {X_n, n ∈ ℕ} be a Markov chain with state space {0, 1, 2}, transition matrix

    P = \begin{pmatrix} 0.3 & 0.1 & 0.6 \\ 0.4 & 0.4 & 0.2 \\ 0.1 & 0.7 & 0.2 \end{pmatrix},

and initial distribution π = (0.2, 0.5, 0.3). Determine
a) P(X_1 = 2),
b) P(X_2 = 2),
c) P(X_3 = 2 | X_0 = 0),
d) P(X_0 = 1 | X_1 = 2),
e) P(X_1 = 1, X_3 = 1).

1.27 Consider two dogs harboring a total number of m fleas. Spot initially has b fleas and Lassie has the remaining m − b. The fleas have agreed on the following immigration policy: at every time n = 1, 2, ... a flea is selected at random from the total population, and that flea will jump from one dog to the other. Describe the flea population on Spot as a Markov chain and find its stationary distribution.

1.28 Classify the states of the Markov chain with the following transition matrix (one row of the 5 × 5 matrix is not legible in this copy):

    \begin{pmatrix}
    0.0 & 0.3 & 0.6 & 0.0 & 0.1 \\
    0.0 & 0.3 & 0.0 & 0.7 & 0.0 \\
    0.0 & 0.1 & 0.0 & 0.9 & 0.0 \\
    0.1 & 0.1 & 0.2 & 0.0 & 0.6
    \end{pmatrix}

1.29 Consider the following snakes-and-ladders game (the board, with its start and finish squares, is shown in the original text). Let N be the number of tosses required to reach the finish using a fair die. Calculate the expectation of N using a computer.

1.30 Ms. Ella Brum walks back and forth between her home and her office every day. She owns three umbrellas, which are distributed over two umbrella stands (one at home and one at work). When it is not raining, Ms. Brum walks without an umbrella. When it is raining, she takes one umbrella from the stand at the place of her departure, provided there is one available. Suppose the probability that it is raining at the time of any departure is p. Let X_n denote the number of umbrellas available at the place where Ella arrives after walk number n, n = 1, 2, ..., including the one that she possibly brings with her. Calculate the limiting probability that it rains and no umbrella is available.

1.31 A mouse is let loose in the maze of Figure 1.9. From each compartment the mouse chooses one of the adjacent compartments with equal probability, independent of the past. The mouse spends an exponentially distributed amount of time in each compartment. The mean time spent in each of the compartments 1, 3, and 4 is two seconds; the mean time spent in compartments 2, 5, and 6 is four seconds. Let {X_t, t ≥ 0} be the Markov jump process that describes the position of the mouse for times t ≥ 0. Assume that the mouse starts in compartment 1 at time t = 0.

Figure 1.9 A maze.

What are the probabilities that the mouse will be found in each of the compartments 1, 2, ..., 6 at some time t far away in the future?

1.32 In an M/M/∞ queueing system, customers arrive according to a Poisson process with rate a. Every customer who enters is immediately served by one of an infinite number of servers; hence, there is no queue. The service times are exponentially distributed, with mean 1/b. All service and interarrival times are independent. Let X_t be the number of customers in the system at time t. Show that the limiting distribution of X_t, as t → ∞, is Poisson with parameter a/b.

Optimization

1.33 Let a and x be n-dimensional column vectors. Show that ∇_x a^T x = a.

1.34 Let A be a symmetric n × n matrix and x an n-dimensional column vector. Show that ∇_x (1/2) x^T A x = A x. What is the gradient if A is not symmetric?

1.35 Show that the optimal distribution p* in Example 1.17 is given by the uniform distribution.

1.36 Derive the program (1.78).
1.37 Consider the MinxEnt program

    \min_p \sum_{i=1}^{n} p_i \ln\frac{p_i}{q_i} \quad \text{subject to:} \quad p \geq 0, \; A p = b, \; \sum_{i=1}^{n} p_i = 1,

where p and q are probability distribution vectors and A is an m × n matrix.
a) Show that the Lagrangian for this problem is of the form

    L(p, \lambda, \beta, \mu) = \sum_{i=1}^{n} p_i \ln\frac{p_i}{q_i} + \sum_{j=1}^{m} \lambda_j \Bigl(b_j - \sum_{i=1}^{n} a_{ji} p_i\Bigr) - \sum_{i=1}^{n} \mu_i p_i + \beta \Bigl(\sum_{i=1}^{n} p_i - 1\Bigr).

b) Show that p_i = q_i \exp\bigl(-\beta - 1 + \mu_i + \sum_{j=1}^{m} \lambda_j a_{ji}\bigr), for i = 1, ..., n.
c) Explain why, as a result of the KKT conditions, the optimal μ* must be equal to the zero vector.
d) Show that the solution to this MinxEnt program is exactly the same as for the program where the nonnegativity constraints are omitted.

Further Reading

An easy introduction to probability theory with many examples is [14], and a more detailed textbook is [9]. A classical reference is [7]. An accurate and accessible treatment of various stochastic processes is given in [4]. For convex optimization we refer to [3] and [8].

REFERENCES

1. S. Asmussen and R. Y. Rubinstein. Complexity properties of steady-state rare-events simulation in queueing models. In J. H. Dshalalow, editor, Advances in Queueing: Theory, Methods and Open Problems, pages 429-462, New York, 1995. CRC Press.
2. Z. I. Botev, D. P. Kroese, and T. Taimre. Generalized cross-entropy methods for rare-event simulation and optimization. Simulation: Transactions of the Society for Modeling and Simulation International, 2007. In press.
3. S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, Cambridge, 2004.
4. E. Cinlar. Introduction to Stochastic Processes. Prentice Hall, Englewood Cliffs, NJ, 1975.
5. T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley & Sons, New York, 1991.
6. C. W. Curtis. Linear Algebra: An Introductory Approach. Springer-Verlag, New York, 1984.
7. W. Feller. An Introduction to Probability Theory and Its Applications, volume I. John Wiley & Sons, New York, 2nd edition, 1970.
8. R. Fletcher. Practical Methods of Optimization. John Wiley & Sons, New York, 1987.
9. G. R. Grimmett and D. R. Stirzaker. Probability and Random Processes. Oxford University Press, Oxford, 3rd edition, 2001.
10. J. N. Kapur and H. K. Kesavan. Entropy Optimization Principles with Applications. Academic Press, New York, 1992.
11. A. I. Khinchin. Information Theory. Dover Publications, New York, 1957.
12. V. Kriman and R. Y. Rubinstein. Polynomial time algorithms for estimation of rare events in queueing models. In J. Dshalalow, editor, Frontiers in Queueing: Models and Applications in Science and Engineering, pages 421-448, New York, 1995. CRC Press.
13. E. L. Lehmann. Testing Statistical Hypotheses. Springer-Verlag, New York, 1997.
14. S. M. Ross. A First Course in Probability. Prentice Hall, Englewood Cliffs, NJ, 7th edition, 2005.
15. R. Y. Rubinstein and B. Melamed. Modern Simulation and Modeling. John Wiley & Sons, New York, 1998.

CHAPTER 2

RANDOM NUMBER, RANDOM VARIABLE, AND STOCHASTIC PROCESS GENERATION

2.1 INTRODUCTION

This chapter deals with the computer generation of random numbers, random variables, and stochastic processes. In a typical stochastic simulation, randomness is introduced into simulation models via independent uniformly distributed random variables. These random variables are then used as building blocks to simulate more general stochastic systems.

The rest of this chapter is organized as follows. We start, in Section 2.2, with the generation of uniform random variables. Section 2.3 discusses general methods for generating one-dimensional random variables. Section 2.4 presents specific algorithms for generating variables from commonly used continuous and discrete distributions. In Section 2.5 we discuss the generation of random vectors. Sections 2.6 and 2.7 treat the generation of Poisson processes, Markov chains, and Markov jump processes. Finally, Section 2.8 deals with the generation of random permutations.

2.2 RANDOM NUMBER GENERATION

In the early days of simulation, randomness was generated by manual techniques, such as coin flipping, dice rolling, card shuffling, and roulette spinning. Later on, physical devices, such as noise diodes and Geiger counters, were attached to computers for the same purpose. The prevailing belief held that only mechanical or electronic devices could produce truly random sequences. Although mechanical devices are still widely used in gambling and lotteries, these methods were abandoned by the computer-simulation community for several reasons: (a) mechanical methods were too slow for general use, (b) the generated sequences cannot be reproduced, and (c) it has been found that the generated numbers exhibit both bias and dependence. Although certain modern physical generation methods [...]

[...] of the recursive formula

    X_{i+1} = a X_i + c \pmod{m},    (2.1)

where the initial value X_0 is called the seed and a, c, and m (all positive integers) are called the multiplier, the increment, and the modulus, respectively. Note that applying the modulo-m operator in (2.1) means that a X_i + c is divided by m, and the remainder is taken as the value for X_{i+1}. Thus, each X_i can only assume a value from the set {0, 1, ..., m − 1}, and the quantities U_i = X_i / m, called pseudorandom numbers, constitute approximations to a true sequence of uniform random variables. Note that the sequence X_0, X_1, X_2, ... will repeat itself after at most m steps and will therefore be periodic, with period not exceeding m. For example, let a = c = X_0 = 3 and m = 5. Then the sequence obtained from the recursive formula X_{i+1} = 3 X_i + 3 (mod 5) is 3, 2, 4, 0, 3, 2, 4, 0, ..., which has period 4.
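As a quick illustration of (2.1), not taken from the text, the following Matlab lines implement the toy generator with a = c = X_0 = 3 and m = 5; the number of outputs, n = 10, is an arbitrary choice.

    % Minimal linear congruential generator following (2.1). The tiny modulus
    % m = 5 is only for illustration; practical generators use far larger constants.
    a = 3; c = 3; m = 5;        % multiplier, increment, modulus
    X = 3;                      % seed X_0
    n = 10;                     % how many pseudorandom numbers to produce
    U = zeros(1, n);
    for i = 1:n
        X = mod(a*X + c, m);    % X_{i+1} = a X_i + c (mod m)
        U(i) = X / m;           % pseudorandom number U_i = X_i / m
    end
    disp(U)                     % underlying X-sequence: 2, 4, 0, 3, 2, 4, 0, 3, ... (period 4)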
Example 2.1 (Generating Uniform Random Variables in Matlab) This example illustrates the use of the rand function in Matlab to generate samples from the U(0,1) distribution. For clarity we have omitted the "ans = " output in the Matlab session below.

    >> rand                   % generate a uniform random number
       0.0196
    >> rand                   % generate another uniform random number
       0.823
    >> rand(1,4)
       0.5252   0.2026   ...
    >> rand('state',1234)
    >> rand
       0.6104
    >> rand('state',1234)
    >> rand
       0.6104

[...] 2. Even in the case where F^{-1} exists in an explicit form, the inverse-transform method may not necessarily be the most efficient random variable generation method (see [2]).

2.3.2 Alias Method

An alternative to the inverse-transform method for generating discrete random variables, which does not require time-consuming search techniques as per Step 2 of Algorithm 2.3.2, is the so-called alias method [...]

[...] initial setup and extra storage for the n − 1 pdfs q^(k). A procedure for computing these two-point pdfs can be found in [2]. Once the representation (2.10) has been established, generation from f is simple and can be written as follows:

Algorithm 2.3.3 (Alias Method)
1. Generate U ∼ U(0,1) and set K = 1 + ⌊(n − 1)U⌋.
2. Generate X from the two-point pdf q^(K).

2.3.3 Composition Method

This method assumes [...]

[...]

Algorithm 2.3.4 (Composition Method)
1. Generate the random variable Y according to P(Y = i) = p_i, i = 1, ..., m.
2. Given Y = i, generate X from the cdf G_i.

2.3.4 Acceptance-Rejection Method

The inverse-transform and composition methods are direct methods in the sense that they deal directly with the cdf of the random variable to be generated. The acceptance-rejection method is an indirect method due to Stan Ulam and John von Neumann [...]

[...] 3. If Y < f(X), return Z = X. Otherwise, return to Step 1.

The theoretical basis of the acceptance-rejection method is provided by the following theorem.

Theorem 2.3.1 The random variable generated according to Algorithm 2.3.5 has the desired pdf f(x).

Proof: Define the following two subsets:

    A = \{(x, y) : 0 \leq y \leq C g(x)\} \quad \text{and} \quad B = \{(x, y) : 0 \leq y \leq f(x)\},    (2.12)

which represent the areas below the curves C g(x) and f(x), respectively. Note first that Steps 1 and 2 of Algorithm 2.3.5 imply that the random vector (X, Y) [...]
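To make the acceptance-rejection steps concrete, here is a small Matlab sketch for an illustrative target that is not taken from the text: the Beta(2,2) density f(x) = 6x(1 − x) on [0,1], with uniform proposal g(x) = 1 and envelope constant C = 1.5 (the maximum of f), so that f(x) ≤ C g(x) for all x.

    % Hedged sketch of acceptance-rejection for the illustrative target
    % f(x) = 6x(1-x) on [0,1]; the proposal g is U(0,1) and C = 1.5.
    f = @(x) 6*x.*(1-x);
    C = 1.5;
    accepted = false;
    while ~accepted
        X = rand;                 % Step 1: draw X from the proposal pdf g
        Y = C * rand;             % Step 2: draw Y ~ U(0, C*g(X)); here g(X) = 1
        accepted = (Y < f(X));    % Step 3: accept X when the point (X,Y) falls below f
    end
    Z = X;                        % Z has pdf f, as guaranteed by Theorem 2.3.1
    disp(Z)

On average the loop runs C = 1.5 times per accepted sample; a tighter envelope constant means fewer rejections, which is the usual trade-off of the method.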
[...] generation method for N(0,1) is based on the acceptance-rejection method. First, note that in order to generate a random variable Y from N(0,1), one can first generate a positive random variable X from the pdf (2.20) and then assign to X a random sign. The validity of this procedure follows from the symmetry of the standard normal distribution about zero.

[...] That is,

    X = \max\Bigl\{ n : \sum_{j=1}^{n} Y_j \leq 1 \Bigr\},    (2.32)

where the {Y_j} are independent and Exp(λ) distributed. Since Y_j = −(1/λ) ln U_j, with U_j ∼ U(0,1), we can rewrite (2.32) as

    X = \max\Bigl\{ n : \prod_{j=1}^{n} U_j \geq e^{-\lambda} \Bigr\}.    (2.33)

This leads to the following algorithm.

Algorithm 2.4.10 (Generating a Poisson Random Variable)
1. Set n = 1 and a = 1.
2. Generate U_n ∼ U(0,1) and set a = a U_n.
3. If a ≥ e^{−λ}, then set n = n + 1 and go to Step 2.
4. Otherwise, return X = n − 1 as a random variable from Poi(λ).
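The following Matlab lines sketch Algorithm 2.4.10; the rate λ = 4 is an arbitrary illustrative choice, not a value from the text.

    % Sketch of Algorithm 2.4.10: multiply independent U(0,1) variables until
    % the running product drops below exp(-lambda); return the count minus one.
    lambda = 4;                   % illustrative rate parameter
    n = 1;
    a = rand;                     % Steps 1-2: a = U_1
    while a >= exp(-lambda)       % Step 3: continue while a >= e^{-lambda}
        n = n + 1;
        a = a * rand;             % multiply in the next uniform U_n
    end
    X = n - 1;                    % Step 4: X ~ Poi(lambda)
    disp(X)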
