SEQUENTIAL MONTE CARLO METHODS FOR PROBLEMS ON FINITE STATE SPACES


SEQUENTIAL MONTE CARLO METHODS FOR PROBLEMS ON FINITE STATE-SPACES

WANG JUNSHAN
(Bachelor of Science, Wuhan University, China)

A THESIS SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
DEPARTMENT OF STATISTICS AND APPLIED PROBABILITY
NATIONAL UNIVERSITY OF SINGAPORE
2015

Declaration

I hereby declare that this thesis is my original work and it has been written by me in its entirety. I have duly acknowledged all the sources of information which have been used in the thesis. This thesis has also not been submitted for any degree in any university previously.

WANG JUNSHAN
July 6, 2015

Thesis Supervisors

Ajay Jasra, Associate Professor, Department of Statistics and Applied Probability, National University of Singapore, Singapore 117546.
David Nott, Associate Professor, Department of Statistics and Applied Probability, National University of Singapore, Singapore 117546.

This thesis is dedicated to my beloved daughter Eva.

Acknowledgements

I would like to express my deepest gratitude to Professor Ajay Jasra and Professor David Nott. I feel very fortunate to have them as my supervisors, and I thank them for their generous donation of time, enlightening ideas and valuable advice. In particular, I would like to thank Professor Ajay Jasra for his constant patience, guidance, encouragement and support. I wish to give my sincere thanks to Professor Chan Hock Peng, Professor Pierre Del Moral and Dr Alexandre Thiery for their constructive suggestions and critical ideas about my thesis. I would like to thank Professor Chan Hock Peng again, and Professor Sanjay Chaudhuri, for their advice and help with my oral QE. I am also thankful to NUS for providing such a wonderful academic and social platform, and for the research scholarship. Additionally, I would like to thank the people in DSAP for their efforts and support of the graduate programme. I am also grateful to my friends for their company, advice and encouragement. Last but not least, I would like to show my greatest appreciation to my family members. Their endless love and support are the strength of my life.

Contents

Summary
List of Figures
List of Tables
List of Publications

Chapter 1  Introduction
  1.1 The Sequential Monte Carlo Method
  1.2 Problems of Interest
  1.3 Contributions of the thesis
  1.4 Outline of the thesis

Chapter 2  Literature Review
  2.1 Sequential Monte Carlo Methods
    2.1.1 Notations and Objectives
    2.1.2 Standard Monte Carlo
    2.1.3 Importance Sampling
    2.1.4 Sequential Importance Sampling
    2.1.5 Resampling Techniques
    2.1.6 Sequential Monte Carlo
    2.1.7 Discrete Particle Filter
  2.2 Markov Chain Monte Carlo Methods
  2.3 Simulated Annealing
  2.4 Combinations of SMC and MCMC
    2.4.1 SMC Samplers
    2.4.2 Particle MCMC
  2.5 Network Models
  2.6 The Permanent
  2.7 The Alpha-permanent

Chapter 3  Network Models
  3.1 Introduction
  3.2 Likelihood Computation
  3.3 Likelihood Estimation
    3.3.1 Importance Sampling
    3.3.2 Sequential Monte Carlo
    3.3.3 Discrete Particle Filter
  3.4 Simulation Results: Likelihood Estimation
    3.4.1 IS
    3.4.2 SMC
    3.4.3 DPF
    3.4.4 Relative Variance
    3.4.5 CPU Time
  3.5 Parameter Estimation
    3.5.1 Particle Markov Chain Monte Carlo
  3.6 Simulation Results: Parameter Estimation
    3.6.1 Process of drawing samples
    3.6.2 Analysis of samples
  3.7 Large Data Analysis: Likelihood and Parameter Estimation
    3.7.1 Likelihood Approximation
    3.7.2 Parameter Estimation
  3.8 Summary

Chapter 4  Permanent
  4.1 Introduction
  4.2 Computational Methods
    4.2.1 Basic Procedure
    4.2.2 Simulated Annealing Algorithm
    4.2.3 New Adaptive SMC Algorithm
    4.2.4 Convergence Analysis
    4.2.5 Complexity Analysis
  4.3 Numerical Results
    4.3.1 Toy Example
    4.3.2 A Larger Matrix
  4.4 Summary

Chapter 5  α-Permanent
  5.1 Introduction
  5.2 Computational Methods
    5.2.1 SMC
    5.2.2 DPF
  5.3 Numerical Results
    5.3.1 SMC
    5.3.2 DPF
  5.4 Bayesian Estimation
    5.4.1 Marginal Density of the Boson Process
    5.4.2 Pseudo Marginal MCMC
    5.4.3 Numerical Results
  5.5 Summary

Chapter 6  Summary and Future Work
  6.1 Summary
  6.2 Future Work

References

Appendices

Appendix A  Relative Variance and Rejection Sampling Method
  A.1 Relative Variance Result in Section 3.3.2
  A.2 Rejection Sampling Method Used in Section 3.6

Appendix B  Theorem Proofs and Technical Results
  B.1 Proof of Theorem 4.2.1 in Section 4.2.4
  B.2 Technical Results Prepared for Theorem 4.2.2
  B.3 Proof of Theorem 4.2.2 in Section 4.2.5

Appendix C  Matrices
  C.1 Matrices in Section 5.3.1
    C.1.1 A1-A4
    C.1.2 K100 and K100^Tr

Appendix A  Relative Variance and Rejection Sampling Method

So, now consider, for any x ∈ V_{t-k:t},

\[
M_{k+1}(\varphi)(x) = \sum_{u \in V_{t-k-1:t}} M_{k+1}(x,u)\,\varphi(u)
= \sum_{u \in V_{t-k-1:t}} M_{k+1}(x,u)\,\mathbb{I}_{\{x\}\times W_{k+1}(x)}(u)\,\varphi(u)
\le \xi_{k+1}(t-t_0)\, M_{k+1}(y,v)\,\mathbb{I}_{\{y\}\times W_{k+1}(y)}(v) \sum_{u \in V_{t-k-1:t}} \mathbb{I}_{\{x\}\times W_{k+1}(x)}(u)\,\varphi(u).
\]

Then, on multiplying both sides of the inequality by ϕ(v) and summing w.r.t. v, we have

\[
M_{k+1}(\varphi)(x) \sum_{v \in V_{t-k-1:t}} \varphi(v)
\le \xi_{k+1}(t-t_0) \Big( \sum_{v \in V_{t-k-1:t}} M_{k+1}(y,v)\,\mathbb{I}_{\{y\}\times W_{k+1}(y)}(v)\,\varphi(v) \Big) \Big( \sum_{u \in V_{t-k-1:t}} \mathbb{I}_{\{x\}\times W_{k+1}(x)}(u)\,\varphi(u) \Big)
\le \xi_{k+1}(t-t_0)\, M_{k+1}(\varphi)(y) \sum_{u \in V_{t-k-1:t}} \varphi(u).
\]

Hence we have shown that, for any (x, y) ∈ V_{t-k:t} × V_{t-k:t},

\[
\frac{M_{k+1}(\varphi)(x)}{M_{k+1}(\varphi)(y)} \le \xi_{k+1}(t-t_0).
\]

On noting (A.1.2), (A.1.1) and the above arguments, the proof is completed.

A.2 Rejection Sampling Method Used in Section 3.6

In order to conduct rejection sampling from the posterior density q(p|G_t) ∝ L_p(G_t), we need to adopt a proposal p.d.f. g(p) and a constant K such that g(p)/L_p(G_t) ≥ K for all p ∈ (0, 1). After some experiments, we use g(p) = ϕ(p; 0.4, 0.2) and K = 2.723 × 10^6, where ϕ(p; 0.4, 0.2) is the normal p.d.f. with mean 0.4 and variance 0.2. The rejection sampling algorithm is given below.

Algorithm A.1  Rejection sampling algorithm
(1) Sample p from N(0.4, 0.2) until p ∈ (0, 1).
(2) Sample u from U(0, 1).
(3) Reject p if u > L_p(G_t) × K / g(p) and return to step (1); otherwise, keep p as an element of the target random sample and return to step (1).
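A minimal Python sketch of Algorithm A.1, assuming a user-supplied function likelihood(p) that returns L_p(G_t) (in the thesis this quantity is computed with the methods of Chapter 3); the function and variable names below are our own, not the thesis's code:

import numpy as np

def normal_pdf(p, mean=0.4, var=0.2):
    # Proposal density g(p) = phi(p; 0.4, 0.2), i.e. mean 0.4 and variance 0.2.
    return np.exp(-(p - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def rejection_sample(likelihood, n_samples, K=2.723e6, rng=None):
    # Algorithm A.1: draw n_samples values of p from q(p | G_t) proportional to L_p(G_t).
    rng = np.random.default_rng() if rng is None else rng
    samples = []
    while len(samples) < n_samples:
        # Step (1): propose from N(0.4, 0.2), retrying until p lands in (0, 1).
        p = rng.normal(0.4, np.sqrt(0.2))
        if not 0.0 < p < 1.0:
            continue
        # Step (2): draw u ~ U(0, 1).
        u = rng.uniform()
        # Step (3): accept with probability K * L_p(G_t) / g(p), which is <= 1
        # by the choice of K; otherwise return to step (1).
        if u <= K * likelihood(p) / normal_pdf(p):
            samples.append(p)
    return np.array(samples)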
Appendix B  Theorem Proofs and Technical Results

B.1 Proof of Theorem 4.2.1 in Section 4.2.4

We will use the Feynman-Kac notation established in Section 4.2.5, and the reader should be familiar with that section to proceed. Recall, from Section 4.2.3, that for 0 ≤ p ≤ r − 1,

\[
G_{p,N}(M) = \frac{\Phi^N_{p+1}(M)}{\Phi^N_p(M)},
\]

and recall that Φ^N_0(M) is deterministic and known. In addition, for (u, v) ∈ U × V,

\[
w^N_p(u,v) = \frac{\delta + \eta^N_{p-1}\big(\mathbb{I}_{\mathsf{M}}\,\varphi_{p+1}/\varphi_p\big)}{\delta + \eta^N_{p-1}\big(\mathbb{I}_{N(u,v)}\,\varphi_{p+1}/\varphi_p\big)},
\]

where, for ϕ ∈ B_b(M) and 0 ≤ p ≤ r,

\[
\eta^N_p(\varphi) = \frac{1}{N}\sum_{i=1}^{N} \varphi(M^i_p)
\]

is the SMC approximation of η_p (recall that one will resample at every time point in this analysis). By a simple inductive argument, it follows that one can find 0 < c(n) < ∞ such that, for any 0 ≤ p ≤ r, N ≥ 1 and (u, v) ∈ U × V,

\[
c(n) \le w^N_p(u,v) \le \frac{\delta+1}{\delta}.
\]

Using the above formulation, for any N ≥ 1,

\[
\sup_{M \in \mathsf{M}} |G_{p,N}(M)| \le 1 \vee \frac{\delta+1}{\delta\, c(n)},
\tag{B.1.1}
\]

which will be used later. Note that

\[
\gamma^N_r(1) = \prod_{p=0}^{r-1} \eta^N_p(G_{p,N}) = \prod_{p=0}^{r-1} \Big( \frac{1}{N}\sum_{i=1}^{N} G_{p,N}(M^i_p) \Big).
\]
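The last display is the quantity the SMC algorithm actually reports: the product over time of the average potential values of the particles. A generic Python sketch of this bookkeeping is given below; it is our own illustration rather than the thesis's adaptive algorithm, the functions sample_initial, potential and mutate are hypothetical placeholders, and multinomial resampling is performed at every step, as assumed in this analysis.

import numpy as np

def smc_normalising_constant(sample_initial, potential, mutate, r, N, rng=None):
    # Return gamma_r^N(1) = prod_{p=0}^{r-1} (1/N) sum_i G_p(M_p^i).
    rng = np.random.default_rng() if rng is None else rng
    particles = [sample_initial(rng) for _ in range(N)]          # M_0^1, ..., M_0^N
    estimate = 1.0
    for p in range(r):
        weights = np.array([potential(p, M) for M in particles]) # G_p(M_p^i)
        estimate *= weights.mean()                               # eta_p^N(G_p)
        if p < r - 1:
            probs = weights / weights.sum()
            idx = rng.choice(N, size=N, p=probs)                 # multinomial resampling
            particles = [mutate(p + 1, particles[i], rng) for i in idx]  # move via K_{p+1}
    return estimate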
With G_{p−1,N} given, let Q_{p,N}(M, M′) = G_{p−1,N}(M) K_{p,N}(M, M′) (K_{p,N} is the MCMC kernel in [48] with invariant measure proportional to Φ^N_p), and let G_{p−1}, Q_p denote the limiting versions (that is, on replacing η^N_p with η_p and so forth). Recall the definition of γ_t(1) in (4.2.9), which uses the limiting versions of G_{p−1} and K_p.

Proof of Theorem 4.2.1. We start with the following decomposition

\[
\gamma^N_r(1) - \gamma_r(1) = \Big( \prod_{p=0}^{r-1}\eta^N_p(G_{p,N}) - \prod_{p=0}^{r-1}\eta^N_p(G_p) \Big) + \Big( \prod_{p=0}^{r-1}\eta^N_p(G_p) - \prod_{p=0}^{r-1}\eta_p(G_p) \Big),
\]

where one can show that γ_r(1) = \prod_{p=0}^{r-1} η_p(G_p); see [21]. By Theorem B.1.1, the second term on the R.H.S. goes to zero. Hence we will focus on \prod_{p=0}^{r-1} η^N_p(G_{p,N}) − \prod_{p=0}^{r-1} η^N_p(G_p). We have the following collapsing sum representation

\[
\prod_{p=0}^{r-1}\eta^N_p(G_{p,N}) - \prod_{p=0}^{r-1}\eta^N_p(G_p)
= \sum_{q=0}^{r-1} \Big(\prod_{s=0}^{q-1}\eta^N_s(G_s)\Big)\big(\eta^N_q(G_{q,N}) - \eta^N_q(G_q)\big)\Big(\prod_{s=q+1}^{r-1}\eta^N_s(G_{s,N})\Big),
\]

where we are using the convention that the empty product equals 1. We can consider each summand separately. By Theorem B.1.1, \prod_{s=0}^{q-1}\eta^N_s(G_s) will converge in probability to a constant. By the proof of Theorem B.1.1 (see (B.1.5)), η^N_q(G_{q,N}) − η^N_q(G_q) converges to zero in probability, and \prod_{s=q+1}^{r-1}\eta^N_s(G_{s,N}) converges in probability to a constant; this completes the proof of the theorem.

E will be used to denote expectation w.r.t. the probability associated to the SMC algorithm.

Theorem B.1.1. For any 0 ≤ p ≤ r − 1, (ϕ_0, ..., ϕ_p) ∈ B_b(M)^{p+1} and ((u_1, v_1), ..., (u_{p+1}, v_{p+1})) ∈ (U × V)^{p+1}, we have

\[
\big(\eta^N_0(\varphi_0),\, w^N_1(u_1,v_1),\, \ldots,\, \eta^N_p(\varphi_p),\, w^N_{p+1}(u_{p+1},v_{p+1})\big)
\rightarrow_{\mathbb P}
\big(\eta_0(\varphi_0),\, w^*_1(u_1,v_1),\, \ldots,\, \eta_p(\varphi_p),\, w^*_{p+1}(u_{p+1},v_{p+1})\big).
\]

Proof. Our proof proceeds via strong induction. For p = 0, by the WLLN for i.i.d. random variables, η^N_0(ϕ_0) →_P η_0(ϕ_0). Then, by the continuous mapping theorem, it clearly follows for any fixed (u_1, v_1) that w^N_1(u_1, v_1) →_P w^*_1(u_1, v_1), and indeed that, for any M_0 ∈ M, G_{0,N}(M_0) →_P G_0(M_0), which will be used later on. Thus, the proof of the initialization follows easily.

Now assume the result for p − 1 and consider the proof at rank p. We have that

\[
\eta^N_p(\varphi_p) - \eta_p(\varphi_p) = \big(\eta^N_p(\varphi_p) - \mathbb{E}[\eta^N_p(\varphi_p)\,|\,\mathcal F_{p-1}]\big) + \big(\mathbb{E}[\eta^N_p(\varphi_p)\,|\,\mathcal F_{p-1}] - \eta_p(\varphi_p)\big),
\tag{B.1.2}
\]

where F_{p−1} is the filtration generated by the particle system up to time p − 1. We focus on the second term on the R.H.S., which can be written as:

\[
\mathbb{E}[\eta^N_p(\varphi_p)\,|\,\mathcal F_{p-1}] - \eta_p(\varphi_p)
= \frac{\eta^N_{p-1}(Q_p(\varphi_p)) - \eta_{p-1}(Q_p(\varphi_p))}{\eta_{p-1}(G_{p-1})}
+ \eta^N_{p-1}(Q_p(\varphi_p))\Big(\frac{1}{\eta^N_{p-1}(G_{p-1,N})} - \frac{1}{\eta_{p-1}(G_{p-1})}\Big)
+ \frac{\eta^N_{p-1}\big[\{Q_{p,N}-Q_p\}(\varphi_p)\big]}{\eta^N_{p-1}(G_{p-1,N})}.
\tag{B.1.3}
\]

By the induction hypothesis, as Q_p(ϕ_p) ∈ B_b(M), the first term on the R.H.S. of (B.1.3) converges in probability to zero. To proceed, we will consider the two remaining terms on the R.H.S. of (B.1.3) in turn, starting with the second.

Second term on the R.H.S. of (B.1.3). Consider

\[
\mathbb{E}\big[\big|\eta^N_{p-1}(G_{p-1,N}) - \eta_{p-1}(G_{p-1})\big|\big]
= \mathbb{E}\big[\big|\eta^N_{p-1}(G_{p-1,N}-G_{p-1}) + \eta^N_{p-1}(G_{p-1}) - \eta_{p-1}(G_{p-1})\big|\big]
\le \mathbb{E}\big[\big|\eta^N_{p-1}(G_{p-1,N}-G_{p-1})\big|\big] + \mathbb{E}\big[\big|\eta^N_{p-1}(G_{p-1}) - \eta_{p-1}(G_{p-1})\big|\big].
\]

For the second term on the R.H.S. of the inequality, by the induction hypothesis |η^N_{p−1}(G_{p−1}) − η_{p−1}(G_{p−1})| →_P 0 and, as G_{p−1} is a bounded function, E[|η^N_{p−1}(G_{p−1}) − η_{p−1}(G_{p−1})|] will converge to zero. For the first term, we have

\[
\mathbb{E}\big[\big|\eta^N_{p-1}(G_{p-1,N}-G_{p-1})\big|\big] \le \mathbb{E}\big[\big|G_{p-1,N}(M^1_{p-1}) - G_{p-1}(M^1_{p-1})\big|\big],
\]

where we have used the exchangeability of the particle system (the marginal law of any sample M^i_{p−1} is the same for each i ∈ [N]). Then, noting that the inductive hypothesis implies that, for any fixed M_{p−1} ∈ M,

\[
G_{p-1,N}(M_{p-1}) \rightarrow_{\mathbb P} G_{p-1}(M_{p-1}),
\tag{B.1.4}
\]

by essentially the above arguments (note (B.1.1)) we have that E[|η^N_{p−1}(G_{p−1,N} − G_{p−1})|] → 0. This establishes

\[
\eta^N_{p-1}(G_{p-1,N}) \rightarrow_{\mathbb P} \eta_{p-1}(G_{p-1}).
\tag{B.1.5}
\]

Thus, using the induction hypothesis, as Q_p(ϕ_p) ∈ B_b(M), η^N_{p−1}(Q_p(ϕ_p)) converges in probability to a constant. This fact, combined with the above argument and the continuous mapping theorem, shows that the second term on the R.H.S. of (B.1.3) will converge to zero in probability.

Third term on the R.H.S. of (B.1.3). We would like to show that

\[
\mathbb{E}\big[\big|\eta^N_{p-1}\big[\{Q_{p,N}-Q_p\}(\varphi_p)\big]\big|\big] \le \mathbb{E}\big[\big|Q_{p,N}(\varphi_p)(M^1_{p-1}) - Q_p(\varphi_p)(M^1_{p-1})\big|\big]
\]

goes to zero. As the term in the expectation on the R.H.S. of the inequality is bounded (note (B.1.1)), it suffices to prove that this term will converge to zero in probability. We have, for any fixed M ∈ M,

\[
Q_{p,N}(\varphi_p)(M) - Q_p(\varphi_p)(M)
= \big[G_{p-1,N}(M) - G_{p-1}(M)\big]K_{p,N}(\varphi_p)(M)
+ G_{p-1}(M)\big[K_{p,N}(\varphi_p)(M) - K_p(\varphi_p)(M)\big].
\]

As K_{p,N}(ϕ_p)(M) is bounded, it clearly follows via the induction hypothesis (note (B.1.4)) that [G_{p−1,N}(M) − G_{p−1}(M)] K_{p,N}(ϕ_p)(M) will converge to zero in probability. To deal with the second part, we consider only the 'acceptance' part of the M-H kernel; dealing with the 'rejection' part is very similar and omitted for brevity:

\[
\sum_{M' \in \mathsf M} q_p(M, M')\,\varphi_p(M')\Big( 1 \wedge \frac{\Phi^N_p(M')}{\Phi^N_p(M)} - 1 \wedge \frac{\Phi_p(M')}{\Phi_p(M)} \Big),
\tag{B.1.6}
\]

where q_p(M, M′) is the symmetric proposal probability. For any fixed M, M′, 1 ∧ Φ^N_p(M′)/Φ^N_p(M) is a continuous function of the η^N_{p−1}(·) and w^N_p (when they appear), so by the induction hypothesis it follows that, for any M, M′ ∈ M,

\[
1 \wedge \frac{\Phi^N_p(M')}{\Phi^N_p(M)} - 1 \wedge \frac{\Phi_p(M')}{\Phi_p(M)} \rightarrow_{\mathbb P} 0,
\]

and hence so does (B.1.6) (recall M is finite). By (B.1.5), η^N_{p−1}(G_{p−1,N}) converges in probability to η_{p−1}(G_{p−1}), and hence the third term on the R.H.S. of (B.1.3) will converge to zero in probability.

Now, following the proof of [5, Theorem 3.1] and the above arguments, the first term on the R.H.S. of (B.1.2) will converge to zero in probability. Thus, we have shown that η^N_p(ϕ_p) − η_p(ϕ_p) will converge to zero in probability. Then, by this latter result and the induction hypothesis, along with the continuous mapping theorem, it follows that, for (u_{p+1}, v_{p+1}) ∈ U × V arbitrary, w^N_{p+1}(u_{p+1}, v_{p+1}) →_P w^*_{p+1}(u_{p+1}, v_{p+1}), and indeed that G_{p,N}(M_p) converges in probability to G_p(M_p) for any fixed M_p ∈ M. From here one can conclude the proof with standard results in probability.

B.2 Technical Results Prepared for Theorem 4.2.2

The following technical results will allow us to give the main result associated to the complexity of the SMC algorithm in Section 4.2.5.2.

Lemma B.2.1. Assume (A1). Then for any n > 1 and 0 ≤ p ≤ r − 1,

\[
\frac{\sup_{M \in \mathsf M} |G_p(M)|}{\lambda_p} \le \frac{8(n^2+1)}{n^2}.
\]
Proof. Let 0 ≤ p ≤ r − 1 be arbitrary. We note that, by [6, Corollary 4.4.2],

\[
\sup_{M \in \mathsf M} |G_p(M)| \le \sqrt{2}.
\tag{B.2.1}
\]

Thus, we will focus upon λ_p. We start our calculations by noting:

\[
Z_p = \sum_{M \in \mathsf M} \varphi_p(M) + \sum_{(u,v) \in U \times V}\ \sum_{M \in N(u,v)} \varphi_p(M)\, w_p(u,v)
\le \sum_{M \in \mathsf M} \varphi_p(M) + \sum_{(u,v) \in U \times V}\ \sum_{M \in N(u,v)} \varphi_p(M)\, w^*_p(u,v)
= 2\,\Xi_p(\mathsf M)(n^2+1),
\tag{B.2.2}
\]

where we have applied (A1) to obtain the inequality. Now, moving on to λ_p:

\[
\lambda_p \ge \sum_{(u,v) \in U \times V}\ \sum_{M \in N(u,v)} \eta_p(M)\, G_p(M)
\ge \frac{n^2\, \Xi_{p+1}(\mathsf M)}{4\, \Xi_p(\mathsf M)(n^2+1)}
\ge \frac{n^2}{4\sqrt{2}\,(n^2+1)},
\tag{B.2.3}
\]

where we have used (B.2.2), the fact that \sum_{M \in N(u,v)} \varphi_{p+1}(M) = \Xi_{p+1}(N(u,v)), and the inequality [6, (4.14)] to obtain the final bound. Thus, noting (B.2.1) and (B.2.3), we have shown that

\[
\frac{\sup_{M \in \mathsf M} |G_p(M)|}{\lambda_p} \le \frac{8(n^2+1)}{n^2},
\]

which completes the proof.

We now write the L_s(η_p) norm, s ≥ 1, for f ∈ B_b(M), as

\[
\|f\|_{L_s(\eta_p)} := \Big( \sum_{M \in \mathsf M} |f(M)|^s\, \eta_p(M) \Big)^{1/s}.
\]

Let

\[
\tau(n) = \frac{8(n^2+1)}{n^2}, \qquad \rho(n) = \big(1 - 1/(Cn^2)\big)^2.
\]

Then, we have the following result.

Lemma B.2.2. Assume (A1). Then, if τ(n)^3 (1 − ρ(n)) < 1, we have that for any f ∈ B_b(M) and 0 ≤ p < t ≤ r:

\[
\Big\| \frac{Q_{p,t}(f)}{\prod_{q=p}^{t-1}\lambda_q} \Big\|_{L_4(\eta_p)} \le \frac{\tau(n)^{3/4}}{1 - (1-\rho(n))\,\tau(n)^3}\, \|f\|_{L_4(\eta_t)}.
\]

Proof. The proof follows by using the technical results in [76]. In particular, Lemma B.2.1 will establish Assumption B in [76], and (4.2.10) Assumption D, and hence Assumption C of [76]. Application of Corollary 5.3 (r = 2) of [76], followed by Lemma 4.8 of [76], completes the proof.

Remark B.2.1. The condition τ(n)^3 (1 − ρ(n)) < 1 is not restrictive and will hold for n moderate; both τ(n) and ρ(n) are O(1), which means that τ(n)^3 (1 − ρ(n)) < 1 for n large enough.

B.3 Proof of Theorem 4.2.2 in Section 4.2.5

Some relevant lemmas and their proofs can be found in Appendix B.2.

Proof. Lemma B.2.2, combined with [76, Lemma 4.1], shows that Assumption A of [76] holds, with c_{p,t}(p) (of that paper) equal to C̄(n); that is, for 0 ≤ p < t ≤ r and f ∈ B_b(M):

\[
\Big\| \frac{Q_{p,t}(f)}{\prod_{q=p}^{t-1}\lambda_q} \Big\|_{L_4(\eta_p)} \le \bar C(n)\, \|f\|_{L_4(\eta_t)}.
\tag{B.3.1}
\]

Then, one can apply [76, Theorem 3.2]: if N > 2ĉ_r,

\[
\mathbb{E}\Big[\Big( \frac{\gamma^N_r(1)}{\gamma_r(1)} - 1 \Big)^2\Big]
\le \frac{1}{N} \sum_{p=0}^{r} \mathrm{Var}_{\eta_p}\Big[ \frac{Q_{p,t}(1)}{\prod_{q=p}^{t-1}\lambda_q} \Big] + \frac{2\,\hat c_r\, v_r}{N^2},
\tag{B.3.2}
\]

where ĉ_r and v_r are defined in [76] and Var_{η_p}[·] is the variance w.r.t. the probability η_p. By (B.3.1) and Jensen's inequality,

\[
\sum_{p=0}^{r} \mathrm{Var}_{\eta_p}\Big[ \frac{Q_{p,t}(1)}{\prod_{q=p}^{t-1}\lambda_q} \Big]
\le \sum_{p=0}^{r} \Big\| \frac{Q_{p,t}(1)}{\prod_{q=p}^{t-1}\lambda_q} \Big\|^2_{L_2(\eta_p)}
\le \sum_{p=0}^{r} \Big\| \frac{Q_{p,t}(1)}{\prod_{q=p}^{t-1}\lambda_q} \Big\|^2_{L_4(\eta_p)}
\le (r+1)\,\bar C(n)^2.
\]

From the definitions in [76], one can easily conclude that:

\[
\hat c_r \le \bar C(n)(r+1)\big(3 + \bar C(n)^2\big), \qquad v_r \le (r+1)\,\bar C(n)^2.
\]

Combining the above arguments with (B.3.2) gives that, for N > 2\bar C(n)(r+1)(3 + \bar C(n)^2),

\[
\mathbb{E}\Big[\Big( \frac{\gamma^N_r(1)}{\gamma_r(1)} - 1 \Big)^2\Big]
\le \frac{(r+1)\,\bar C(n)^2}{N}\Big( 1 + \frac{2(r+1)\,\bar C(n)\big(3+\bar C(n)^2\big)}{N} \Big),
\]

which concludes the proof.
Appendix C  Matrices

C.1 Matrices in Section 5.3.1

C.1.1 A1-A4

[A1, A2, A3 and A4: matrices with integer entries between 0 and 10, tabulated in the thesis.]

C.1.2 K100 and K100^Tr

From [55], the (i, j) entry of the matrix K100 is exp{-(x_i - x_j)^2}, where x_1, ..., x_100 are independently drawn from the symmetric triangular distribution on (-π, π). Then K100^Tr is the tri-diagonal matrix truncated from K100 by setting the (i, j) entry of K100 equal to 0 whenever |i - j| > 1. The sampled vector (x_1, ..., x_100) is:

-0.705216786 -0.217187115 1.827290178 -0.702594381 -0.837418349
0.393279146 1.092698524 0.144798419 0.506916062 0.856018348
0.257751279 0.65856651 0.00453435 -0.651549109 -1.193808998
0.0479954 0.112651813 -0.132829941 -1.853965333 2.099361954
-1.908164154 0.753179292 -2.736225028 -0.479254808 -1.03381552
-0.699208745 0.112115809 1.313292939 -2.132601918 2.565699698
2.862055283 1.040449216 -0.152870603 -0.254681533 -0.047392154
-0.586870111 -0.795718454 -0.673418741 -0.51447706 -0.553316009
-1.518833671 0.450001581 -0.374937715 -0.198835311 1.959340097
1.427657406 -0.060101521 -0.472128036 1.179699592 -0.040150594
-0.869233882 -0.591647799 -1.93871621 -0.792807516 -1.740403662
0.095827511 1.654694339 0.139973164 0.228163305 0.49445282
-1.571456102 -0.03535016 -0.029616141 -0.01056462 0.362974998
1.958630717 -0.891335196 -1.994184623 1.103475032 1.118054516
-0.016880535 -1.020528486 -1.788254037 0.444353356 -0.401033934
-1.041119287 1.404575993 -1.55600796 -1.247078703 1.001883688
-1.131509813 -0.177842736 2.959316629 2.170773289 -1.167230557
-1.834502626 -0.655126713 -1.726791019 -0.404610054 -0.780398832
-0.03544518 -0.596809422 -1.921103123 -2.202435006 0.432258869
-0.114145157 -0.130785416 1.617686365 -0.430622067 -1.017272356
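A short Python sketch of this construction (the function names are ours; to reproduce the thesis's matrices exactly, x should be set to the vector listed above rather than re-drawn):

import numpy as np

def build_k100(x):
    # (i, j) entry is exp{-(x_i - x_j)^2}.
    x = np.asarray(x)
    diff = x[:, None] - x[None, :]
    return np.exp(-diff ** 2)

def tridiagonal_truncation(K):
    # K100^Tr: set every entry with |i - j| > 1 to zero.
    n = K.shape[0]
    i, j = np.indices((n, n))
    return np.where(np.abs(i - j) <= 1, K, 0.0)

# The thesis fixes x to the 100 values listed above; here we draw a fresh
# vector from the same symmetric triangular distribution on (-pi, pi).
rng = np.random.default_rng(0)
x = rng.triangular(-np.pi, 0.0, np.pi, size=100)
K100 = build_k100(x)
K100_tr = tridiagonal_truncation(K100)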
[...]

... of standard Monte Carlo methods and some other classic Monte Carlo methods. At the end, we will present a special type of SMC, the discrete particle filter (DPF) method.

[...]

Monte Carlo methods have been the most popular numerical techniques for approximating the above target densities π_n(x_n) over the past few decades, and more advanced Monte Carlo methods, for example sequential Monte Carlo (SMC) methods ([22, 30]), have arisen and been well studied in recent years. In this section, we will give a review of SMC methodology, beginning with the introduction of the standard Monte Carlo method and the importance sampling method in the next two Subsections 2.1.2-2.1.3. Then, after presenting the sequential importance sampling method and the resampling techniques in Subsections 2.1.4-2.1.5, the sequential Monte Carlo method is naturally illustrated in Subsection 2.1.6. Finally, the discrete particle filtering method is discussed in Subsection 2.1.7 as an extension of SMC.

2.1.2 Standard Monte Carlo

The basic idea of the standard Monte Carlo method is: for some fixed n, if we are able to sample N independent random variables X_n^{(i)} ~ π_n(x_n) for i ∈ {1, 2, ..., N}, then the Monte Carlo method ...
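A small self-contained Python illustration of the standard Monte Carlo estimator just introduced, i.e. approximating E_{π_n}[φ(X_n)] by the average of φ over i.i.d. draws (a toy example of our own, not taken from the thesis):

import numpy as np

def monte_carlo_expectation(sampler, phi, N, rng):
    # Standard Monte Carlo: approximate E_pi[phi(X)] by the empirical average
    # (1/N) * sum_i phi(X^(i)) over N i.i.d. draws X^(1), ..., X^(N) from pi.
    draws = [sampler(rng) for _ in range(N)]
    return float(np.mean([phi(x) for x in draws]))

# Toy finite-state example: pi puts mass (0.2, 0.5, 0.3) on the states {0, 1, 2}.
rng = np.random.default_rng(1)
pi = np.array([0.2, 0.5, 0.3])
sampler = lambda r: r.choice(3, p=pi)
phi = lambda x: float(x == 2)  # its expectation is pi(2) = 0.3
print(monte_carlo_expectation(sampler, phi, N=10_000, rng=rng))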
[...]

... recent years, sequential Monte Carlo (SMC) methods are amongst the most widely used computational techniques in statistics, engineering, physics, finance and many other disciplines. In this thesis, we make efforts on the development and applications of SMC methods for problems on finite state-spaces. Firstly, we provide an exposition of exact computational methods to perform parameter inference for partially ...

[...]

... contributions to the development and applications of the sequential Monte Carlo (SMC) methods ([21, 30, 22]). They have been found to out-perform Markov chain Monte Carlo (MCMC) in some situations. The thesis will study the SMC method through solving some problems on finite state-spaces, including the approximation of the likelihood of network models (see Chapter 3) and the calculation of permanents for binary ...

[...]

... assumptions, the SMC method will have a relative variance which can grow only polynomially. In order to perform parameter estimation, we develop particle Markov chain Monte Carlo (PMCMC) algorithms to perform Bayesian inference. Such algorithms use the afore-mentioned SMC algorithms within the transition dynamics. The approaches are illustrated numerically.

[...]

... non-negative entries; see Chapter 5. These three problems are of importance in a variety of practical applications, which will be illustrated later on. Here we begin with a short introduction to the SMC method; then we will briefly describe our problems of interest and their possible solutions in Section 1.2, as well as our contributions to these problems in Section 1.3. The last section will give an outline of ...

[...]

Figure 3.6.1: Figures for convergence diagnostics: the LHS are PSRF plots; the RHS are variance estimation plots. For marginal MCMC samples, plots (a) and (b) suggest that convergence is obtained around iteration 1200 for each Markov chain; for the SMC version of PMCMC samples, plots (c) and (d) suggest that convergence is obtained around iteration 800 for each Markov chain; for the DPF version of PMCMC samples, ...

Figure 3.7.5: Figures for convergence diagnostics, for the SMC version of PMCMC samples: the LHS are PSRF plots; the RHS are variance estimation plots. Plots (a) and (b) suggest that convergence is obtained around iteration 100 for each Markov chain; for the combination of SMC and DPF version of PMCMC samples, plots (c) and (d) suggest that convergence is obtained around iteration 40 for each Markov chain; ...

[...]

... explanations about DA models and the likelihood function of network models. This is followed by detailed discussions of the computational methods, IS, SMC and DPF, for likelihood estimation, and of PMCMC for Bayesian inference. We also consider numerical illustrations based on both designed and large data. A short summary is provided at the end of this chapter.
• Chapter 4 is about the calculation ...
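The summary excerpt above mentions particle MCMC, in which an unbiased SMC estimate of the likelihood is used inside a Metropolis-Hastings chain. A generic pseudo-marginal Metropolis-Hastings sketch of that idea in Python is given below; it is our own illustration rather than the thesis's algorithm, the functions log_lik_hat, log_prior and proposal are hypothetical user-supplied placeholders, and a symmetric proposal is assumed.

import numpy as np

def pseudo_marginal_mh(log_lik_hat, log_prior, proposal, theta0, n_iters, rng):
    # log_lik_hat(theta, rng) returns the log of an unbiased (e.g. SMC) estimate
    # of the likelihood at theta; proposal(theta, rng) is a symmetric random walk.
    theta = theta0
    log_l = log_lik_hat(theta, rng)
    chain = []
    for _ in range(n_iters):
        theta_prop = proposal(theta, rng)
        log_l_prop = log_lik_hat(theta_prop, rng)      # fresh estimate at the proposal
        log_alpha = (log_l_prop + log_prior(theta_prop)) - (log_l + log_prior(theta))
        if np.log(rng.uniform()) < log_alpha:
            theta, log_l = theta_prop, log_l_prop      # accept and keep the estimate
        chain.append(theta)
    return np.array(chain)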