Econometric Theory and Methods, Russell Davidson and James G. MacKinnon - Chapter 7

Chapter 7
Generalized Least Squares and Related Topics

7.1 Introduction

If the parameters of a regression model are to be estimated efficiently by least squares, the error terms must be uncorrelated and have the same variance. These assumptions are needed to prove the Gauss-Markov Theorem and to show that the nonlinear least squares estimator is asymptotically efficient; see Sections 3.5 and 6.3. Moreover, the usual estimators of the covariance matrices of the OLS and NLS estimators are not valid when these assumptions do not hold, although alternative "sandwich" covariance matrix estimators that are asymptotically valid may be available (see Sections 5.5, 6.5, and 6.8). Thus it is clear that we need new estimation methods to handle regression models with error terms that are heteroskedastic, serially correlated, or both. We develop some of these methods in this chapter.

Since heteroskedasticity and serial correlation affect both linear and nonlinear regression models in the same way, there is no harm in limiting our attention to the simpler, linear case. We will be concerned with the model

    y = Xβ + u,    E(uu′) = Ω,    (7.01)

where Ω, the covariance matrix of the error terms, is a positive definite n × n matrix. If Ω is equal to σ²I, then (7.01) is just the linear regression model (3.03), with error terms that are uncorrelated and homoskedastic. If Ω is diagonal with nonconstant diagonal elements, then the error terms are still uncorrelated, but they are heteroskedastic. If Ω is not diagonal, then u_i and u_j are correlated whenever Ω_ij, the ij-th element of Ω, is nonzero. In econometrics, covariance matrices that are not diagonal are most commonly encountered with time-series data, and the correlations are usually highest for observations that are close in time.

In the next section, we obtain an efficient estimator for the vector β in the model (7.01) by transforming the regression so that it satisfies the conditions of the Gauss-Markov theorem. This efficient estimator is called the generalized least squares, or GLS, estimator. Although it is easy to write down the GLS estimator, it is not always easy to compute it. In Section 7.3, we therefore discuss ways of computing GLS estimates, including the particularly simple case of weighted least squares. In the following section, we relax the often implausible assumption that the matrix Ω is completely known. Section 7.5 discusses some aspects of heteroskedasticity. Sections 7.6 through 7.9 deal with various aspects of serial correlation, including autoregressive and moving average processes, testing for serial correlation, GLS and NLS estimation of models with serially correlated errors, and specification tests for models with serially correlated errors. Finally, Section 7.10 discusses error-components models for panel data.

7.2 The GLS Estimator

In order to obtain an efficient estimator of the parameter vector β of the linear regression model (7.01), we transform the model so that the transformed model satisfies the conditions of the Gauss-Markov theorem. Estimating the transformed model by OLS therefore yields efficient estimates. The transformation is expressed in terms of an n × n matrix Ψ, which is usually triangular, that satisfies the equation

    Ω⁻¹ = ΨΨ′.    (7.02)

As we discussed in Section 3.4, such a matrix can always be found, often by using Crout's algorithm.
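The text gives no code, but a small numerical sketch may help fix ideas. The snippet below (Python with NumPy, an illustration rather than anything from the text) builds one matrix Ψ satisfying (7.02) from a Cholesky factorization; the matrix Ω used here is an arbitrary positive definite example.

    import numpy as np

    rng = np.random.default_rng(0)

    # A small, illustrative positive definite covariance matrix Omega.
    n = 5
    A = rng.standard_normal((n, n))
    Omega = A @ A.T + n * np.eye(n)

    # If Omega = C C' with C lower triangular (a Cholesky factorization), then
    # Psi = (C')^{-1} is upper triangular and satisfies Omega^{-1} = Psi Psi'.
    C = np.linalg.cholesky(Omega)
    Psi = np.linalg.inv(C).T

    # Check the defining equation (7.02).
    assert np.allclose(Psi @ Psi.T, np.linalg.inv(Omega))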
Premultiplying (7.01) by Ψ′ gives

    Ψ′y = Ψ′Xβ + Ψ′u.    (7.03)

Because the covariance matrix Ω is nonsingular, the matrix Ψ must be as well, and so the transformed regression model (7.03) is perfectly equivalent to the original model (7.01). The OLS estimator of β from regression (7.03) is

    β̂_GLS = (X′ΨΨ′X)⁻¹X′ΨΨ′y = (X′Ω⁻¹X)⁻¹X′Ω⁻¹y.    (7.04)

This estimator is called the generalized least squares, or GLS, estimator of β. It is not difficult to show that the covariance matrix of the transformed error vector Ψ′u is simply the identity matrix:

    E(Ψ′uu′Ψ) = Ψ′E(uu′)Ψ = Ψ′ΩΨ
              = Ψ′(ΨΨ′)⁻¹Ψ = Ψ′(Ψ′)⁻¹Ψ⁻¹Ψ = I.

The second equality in the second line here uses a result about the inverse of a product of square matrices that was proved in Exercise 1.15. Since β̂_GLS is just the OLS estimator from (7.03), its covariance matrix can be found directly from the standard formula for the OLS covariance matrix, expression (3.28), if we replace X by Ψ′X and σ_0² by 1:

    Var(β̂_GLS) = (X′ΨΨ′X)⁻¹ = (X′Ω⁻¹X)⁻¹.    (7.05)

In order for (7.05) to be valid, the conditions of the Gauss-Markov theorem must be satisfied. Here, this means that Ω must be the covariance matrix of u conditional on the explanatory variables X. It is thus permissible for Ω to depend on X, or indeed on any other exogenous variables.

The generalized least squares estimator β̂_GLS can also be obtained by minimizing the GLS criterion function

    (y − Xβ)′Ω⁻¹(y − Xβ),    (7.06)

which is just the sum of squared residuals from the transformed regression (7.03). This criterion function can be thought of as a generalization of the SSR function in which the squares and cross products of the residuals from the original regression (7.01) are weighted by the inverse of the matrix Ω. The effect of such a weighting scheme is clearest when Ω is a diagonal matrix: in that case, each observation is simply given a weight proportional to the inverse of the variance of its error term.
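As an illustration of the equivalence between (7.03) and (7.04), the following sketch (NumPy again, with made-up data and a made-up Ω) computes β̂_GLS both from the explicit formula and by OLS on the transformed variables, and forms the covariance matrix (7.05).

    import numpy as np

    rng = np.random.default_rng(1)
    n, k = 50, 3

    X = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])
    beta_true = np.array([1.0, 0.5, -2.0])

    # Correlated, heteroskedastic errors with a known Omega (illustrative only).
    A = rng.standard_normal((n, n))
    Omega = A @ A.T + n * np.eye(n)
    u = np.linalg.cholesky(Omega) @ rng.standard_normal(n)
    y = X @ beta_true + u

    Omega_inv = np.linalg.inv(Omega)

    # GLS by the explicit formula (7.04).
    beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)

    # The same estimate from OLS on the transformed model (7.03).
    Psi = np.linalg.inv(np.linalg.cholesky(Omega)).T     # so Omega^{-1} = Psi Psi'
    beta_ols_transformed, *_ = np.linalg.lstsq(Psi.T @ X, Psi.T @ y, rcond=None)
    assert np.allclose(beta_gls, beta_ols_transformed)

    # Covariance matrix (7.05).
    var_gls = np.linalg.inv(X.T @ Omega_inv @ X)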
Efficiency of the GLS Estimator

The GLS estimator β̂_GLS defined in (7.04) is also the solution of the set of moment conditions

    X′Ω⁻¹(y − Xβ̂_GLS) = 0.    (7.07)

These moment conditions are equivalent to the first-order conditions for the minimization of the GLS criterion function (7.06). Since the GLS estimator is a method of moments estimator, it is interesting to compare it with other MM estimators. A general MM estimator for the linear regression model (7.01) is defined in terms of an n × k matrix of exogenous variables W, where k is the dimension of β, by the equations

    W′(y − Xβ) = 0.    (7.08)

These equations are a special case of the moment conditions (6.10) for the nonlinear regression model. Since there are k equations and k unknowns, we can solve (7.08) to obtain the MM estimator

    β̂_W ≡ (W′X)⁻¹W′y.    (7.09)

The GLS estimator (7.04) is evidently a special case of this MM estimator, with W = Ω⁻¹X.

Under certain assumptions, the MM estimator (7.09) is unbiased for the model (7.01). Suppose that the DGP is a special case of that model, with parameter vector β_0 and known covariance matrix Ω. We assume that X and W are exogenous, which implies that E(u | X, W) = 0. This rather strong assumption, which is analogous to the assumption (3.08), is necessary for the unbiasedness of β̂_W and makes it unnecessary to resort to asymptotic analysis. If we merely wanted to prove that β̂_W is consistent, we could, as in Section 6.2, get away with the much weaker assumption that E(u_t | W_t) = 0.

Substituting Xβ_0 + u for y in (7.09), we see that β̂_W = β_0 + (W′X)⁻¹W′u. Therefore, the covariance matrix of β̂_W is

    Var(β̂_W) = E((β̂_W − β_0)(β̂_W − β_0)′)
              = E((W′X)⁻¹W′uu′W(X′W)⁻¹)
              = (W′X)⁻¹W′ΩW(X′W)⁻¹.    (7.10)

As we would expect, this is a sandwich covariance matrix. When W = X, we have the OLS estimator, and Var(β̂_W) reduces to expression (5.32).

The efficiency of the GLS estimator can be verified by showing that the difference between (7.10), the covariance matrix for the MM estimator β̂_W defined in (7.09), and (7.05), the covariance matrix for the GLS estimator, is a positive semidefinite matrix. As was shown in Exercise 3.8, this difference will be positive semidefinite if and only if the difference between the inverse of (7.05) and the inverse of (7.10), that is, the matrix

    X′Ω⁻¹X − X′W(W′ΩW)⁻¹W′X,    (7.11)

is positive semidefinite. In Exercise 7.2, readers are invited to show that this is indeed the case. The GLS estimator β̂_GLS is typically more efficient than the more general MM estimator β̂_W for all elements of β, because it is only in very special cases that the matrix (7.11) will have any zero diagonal elements. Because the OLS estimator β̂ is just β̂_W when W = X, we conclude that the GLS estimator β̂_GLS will in most cases be more efficient, and will never be less efficient, than the OLS estimator β̂.

7.3 Computing GLS Estimates

At first glance, the formula (7.04) for the GLS estimator seems quite simple. To calculate β̂_GLS when Ω is known, we apparently just have to invert Ω, form the matrix X′Ω⁻¹X and invert it, then form the vector X′Ω⁻¹y, and, finally, postmultiply the inverse of X′Ω⁻¹X by X′Ω⁻¹y. However, GLS estimation is not nearly as easy as it looks. The procedure just described may work acceptably when the sample size n is small, but it rapidly becomes computationally infeasible as n becomes large. The problem is that Ω is an n × n matrix. When n = 1000, simply storing Ω and its inverse will typically require 16 MB of memory; when n = 10,000, storing both these matrices will require 1600 MB. Even if enough memory were available, computing GLS estimates in this naive way would be enormously expensive.
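The storage figures just quoted follow from simple arithmetic if we assume that each element of Ω and of Ω⁻¹ is held as an 8-byte double-precision number; the short check below reproduces them.

    # Rough storage needed to hold Omega and its inverse as dense arrays of
    # 8-byte double-precision numbers (an assumption about the storage format).
    for n in (1_000, 10_000):
        megabytes = 2 * n * n * 8 / 1e6
        print(f"n = {n}: about {megabytes:.0f} MB")   # 16 MB and 1600 MB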
Practical procedures for GLS estimation require us to know quite a lot about the structure of the covariance matrix Ω and its inverse. GLS estimation will be easy to do if the matrix Ψ, defined in (7.02), is known and has a form that allows us to calculate Ψ′x, for any vector x, without having to store Ψ itself in memory. If so, we can easily formulate the transformed model (7.03) and estimate it by OLS.

There is one important difference between (7.03) and the usual linear regression model. For the latter, the variance of the error terms is unknown, while for the former, it is known to be 1. Since we can obtain OLS estimates without knowing the variance of the error terms, this suggests that we should not need to know everything about Ω in order to obtain GLS estimates. Suppose that Ω = σ²∆, where the n × n matrix ∆ is known to the investigator, but the positive scalar σ² is unknown. Then if we replace Ω by ∆ in the definition (7.02) of Ψ, we can still run regression (7.03), but the error terms will now have variance σ² instead of variance 1. When we run this modified regression, we will obtain the estimate

    (X′∆⁻¹X)⁻¹X′∆⁻¹y = (X′Ω⁻¹X)⁻¹X′Ω⁻¹y = β̂_GLS,

where the equality follows immediately from the fact that σ²/σ² = 1. Thus the GLS estimates will be the same whether we use Ω or ∆, that is, whether or not we know σ². However, if σ² is known, we can use the true covariance matrix (7.05). Otherwise, we must fall back on the estimated covariance matrix s²(X′∆⁻¹X)⁻¹, where s² is the usual OLS estimate (3.49) of the error variance from the transformed regression.

Weighted Least Squares

It is particularly easy to obtain GLS estimates when the error terms are heteroskedastic but uncorrelated. This implies that the matrix Ω is diagonal. Let ω_t² denote the t-th diagonal element of Ω. Then Ω⁻¹ is a diagonal matrix with t-th diagonal element ω_t⁻², and Ψ can be chosen as the diagonal matrix with t-th diagonal element ω_t⁻¹. Thus we see that, for a typical observation, regression (7.03) can be written as

    ω_t⁻¹y_t = ω_t⁻¹X_tβ + ω_t⁻¹u_t.    (7.12)

This regression is to be estimated by OLS. The regressand and regressors are simply the dependent and independent variables multiplied by ω_t⁻¹, and the variance of the error term is clearly 1.

For obvious reasons, this special case of GLS estimation is often called weighted least squares, or WLS. The weight given to each observation when we run regression (7.12) is ω_t⁻¹. Observations for which the variance of the error term is large are given low weights, and observations for which it is small are given high weights. In practice, if Ω = σ²∆, with ∆ known but σ² unknown, regression (7.12) remains valid, provided we reinterpret ω_t² as the t-th diagonal element of ∆ and recognize that the variance of the error terms is now σ² instead of 1.

There are various ways of determining the weights used in weighted least squares estimation. In the simplest case, either theory or preliminary testing may suggest that E(u_t²) is proportional to z_t², where z_t is some variable that we observe. For example, z_t might be a variable like population or national income. In this case, z_t plays the role of ω_t in equation (7.12). Another possibility is that the data we actually observe were obtained by grouping data on different numbers of individual units. Suppose that the error terms for the ungrouped data have constant variance, but that observation t is the average of N_t individual observations, where N_t varies. Special cases of standard results, discussed in Section 3.4, on the variance of a sample mean imply that the variance of u_t will then be proportional to 1/N_t. Thus, in this case, N_t^{-1/2} plays the role of ω_t in equation (7.12).

Weighted least squares estimation can easily be performed using any program for OLS estimation. When one is using such a procedure, it is important to remember that all the variables in the regression, including the constant term, must be multiplied by the same weights. Thus if, for example, the original regression is

    y_t = β_1 + β_2X_t + u_t,

the weighted regression will be

    y_t/ω_t = β_1(1/ω_t) + β_2(X_t/ω_t) + u_t/ω_t.

Here the regressand is y_t/ω_t, the regressor that corresponds to the constant term is 1/ω_t, and the regressor that corresponds to X_t is X_t/ω_t.
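A minimal sketch of weighted least squares in this spirit is given below, using invented data in which the error standard deviation is proportional to an observed variable z_t; it also checks that dividing through by ω_t and running OLS reproduces the GLS formula with a diagonal Ω.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 200
    z = rng.uniform(1.0, 10.0, n)          # e.g. a size variable such as population
    X = np.column_stack([np.ones(n), rng.standard_normal(n)])
    beta_true = np.array([1.0, 2.0])

    # Heteroskedastic errors with standard deviation proportional to z (illustrative).
    omega = z
    y = X @ beta_true + omega * rng.standard_normal(n)

    # Weighted least squares: divide the regressand and ALL regressors, including
    # the constant, by omega_t, then run OLS on the transformed data, as in (7.12).
    beta_wls, *_ = np.linalg.lstsq(X / omega[:, None], y / omega, rcond=None)

    # Equivalent GLS formula with diagonal Omega = diag(omega_t^2).
    w = 1.0 / omega**2
    beta_gls = np.linalg.solve((X * w[:, None]).T @ X, (X * w[:, None]).T @ y)
    assert np.allclose(beta_wls, beta_gls)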
It is possible to report summary statistics like R², ESS, and SSR either in terms of the dependent variable y_t or in terms of the transformed regressand y_t/ω_t. However, it really only makes sense to report R² in terms of the transformed regressand. As we saw in Section 2.5, R² is valid as a measure of goodness of fit only when the residuals are orthogonal to the fitted values. This will be true for the residuals and fitted values from OLS estimation of the weighted regression (7.12), but it will not be true if those residuals and fitted values are subsequently multiplied by the ω_t in order to make them comparable with the original dependent variable.

Generalized Nonlinear Least Squares

Although, for simplicity, we have focused on the linear regression model, GLS is also applicable to nonlinear regression models. If the vector of regression functions were x(β) instead of Xβ, we could obtain generalized nonlinear least squares, or GNLS, estimates by minimizing the criterion function

    (y − x(β))′Ω⁻¹(y − x(β)),    (7.13)

which looks just like the GLS criterion function (7.06) for the linear regression model, except that x(β) replaces Xβ. If we differentiate (7.13) with respect to β and divide the result by −2, we obtain the moment conditions

    X′(β)Ω⁻¹(y − x(β)) = 0,    (7.14)

where, as in Chapter 6, X(β) is the matrix of derivatives of x(β) with respect to β. These moment conditions generalize conditions (6.27) for nonlinear least squares in the obvious way, and they are evidently equivalent to the moment conditions (7.07) for the linear case.

Finding estimates that solve equations (7.14) will require some sort of nonlinear minimization procedure; see Section 6.4. For this purpose, and several others, the GNR

    Ψ′(y − x(β)) = Ψ′X(β)b + residuals    (7.15)

will often be useful. Equation (7.15) is just the ordinary GNR introduced in equation (6.52), with the regressand and regressors premultiplied by the matrix Ψ′ implicitly defined in equation (7.02). It is the GNR associated with the nonlinear regression model

    Ψ′y = Ψ′x(β) + Ψ′u,    (7.16)

which is analogous to (7.03). The error terms of (7.16) have covariance matrix proportional to the identity matrix.

Let us denote the t-th column of the matrix Ψ by ψ_t. Then the asymptotic theory of Chapter 6 for the nonlinear regression model and the ordinary GNR applies also to the transformed regression model (7.16) and its associated GNR (7.15), provided that the transformed regression functions ψ_t′x(β) are predetermined with respect to the transformed error terms ψ_t′u:

    E(ψ_t′u | ψ_t′x(β)) = 0.    (7.17)

If Ψ is not a diagonal matrix, this condition is different from the condition that the regression functions x_t(β) should be predetermined with respect to the u_t. Later in this chapter, we will see that this fact has serious repercussions in models with serial correlation.
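The following sketch illustrates how the transformed GNR (7.15) can be used as an estimation algorithm for GNLS. The regression function, data, and covariance matrix are all invented for the illustration, and the bare Gauss-Newton loop, with no step-length control, is only a rough stand-in for the procedures of Section 6.4.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 100
    z = rng.uniform(0.5, 3.0, n)

    def xfun(b):
        # An illustrative nonlinear regression function x_t(beta); not from the text.
        return b[0] * (1.0 - np.exp(-b[1] * z))

    def Xfun(b):
        # X(beta): the n x 2 matrix of derivatives of x(beta) with respect to beta.
        return np.column_stack([1.0 - np.exp(-b[1] * z),
                                b[0] * z * np.exp(-b[1] * z)])

    beta_true = np.array([2.0, 1.0])
    A = rng.standard_normal((n, n))
    Omega = (A @ A.T + n * np.eye(n)) / (10.0 * n)     # a known, illustrative Omega
    u = np.linalg.cholesky(Omega) @ rng.standard_normal(n)
    y = xfun(beta_true) + u

    PsiT = np.linalg.inv(np.linalg.cholesky(Omega))    # Psi' such that Omega^{-1} = Psi Psi'

    # Gauss-Newton iteration based on the transformed GNR (7.15): at each step,
    # regress Psi'(y - x(beta)) on Psi'X(beta) and add the coefficients to beta.
    b = np.array([1.5, 0.8])                           # rough starting values
    for _ in range(100):
        step, *_ = np.linalg.lstsq(PsiT @ Xfun(b), PsiT @ (y - xfun(b)), rcond=None)
        b = b + step
        if np.max(np.abs(step)) < 1e-10:
            break
    # b now (approximately) solves the GNLS moment conditions (7.14).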
7.4 Feasible Generalized Least Squares

In practice, the covariance matrix Ω is often not known even up to a scalar factor. This makes it impossible to compute GLS estimates. However, in many cases it is reasonable to suppose that Ω, or ∆, depends in a known way on a vector of unknown parameters γ. If so, it may be possible to estimate γ consistently, so as to obtain Ω(γ̂), say. Then Ψ(γ̂) can be defined as in (7.02), and GLS estimates computed conditional on Ψ(γ̂). This type of procedure is called feasible generalized least squares, or feasible GLS, because it is feasible in many cases when ordinary GLS is not.

As a simple example, suppose we want to obtain feasible GLS estimates of the linear regression model

    y_t = X_tβ + u_t,    E(u_t²) = exp(Z_tγ),    (7.18)

where β and γ are, respectively, a k-vector and an l-vector of unknown parameters, and X_t and Z_t are conformably dimensioned row vectors of observations on exogenous or predetermined variables that belong to the information set on which we are conditioning. Some or all of the elements of Z_t may well belong to X_t. The function exp(Z_tγ) is an example of a skedastic function. In the same way that a regression function determines the conditional mean of a random variable, a skedastic function determines its conditional variance. The skedastic function exp(Z_tγ) has the property that it is positive for any vector γ. This is a desirable property for any skedastic function to have, since negative estimated variances would be highly inconvenient.

In order to obtain consistent estimates of γ, usually we must first obtain consistent estimates of the error terms in (7.18). The obvious way to do so is to start by computing OLS estimates β̂. This allows us to calculate a vector of OLS residuals with typical element û_t. We can then run the auxiliary linear regression

    log û_t² = Z_tγ + v_t,    (7.19)

over observations t = 1, . . . , n to find the OLS estimates γ̂. These estimates are then used to compute ω̂_t = (exp(Z_tγ̂))^{1/2} for all t. Finally, feasible GLS estimates of β are obtained by using ordinary least squares to estimate regression (7.12), with the estimates ω̂_t replacing the unknown ω_t. This is an example of feasible weighted least squares.
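A minimal sketch of this feasible weighted least squares procedure, with an invented data-generating process of the form (7.18), might look as follows; the variable names are ours, not the text's.

    import numpy as np

    rng = np.random.default_rng(4)
    n = 500
    X = np.column_stack([np.ones(n), rng.standard_normal(n)])
    Z = np.column_stack([np.ones(n), rng.uniform(0.0, 2.0, n)])
    beta_true = np.array([1.0, 1.0])
    gamma_true = np.array([-1.0, 1.5])

    # DGP with skedastic function E(u_t^2) = exp(Z_t gamma), as in (7.18).
    sigma_t = np.exp(Z @ gamma_true) ** 0.5
    y = X @ beta_true + sigma_t * rng.standard_normal(n)

    # Step 1: OLS to obtain residuals.
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    uhat = y - X @ beta_ols

    # Step 2: auxiliary regression (7.19) of log(uhat^2) on Z to estimate gamma.
    gamma_hat, *_ = np.linalg.lstsq(Z, np.log(uhat**2), rcond=None)

    # Step 3: weights and feasible weighted least squares, as in (7.12).
    omega_hat = np.exp(Z @ gamma_hat) ** 0.5
    beta_fgls, *_ = np.linalg.lstsq(X / omega_hat[:, None], y / omega_hat, rcond=None)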
Why Feasible GLS Works

Under suitable regularity conditions, it can be shown that this type of procedure yields a feasible GLS estimator β̂_F that is consistent and asymptotically equivalent to the GLS estimator β̂_GLS. We will not attempt to provide a rigorous proof of this proposition; for that, see Amemiya (1973a). However, we will try to provide an intuitive explanation of why it is true.

If we substitute Xβ_0 + u for y into expression (7.04), the formula for the GLS estimator, we find that

    β̂_GLS = β_0 + (X′Ω⁻¹X)⁻¹X′Ω⁻¹u.

Taking β_0 over to the left-hand side, multiplying each factor by an appropriate power of n, and taking probability limits, we see that, asymptotically,

    n^{1/2}(β̂_GLS − β_0) = (plim n⁻¹X′Ω⁻¹X)⁻¹ (plim n^{-1/2}X′Ω⁻¹u),    (7.20)

where the probability limits are taken as n → ∞. Under standard assumptions, the first matrix on the right-hand side is a nonstochastic k × k matrix with full rank, while the vector that postmultiplies it is a stochastic vector which follows the multivariate normal distribution. For the feasible GLS estimator, the asymptotic analog of (7.20) is

    n^{1/2}(β̂_F − β_0) = (plim n⁻¹X′Ω⁻¹(γ̂)X)⁻¹ (plim n^{-1/2}X′Ω⁻¹(γ̂)u).    (7.21)

The right-hand sides of expressions (7.21) and (7.20) look very similar, and it is clear that the latter will be asymptotically equivalent to the former if

    plim n⁻¹X′Ω⁻¹(γ̂)X = plim n⁻¹X′Ω⁻¹X    (7.22)

and

    plim n^{-1/2}X′Ω⁻¹(γ̂)u = plim n^{-1/2}X′Ω⁻¹u.    (7.23)

A rigorous statement and proof of the conditions under which equations (7.22) and (7.23) hold is beyond the scope of this book. If they are to hold, it is desirable that γ̂ should be a consistent estimator of γ, and this requires that the OLS estimator β̂ should be consistent. For example, it can be shown that the estimator obtained by running regression (7.19) would be consistent if the regressand depended on u_t rather than û_t. Since the regressand is actually û_t, it is necessary that the residuals û_t should consistently estimate the error terms u_t. This in turn requires that β̂ should be consistent for β_0. Thus, in general, we cannot expect γ̂ to be consistent if we do not start with a consistent estimator of β.

Unfortunately, as we will see later, if Ω(γ) is not diagonal, then the OLS estimator β̂ is, in general, not consistent whenever any element of X_t is a lagged dependent variable. A lagged dependent variable is predetermined with respect to error terms that are innovations, but not with respect to error terms that are serially correlated. With GLS or feasible GLS estimation, the problem does not arise, because, if the model is correctly specified, the transformed explanatory variables are predetermined with respect to the transformed error terms, as in (7.17). When the OLS estimator is inconsistent, we will have to obtain a consistent estimator of γ in some other way.

Whether or not feasible GLS is a desirable estimation method in practice depends on how good an estimate of Ω can be obtained. If Ω(γ̂) is a very good estimate, then feasible GLS will have essentially the same properties as GLS itself, and inferences based on the GLS covariance matrix (7.05), with Ω(γ̂) replacing Ω, should be reasonably reliable, even though they will not be exact in finite samples. Note that condition (7.22), in addition to being necessary for the validity of feasible GLS, guarantees that the feasible GLS covariance matrix estimator converges as n → ∞ to the true GLS covariance matrix. On the other hand, if Ω(γ̂) is a poor estimate, feasible GLS estimates may have quite different properties from real GLS estimates, and inferences may be quite misleading.

It is entirely possible to iterate a feasible GLS procedure. The estimator β̂_F can be used to compute a new set of residuals, which can then be used to obtain a second-round estimate of γ, which can be used to calculate second-round feasible GLS estimates, and so on. This procedure can either be stopped after a predetermined number of rounds or continued until convergence is achieved (if it ever is achieved). Iteration does not change the asymptotic distribution of the feasible GLS estimator, but it does change its finite-sample distribution.

Another way to estimate models in which the covariance matrix of the error terms depends on one or more unknown parameters is to use the method of maximum likelihood. This estimation method, in which β and γ are estimated jointly, will be discussed in Chapter 10. In many cases, an iterated feasible GLS estimator will be the same as a maximum likelihood estimator based on the assumption of normally distributed errors.
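For concreteness, the iteration just described can be written as a short loop. The sketch below assumes the same exponential skedastic function exp(Z_tγ) as in the feasible WLS example above, and it is only an illustration of the mechanics, not a recommended implementation.

    import numpy as np

    def iterated_fgls(y, X, Z, rounds=20, tol=1e-8):
        """Iterate the residuals -> gamma -> weights -> WLS steps until the
        coefficient estimates settle down, or until a round limit is reached."""
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)          # start from OLS
        for _ in range(rounds):
            uhat = y - X @ beta
            gamma, *_ = np.linalg.lstsq(Z, np.log(uhat**2), rcond=None)
            omega = np.exp(Z @ gamma) ** 0.5
            beta_new, *_ = np.linalg.lstsq(X / omega[:, None], y / omega, rcond=None)
            if np.max(np.abs(beta_new - beta)) < tol:
                return beta_new
            beta = beta_new
        return beta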
7.5 Heteroskedasticity

There are two situations in which the error terms are heteroskedastic but serially uncorrelated. In the first, the form of the heteroskedasticity is completely unknown, while, in the second, the skedastic function is known except for the values of some parameters that can be estimated consistently. Concerning the case of heteroskedasticity of unknown form, we saw in Sections 5.5 and 6.5 how to compute asymptotically valid covariance matrix estimates for OLS and NLS parameter estimates. The fact that these HCCMEs are sandwich covariance matrices makes it clear that, although they are consistent under standard regularity conditions, neither OLS nor NLS is efficient when the error terms are heteroskedastic. If the variances of all the error terms are known, at least up to a scalar factor, then efficient estimates can be obtained by weighted least squares, [...]

[...] n and k; see Savin and White (1977). The standard tables, which are deliberately not printed in this book, contain bounds for one-tailed DW tests of the null hypothesis that ρ ≤ 0 against the alternative that ρ > 0. An investigator will reject the null hypothesis if d < d_L, fail to reject if d > d_U, and [...]

[...] the n × n matrix L defined in (7.49), with ones on the diagonal immediately below the principal diagonal and zeros elsewhere. It is easy to see that (Lu)_t = u_{t−1} for t = 2, . . . , n, and (Lu)_1 = 0. With this definition, the numerator of (7.47) becomes n⁻¹ũ′Lũ = n⁻¹u′M_XLM_Xu, of which the expectation, by a similar argument to that used above, is

    n⁻¹E(Tr(M_XLM_Xuu′)) = n⁻¹Tr(M_XLM_XΩ).    (7.50)

When M_X is [...]

[...] testing the null hypothesis that γ = 0, regression (7.26) is equivalent to the regression

    u_t² = b_δ + Z_tb_γ + residual,    (7.27)

with a suitable redefinition of the artificial parameters b_δ and b_γ. Observe that regression (7.27) does not depend on the functional form of h(·). Standard results for tests based on the GNR imply [...] (7.51), and the t statistic from the simple regression (7.46) may be written as

    t_SR = n^{-1/2}ũ′ũ_1 / (ś(n⁻¹ũ_1′ũ_1)^{1/2}),    (7.52)

where s and ś are the square roots of the estimated error variances for (7.43) and (7.46), respectively. Of course, the factors of n in the numerators and denominators of (7.51) and (7.52) cancel out and may be ignored for any purpose except asymptotic analysis. [...]

[...] regression

    ũ_t = X_tb + b_{ρ1}ũ_{t−1} + · · · + b_{ρp}ũ_{t−p} + residual    (7.45)

and use an asymptotic F test of the null hypothesis that the coefficients on all the lagged residuals are zero; see Exercise 7.6. Of course, in order to run regression (7.45), we will either need to drop the first p observations or replace [...]

[...] denominator of (7.47) as n⁻¹u′M_Xu, where, as usual, the orthogonal projection matrix M_X projects on to S⊥(X). If the vector u is generated by a stationary AR(1) process, it can be shown that a law of large numbers can be applied to both the numerator and the denominator of (7.47). Thus, asymptotically, [...]
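Several of the excerpts above involve regressing the OLS residuals on the regressors and on their own lags, as in (7.45). A rough sketch of such a test, written as a self-contained function with our own, purely illustrative interface, is given below; it is an illustration of the idea rather than the text's own procedure.

    import numpy as np

    def lagged_residual_test(y, X, p=1):
        """Regress the OLS residuals on X and p of their own lags, dropping the
        first p observations, and return an asymptotic F statistic for the null
        that all lagged-residual coefficients are zero (in the spirit of (7.45))."""
        n, k = X.shape
        uhat = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
        lags = np.column_stack([uhat[p - 1 - j : n - 1 - j] for j in range(p)])
        yv, Xr = uhat[p:], X[p:]
        Xu = np.column_stack([Xr, lags])
        ssr_r = np.sum((yv - Xr @ np.linalg.lstsq(Xr, yv, rcond=None)[0]) ** 2)
        ssr_u = np.sum((yv - Xu @ np.linalg.lstsq(Xu, yv, rcond=None)[0]) ** 2)
        df = n - p - k - p
        return ((ssr_r - ssr_u) / p) / (ssr_u / df)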
[...] simplest of these is the first-order moving average, or MA(1), process

    u_t = ε_t + α_1ε_{t−1},    ε_t ∼ IID(0, σ_ε²),    (7.37)

(Footnote: For a complex number a + bi, a and b real, the absolute value is (a² + b²)^{1/2}.)

in which the error term u_t is a weighted average of two successive innovations, ε_t and ε_{t−1}. It is not difficult [...]

[...] has been updated as Box, Jenkins, and Reinsel (1994). Books that are specifically aimed at economists include Granger and Newbold (1986), Harvey (1989), Hamilton (1994), and Hayashi (2000). [...]

7.7 Testing for Serial Correlation

Over the decades, an enormous amount of research has been devoted to the subject of [...]

[...] Gauss-Newton regression. Let β̃ denote the vector of OLS estimates obtained from the restricted model

    y = Xβ + u,    (7.42)

and let ũ denote the vector of OLS residuals from this regression. Then, as we saw in Section 6.7, the GNR for testing the null hypothesis that ρ = 0 is

    ũ = Xb + b_ρũ_1 + residuals,    (7.43)

[...]

[...] ε_1 = (1 − ρ²)^{1/2}u_1,    (7.58)

it can be seen that the n-vector ε, with the first component ε_1 defined by (7.58) and the remaining components ε_t defined by (7.57), has a covariance matrix equal to σ_ε²I. Putting together (7.57) and (7.58), we conclude that Ψ should be defined as an n × n matrix with all diagonal elements equal to 1 except for the first, which is equal to (1 − ρ²)^{1/2}, and all other elements [...]
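The truncated passage above describes the matrix Ψ used to transform a model with stationary AR(1) errors. The sketch below constructs one standard version of that matrix and checks numerically that ΨΨ′ equals the inverse of the AR(1) correlation matrix ∆, the analog of (7.02) with ∆ in place of Ω; since the text is cut off, the exact form of Ψ is stated here as an assumption.

    import numpy as np

    def ar1_psi(n, rho):
        """Assumed construction: Psi' applied to u gives (1 - rho^2)^(1/2) u_1 as
        its first element and u_t - rho*u_{t-1} thereafter, so Psi is bidiagonal."""
        PsiT = np.eye(n)
        PsiT[0, 0] = np.sqrt(1.0 - rho**2)
        for t in range(1, n):
            PsiT[t, t - 1] = -rho
        return PsiT.T

    n, rho = 6, 0.6
    # Stationary AR(1) covariance matrix up to the factor sigma_eps^2:
    # Delta_{ts} = rho^{|t-s|} / (1 - rho^2).
    idx = np.arange(n)
    Delta = rho ** np.abs(idx[:, None] - idx[None, :]) / (1.0 - rho**2)

    Psi = ar1_psi(n, rho)
    assert np.allclose(Psi @ Psi.T, np.linalg.inv(Delta))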
