Class Notes in Statistics and Econometrics, Part 34


CHAPTER 67

Timeseries Analysis

A time series y with typical element y_s is a (finite or infinite) sequence of random variables. Usually the subscript s goes from 1 to ∞, i.e., the time series is written y_1, y_2, . . ., but it may have different (finite or infinite) starting or ending values.

67.1. Covariance Stationary Timeseries

A time series is covariance-stationary if and only if

(67.1.1)    E[y_s] = µ                  for all s
(67.1.2)    var[y_s] < ∞                for all s
(67.1.3)    cov[y_s, y_{s+k}] = γ_k     for all s and k.

I.e., the means do not depend on s, and the covariances depend only on the distance k and not on s. A covariance stationary time series is characterized by the expected value µ of each observation, the variance σ² of each observation, and the "autocorrelation function" ρ_k for k ≥ 1 or, alternatively, by µ and the "autocovariance function" γ_k for k ≥ 0. The autocovariance and autocorrelation functions are vectors containing the unique elements of the covariance and correlation matrices.

The simplest time series has all y_t ∼ IID(µ, σ²), i.e., all covariances between different elements are zero. If µ = 0 this is called "white noise."

A covariance-stationary process y_t (t = 1, . . . , n) with expected value µ = E[y_t] is said to be ergodic for the mean if

(67.1.4)    plim_{n→∞} (1/n) Σ_{t=1}^n y_t = µ.

We will usually require ergodicity along with stationarity.

Problem 548. [Ham94, pp. 46/7] Give a simple example of a stationary time series process which is not ergodic for the mean.

Answer. White noise plus a mean which is drawn once and for all from a N(0, τ²) distribution independent of the white noise.

67.1.1. Moving Average Processes. The following is based on [Gra89, pp. 63–91] and on [End95]. We just said that the simplest stationary process is a constant plus "white noise" (all autocorrelations zero). The next simplest process is a moving average process of order 1, also called an MA(1) process:

(67.1.5)    y_t = µ + ε_t + βε_{t−1},    ε_t ∼ IID(0, σ²)

where the first y, say it is y_1, depends on the pre-sample ε_0.

Problem 549. Compute the autocovariance and autocorrelation function of the time series defined in (67.1.5), and show that the following process

(67.1.6)    y_t = µ + η_t + (1/β)η_{t−1},    η_t ∼ IID(0, β²σ²)

generates a time series with the same statistical properties as (67.1.5).

Answer. For (67.1.5): var[y_t] = σ²(1 + β²), cov[y_t, y_{t−1}] = βσ², and cov[y_t, y_{t−h}] = 0 for h > 1; hence corr[y_t, y_{t−1}] = β/(1 + β²). (67.1.6) gives the same variance β²σ²(1 + 1/β²) = σ²(1 + β²) and the same correlation (1/β)/(1 + 1/β²) = β/(1 + β²).

The moving-average representation of a time series is therefore not unique. It is not possible to tell from observation of the time series alone whether the process generating it was (67.1.5) or (67.1.6). One can say in general that unless |β| = 1, every MA(1) process could have been generated by a process in which |β| < 1. This process is called the invertible form or the fundamental representation of the time series.

Problem 550. What are the implications for estimation of the fact that an MA process can have different data-generating processes?

Answer. Besides looking at how well the model fits the data, the econometrician should also check whether the implied disturbances are plausible values in light of the actual history of the process, in order to ascertain that one is using the right representation.
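To make the observational equivalence of (67.1.5) and (67.1.6) concrete, here is a small simulation sketch (not part of the notes; plain numpy, with illustrative values µ = 10, β = 0.5, σ = 1). Both representations reproduce the same variance and first-order autocorrelation:

```python
import numpy as np

rng = np.random.default_rng(0)
n, mu, beta, sigma = 100_000, 10.0, 0.5, 1.0   # illustrative values

# Representation (67.1.5): y_t = mu + eps_t + beta*eps_{t-1}, eps_t ~ IID(0, sigma^2)
eps = rng.normal(0.0, sigma, n + 1)
y1 = mu + eps[1:] + beta * eps[:-1]

# Representation (67.1.6): y_t = mu + eta_t + (1/beta)*eta_{t-1}, eta_t ~ IID(0, beta^2*sigma^2)
eta = rng.normal(0.0, beta * sigma, n + 1)
y2 = mu + eta[1:] + (1.0 / beta) * eta[:-1]

def lag1_corr(y):
    """Sample first-order autocorrelation."""
    d = y - y.mean()
    return (d[1:] * d[:-1]).sum() / (d ** 2).sum()

# Both series match the theoretical moments sigma^2*(1 + beta^2) = 1.25
# and beta/(1 + beta^2) = 0.4 up to sampling noise:
print(y1.var(), y2.var())
print(lag1_corr(y1), lag1_corr(y2))
```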
The fundamental representation of the time series is needed for forecasting. Let us first look at the simplest situation: the time series at hand is generated by the process (67.1.5) with |β| < 1, the parameters µ and β are known, and one wants to forecast y_{t+1} on the basis of all past and present observations. Clearly, the past and present contain no information about ε_{t+1}; therefore the best we can hope to do is forecast y_{t+1} by µ + βε_t. But do we know ε_t?

If a time series is generated by an invertible process, then someone who knows µ, β, and the current and all past values of y can use this to reconstruct the value of the current disturbance. One sees this as follows:

(67.1.7)     y_t = µ + ε_t + βε_{t−1}
(67.1.8)     ε_t = y_t − µ − βε_{t−1}
(67.1.9)     ε_{t−1} = y_{t−1} − µ − βε_{t−2}
(67.1.10)    ε_t = y_t − µ − β(y_{t−1} − µ − βε_{t−2})
(67.1.11)        = −µ(1 − β) + y_t − βy_{t−1} + β²ε_{t−2}

after the next step

(67.1.12)    ε_t = −µ(1 − β + β²) + y_t − βy_{t−1} + β²y_{t−2} − β³ε_{t−3}

and after t steps

(67.1.13)    ε_t = −µ(1 − β + β² − ··· + (−β)^{t−1})
(67.1.14)          + y_t − βy_{t−1} + β²y_{t−2} − ··· + (−β)^{t−1}y_1 + (−β)^t ε_0
(67.1.15)        = −µ (1 − (−β)^t)/(1 + β) + Σ_{i=0}^{t−1} (−β)^i y_{t−i} + (−β)^t ε_0.

If |β| < 1, the last term on the right-hand side, which depends on the unobservable ε_0, becomes less and less important. Therefore, if µ and β are known, and all past values of y_t are known, this is enough information to compute the value of the present disturbance ε_t. Equation (67.1.15) can be considered the "inversion" of the MA(1) process, i.e., its representation as an infinite autoregressive process.

The disturbance in the invertible process is called the "fundamental innovation" because every y_t is composed of a part which is determined by the history y_{t−1}, y_{t−2}, . . . plus ε_t, which is new to the present period. The invertible representation can therefore be used for forecasting: the best predictor of y_{t+1} is µ + βε_t. Even if a time series was actually generated by a non-invertible process, the formula based on the invertible process is still the best formula for prediction, but now it must be given a different interpretation. All this generalizes to higher-order MA processes. [Ham94, pp. 64–68] says: for any noninvertible MA process (which is not borderline in the sense that |β| = 1) there is an invertible MA process which has the same means, variances, and autocorrelations. It is called the "fundamental representation" of this process.

The fundamental representation of a process is the one which leads to very simple equations for forecasting. It used to be a matter of course to assume at the same time that the true process which generated the time series must also be an invertible process, although the reasons given to justify this assumption were usually vague. The classic monograph [BJ76, p. 51] says, for instance: "The requirement of invertibility is needed if we are interested in associating present events with past happenings in a sensible manner." [Dea92, p. 85] justifies the requirement of invertibility as follows: "Without [invertibility] the consumer would have no way of calculating the innovation from current and past values of income." But recently it has been discovered that certain economic models naturally lead to non-invertible data-generating processes; see Problem 552. This is a process in which the economic agents observe and act upon information which the econometrician cannot observe.
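Returning to the inversion recursion (67.1.8), here is a numerical sketch (hypothetical, not from the notes): if one starts the recursion at a deliberately wrong guess for ε_0, the reconstruction error decays like (−β)^t whenever |β| < 1.

```python
import numpy as np

rng = np.random.default_rng(1)
n, mu, beta = 200, 10.0, 0.5                 # invertible case, |beta| < 1

eps = rng.normal(size=n + 1)                 # eps[0] plays the role of the pre-sample eps_0
y = mu + eps[1:] + beta * eps[:-1]           # y[0] is y_1, ..., y[n-1] is y_n

# Run recursion (67.1.8) from a deliberately wrong starting guess for eps_0:
eps_hat = np.empty(n + 1)
eps_hat[0] = 5.0
for t in range(1, n + 1):
    eps_hat[t] = y[t - 1] - mu - beta * eps_hat[t - 1]

# The reconstruction error equals (-beta)^t times the initial error,
# so it dies out geometrically and eps_hat[t] converges to the true eps[t]:
for t in (1, 5, 20, 200):
    print(t, eps_hat[t] - eps[t], (-beta) ** t * (eps_hat[0] - eps[0]))
```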
If one goes over to infinite MA processes, then one gets all indeterministic stationary processes. According to the so-called Wold decomposition, every stationary process can be represented as a (possibly infinite) moving average process plus a "linearly deterministic" term, i.e., a term which can be linearly predicted without error from its past. There is consensus that economic time series do not contain such linearly deterministic terms. The errors in the infinite moving average representation also have to do with prediction: they can be considered the errors in the best one-step-ahead linear prediction based on the infinite past [Rei93, p. 7].

A stationary process without a linearly deterministic term therefore has the form

(67.1.16)    y_t = µ + Σ_{j=0}^∞ ψ_j ε_{t−j}

or, in vector notation,

(67.1.17)    y = ιµ + Σ_{j=0}^∞ ψ_j B^j ε

where the time series ε_s is white noise, and B is the backshift operator satisfying e_t^⊤ B = e_{t−1}^⊤ (here e_t is the tth unit vector, which picks out the tth element of the time series). The coefficients satisfy Σ ψ_i² < ∞, and if they satisfy the stronger condition Σ |ψ_i| < ∞, then the process is called causal.

Problem 551. Show that without loss of generality ψ_0 = 1 in (67.1.16).

Answer. If, say, ψ_k is the first nonzero ψ, then simply write η_j = ψ_k ε_{j+k}, i.e., rescale (and relabel) the white noise so that the leading coefficient becomes 1; the new noise η is again white, with variance ψ_k²σ².

Dually, one can also represent each fully indeterministic stationary process as an infinite AR process y_t − µ + Σ_{j=1}^∞ φ_j(y_{t−j} − µ) = ε_t. This representation is called invertible if it satisfies Σ |φ_j| < ∞.

67.1.2. The Box-Jenkins Approach. Now assume that the operator Ψ(B) = Σ_{j=0}^∞ ψ_j B^j can be written as the product Ψ = Φ⁻¹Θ, where Φ and Θ are each finite polynomials in B. Again, without loss of generality, the leading coefficients in Φ and Θ can be assumed to be = 1. Then the time series can be written

(67.1.18)    y_t − µ + Σ_{j=1}^p φ_j(y_{t−j} − µ) = ε_t + Σ_{j=1}^q θ_j ε_{t−j}.

A process is an ARMA process if it satisfies this relation, regardless of whether the process y_t is stationary or not. See [Rei93, p. 8]. Again, there may be more than one such representation for a given process.

The Box-Jenkins approach is based on the assumption that empirically occurring stationary time series can be modeled as low-order ARMA processes. This would for instance be the case if the time series is built up recursively from its own past, with innovations which extend over more than one period. If this general assumption is satisfied, it has the following implications for methodology:

• Some simple procedures have been developed to recognize which of these time series one is dealing with.
• In the case of autoregressive time series, estimation is extremely simple and can be done using the regression framework.

67.1.3. Moving Average Processes. In order to see what order a finite moving average process is, one should look at the correlation coefficients. If the order is j, then the theoretical correlation coefficients are zero for all lags > j, and therefore the estimates of these correlation coefficients, which have the form

(67.1.19)    r_k = Σ_{t=k+1}^n (y_t − ȳ)(y_{t−k} − ȳ) / Σ_{t=1}^n (y_t − ȳ)²,

must be insignificant, as in the sketch below.
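A sketch of (67.1.19) in code (hypothetical example, not from the notes; numpy assumed): for a simulated MA(2) series, r_1 and r_2 are clearly nonzero while the higher r_k hover around zero, which is how the order is read off in practice.

```python
import numpy as np

def r(y, k):
    """Sample autocorrelation r_k as in (67.1.19), for k >= 1."""
    ybar = y.mean()
    return ((y[k:] - ybar) * (y[:-k] - ybar)).sum() / ((y - ybar) ** 2).sum()

rng = np.random.default_rng(2)
n = 50_000
eps = rng.normal(size=n + 2)
y = eps[2:] + 0.6 * eps[1:-1] + 0.3 * eps[:-2]   # an MA(2) series

for k in range(1, 6):
    print(k, round(r(y, k), 3))
# r_1 and r_2 come out clearly nonzero (about 0.54 and 0.21 here),
# while r_3, r_4, r_5 are insignificant -- the cutoff reveals the order 2.
```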
For estimation the preferred estimator is the maximum likelihood estimator. It cannot be represented in closed form; therefore one has to rely on numerical maximization procedures.

67.1.4. Autoregressive Processes. The common wisdom in econometrics is that economic time series are often built up recursively from their own past. An example of an AR(1) process is

(67.1.20)    y_t = αy_{t−1} + ε_t

where the first observation, say it is y_1, depends on the pre-sample y_0. (67.1.20) is called a difference equation.

[...] autoregressive process. A useful tool for this are the partial autocorrelation coefficients. We discussed partial correlation coefficients in chapter 19. The kth partial autocorrelation coefficient is the correlation between y_t and y_{t−k} with the influence of the intervening lags partialled out. The kth sample partial autocorrelation coefficient is the last coefficient in the regression of the time series on its first k lags [...]

[...] transpired in the data from which the parameters were estimated. (2) Innovations are correlated, and if you increase one without increasing another which is highly correlated with it, then you may get misleading results. A way out would be to transform the innovations in such a way that their estimated covariance matrix is diagonal, and to experiment only with these diagonalized innovations. [...] a non-invertible VARMA process. It is from [AG97, p. 119], originally in [Qua90] and [BQ89]. Income at time t is the sum of a permanent and a transitory component, y_t = y_t^p + y_t^t; the permanent component follows a random walk y_t^p = y_{t−1}^p + δ_t, while transitory income is white noise, i.e., y_t^t = ε_t, with var[ε_t] = var[δ_t] = σ² and all disturbances mutually independent. Consumers know which part of their income is transitory and which part is permanent; they have this information because they know their own particular circumstances, but this kind of information is not directly available to the econometrician. Consumers act on their privileged information: their increase in consumption is all of their increase in permanent income plus a fraction β < 1 of their [...]

[...] deviation). This is the so-called z-statistic.

67.4. Cointegration

Two time series y_0 and y_1 which are I(1) are called cointegrated if there is a linear combination of them which is I(0). What this means is especially obvious if this linear combination is their difference; see the graphs in [CD97, pp. 123/4]. Usually in economic applications this linear combination also depends on exogenous variables; then [...] of the residuals. Now if there is cointegration, the cointegrating relationship is stronger than the spurious regression effect. Since cointegrated variables are usually jointly determined, there will be correlation between the error term and the regressor y_2 in the above regression. However, the coefficient estimate itself is super-consistent, i.e., instead of approaching the true value at a rate of t^{−1/2} [...]
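The super-consistency claim can be illustrated with a small simulation sketch (hypothetical, not from the notes; under the standard result, the cointegrating-regression estimate approaches the true value at a rate of t⁻¹ rather than t^{−1/2}). Here y_2 is a random walk and y_1 = γy_2 + u with stationary u:

```python
import numpy as np

rng = np.random.default_rng(3)

def levels_slope(n, gamma=2.0):
    """OLS slope of the levels regression of y_1 on y_2 (through the origin)."""
    y2 = np.cumsum(rng.normal(size=n))   # I(1): a random walk
    u = rng.normal(size=n)               # I(0) equilibrium error
    y1 = gamma * y2 + u                  # cointegrated: y_1 - gamma*y_2 is stationary
    return (y2 * y1).sum() / (y2 ** 2).sum()

# The estimation error shrinks roughly like 1/n rather than 1/sqrt(n):
for n in (100, 1_000, 10_000):
    errors = [abs(levels_slope(n) - 2.0) for _ in range(200)]
    print(n, float(np.mean(errors)))
```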
CHAPTER 68

Seasonal Adjustment

Seasonal adjustment has been criticized in [Hyl92, p. 231] on the grounds that it cannot be explained what the adjusted series is measuring. Signal extraction in electrical engineering has the goal of restoring the original signal which actually existed before it was degraded by noise. But there is no actually existing "original [...] namely, the underlying economic mechanisms which would also have been active in the absence of the seasonal factors. Natural scientists can investigate their subject under controlled experimental conditions, shielded from non-essential influences. Economists cannot do this; they cannot run the economy inside a building in which all seasonal variations of weather and scheduling are eliminated, in order to see how the economy would have been in the absence of the seasonal influences. [...] whenever there are more than negligible interactions between seasonal and economic mechanisms:

• Leakages from seasonality to economics: The hot summers in the 1970s in Denmark caused a lot of investment in irrigation systems, which then greatly changed agricultural technology [Hyl92, which page?].
• Seasonality altering economic interactions: The building boom in Denmark in the 1970s caused seasonal labor [...]

[...] differencing twice: it is an ARIMA(0,2,2) model. I.e., certain time series are such that differencing is the right thing to do. But if a time series is the sum of a deterministic trend and white noise, then differencing is not called for: from y_t = y_0 + αt + ε_t follows ∆y_t = α + ε_t − ε_{t−1}. This is not an invertible process. The appropriate method of detrending here is to regress the time series on t and take the residuals. [...]

[...] Box and Jenkins recommend using the autocorrelations and partial autocorrelations for determining the order of the autoregressive or moving average parts, although this [...] forecasting really easy: the AR framework gives natural forecasts. One-step-ahead forecasts are obtained by simply using present and past values of the time series and setting the future innovations to zero, and [...]
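A minimal sketch of such a one-step-ahead forecast (hypothetical, not from the notes; assuming a zero-mean AR(2) with known coefficients):

```python
import numpy as np

rng = np.random.default_rng(4)
phi1, phi2 = 0.5, 0.3                      # assumed known AR(2) coefficients

# Simulate a history: y_t = phi1*y_{t-1} + phi2*y_{t-2} + eps_t
n = 1_000
y = np.zeros(n)
for t in range(2, n):
    y[t] = phi1 * y[t - 1] + phi2 * y[t - 2] + rng.normal()

# One-step-ahead forecast: plug in present and past values, set the future innovation to zero
y_hat = phi1 * y[-1] + phi2 * y[-2]
print(y_hat)
```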
