Class Notes in Statistics and Econometrics, Part 25


CHAPTER 49

Distributed Lags

In the simplest case of one explanatory variable only, the model is

(49.0.1)   y_t = α + β_0 x_t + β_1 x_{t−1} + ··· + β_N x_{t−N} + ε_t.

This can be written in the form

(49.0.2)   y = Xβ + ε,   where

    X =  [ 1   x_1   x_0       ···   x_{1−N} ]
         [ 1   x_2   x_1       ···   x_{2−N} ]
         [ ⋮    ⋮     ⋮                ⋮     ]
         [ 1   x_n   x_{n−1}   ···   x_{n−N} ].

Note that X contains presample values.

Two problems: the lag length is often not known, and the X matrix is often highly multicollinear.

How to determine the lag length? Sometimes it is done by the adjusted R̄². [Mad88, p. 357] says this will lead to too long lags and proposes remedies.

Assume we know for sure that the lag length is not greater than M. [JHG+88, pp. 723–727] recommends the following "general-to-specific" specification procedure for finding the lag length: First run the regression with M lags; if the t-test for the parameter of the Mth lag is significant, we say the lag length is M. If it is insignificant, run the regression with M−1 lags and test again for the last coefficient: if the t-test for the parameter of the (M−1)st lag is significant, we say the lag length is M−1, etc.

The significance level of this test depends on M and on the true lag length. Since we never know the true lag length for sure, we will never know the true significance level for sure. The calculation which follows allows us to compute this significance level under the assumption that the N given by the test is the correct N. Furthermore, this calculation only gives us the one-sided significance level: the null hypothesis is not that the true lag length is = N, but that the true lag length is ≤ N.

Assume the null hypothesis is true, i.e., that the true lag length is ≤ N. Since we assume we know for sure that the true lag length is ≤ M, the null hypothesis is equivalent to: β_{N+1} = β_{N+2} = ··· = β_M = 0. Now assume that we apply the above procedure and the null hypothesis holds.
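The general-to-specific search just described can be sketched in a few lines; the function name and the dictionary of p-values are illustrative, not part of the notes themselves:

```python
# Sketch of the general-to-specific lag-length search: test the longest
# lag first and stop at the first lag whose t-test rejects.
def choose_lag_length(p_by_lag, alpha=0.05):
    """p_by_lag[m] holds the p-value of the t-test on the m-th (last)
    lag in the regression that includes lags 0..m; keys m = M, M-1, ..."""
    for m in sorted(p_by_lag, reverse=True):    # start with M lags
        if p_by_lag[m] < alpha:
            return m        # first significant last lag: lag length is m
    return 0                # no lagged coefficient ever significant

# last-lag p-values as in Problem 451 below: insignificant at 8 and 7
# lags, significant at 6 lags
print(choose_lag_length({8: 0.2094, 7: 0.0593, 6: 0.0363}))  # 6
```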
The significance level of our test is the probability that our procedure rejects the null although the null is true. In other words, it is the probability that either the first t-test rejects, or the first t-test accepts and the second t-test rejects, or the first two t-tests accept and the third t-test rejects, etc., all under the assumption that the true β_i are zero. In all, the lag length is overstated if at least one of the M−N t-tests rejects. Therefore if we define the event C_i to be the rejection of the ith t-test, and define Q_j = C_1 ∪ ··· ∪ C_j, then

    Pr[Q_j] = Pr[Q_{j−1} ∪ C_j] = Pr[Q_{j−1}] + Pr[C_j] − Pr[Q_{j−1} ∩ C_j].

[JHG+88, p. 724] says, and a proof can be found in [And66] or [And71, pp. 34–43], that the test statistics of the different t-tests are independent of each other. Therefore one can write

    Pr[Q_j] = Pr[Q_{j−1}] + Pr[C_j] − Pr[Q_{j−1}] Pr[C_j].

Examples: Assume all t-tests are carried out at the 5% significance level, and two tests are insignificant before the first rejection occurs, i.e., the test indicates that the true lag length is ≤ M−2. Assuming that the true lag length is indeed ≤ M−2, the probability of falsely rejecting the hypothesis that the Mth and (M−1)st lags are zero is 0.05 + 0.05 − 0.05² = 0.1 − 0.0025 = 0.0975. For three and four tests the levels are 0.1426 and 0.1855. For a 1% significance level and two tests it would be 0.01 + 0.01 − 0.01² = 0.0200 − 0.0001 = 0.0199. For a 1% significance level and three tests it would be 0.0199 + 0.01 − 0.000199 = 0.029701.

Problem 451. Here are excerpts from SAS outputs, estimating a consumption function. The dependent variable is always the same, GCN72, the quarterly personal consumption expenditure for nondurable goods, in 1972 constant dollars, 1948–1985. The explanatory variable is GYD72, personal income in 1972 constant dollars (deflated by the price deflator for nondurable goods), lagged 0–8 quarters.
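The cumulative significance levels quoted in the examples above can be reproduced by iterating the recursion for Pr[Q_j]; this is a minimal sketch, with overall_size as an illustrative name:

```python
# Overall significance level of j independent t-tests, each of size alpha,
# via Pr[Q_j] = Pr[Q_{j-1}] + Pr[C_j] - Pr[Q_{j-1}] Pr[C_j]
# (which telescopes to 1 - (1 - alpha)**j).
def overall_size(alpha, j):
    p = 0.0                          # Pr[Q_0] = 0
    for _ in range(j):
        p = p + alpha - p * alpha    # add Pr[C_j], subtract the overlap
    return p

print(round(overall_size(0.05, 2), 4))   # 0.0975
print(round(overall_size(0.05, 3), 4))   # 0.1426
print(round(overall_size(0.01, 3), 6))   # 0.029701
```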
                 PARAMETER     STANDARD      T FOR H0:
 VARIABLE  DF    ESTIMATE      ERROR         PARAMETER=0   PROB > |T|
 INTERCEP   1    65.61238269   0.88771664       73.911       0.0001
 GYD72      1     0.13058204   0.000550592     237.167       0.0001

                 PARAMETER     STANDARD      T FOR H0:
 VARIABLE  DF    ESTIMATE      ERROR         PARAMETER=0   PROB > |T|
 INTERCEP   1    65.80966177   0.85890869       76.620       0.0001
 GYD72      1     0.07778248   0.01551323        5.014       0.0001
 GYD72L1    1     0.05312929   0.01560094        3.406       0.0009

• a. 3 points Using the sequential testing procedure, determine how long you would like the lag length to be.

                 PARAMETER     STANDARD      T FOR H0:
 VARIABLE  DF    ESTIMATE      ERROR         PARAMETER=0   PROB > |T|
 INTERCEP   1    65.87672382   0.84399982       78.053       0.0001
 GYD72      1     0.08289905   0.01537243        5.393       0.0001
 GYD72L1    1     0.008943833  0.02335691        0.383       0.7023
 GYD72L2    1     0.03932710   0.01569029        2.506       0.0133

                 PARAMETER     STANDARD      T FOR H0:
 VARIABLE  DF    ESTIMATE      ERROR         PARAMETER=0   PROB > |T|
 INTERCEP   1    65.99593829   0.82873058       79.635       0.0001
 GYD72      1     0.08397167   0.01507688        5.570       0.0001
 GYD72L1    1     0.01413009   0.02298584        0.615       0.5397
 GYD72L2    1    -0.007354543  0.02363040       -0.311       0.7561
 GYD72L3    1     0.04063255   0.01561334        2.602       0.0102

Answer. If all tests are made at the 5% significance level, reject that there are 8 or 7 lags, and go with 6 lags.

• b. 5 points What is the probability of type I error of the test you just described?

Answer. For this use the fact that the t-statistics are independent. There is a 5% probability of incorrectly rejecting the first t-test and also a 5% probability of incorrectly rejecting the second
                 PARAMETER     STANDARD      T FOR H0:
 VARIABLE  DF    ESTIMATE      ERROR         PARAMETER=0   PROB > |T|
 INTERCEP   1    66.07544717   0.80366736       82.217       0.0001
 GYD72      1     0.09710692   0.01518481        6.395       0.0001
 GYD72L1    1    -0.000042518  0.02272008       -0.002       0.9985
 GYD72L2    1     0.001564270  0.02307528        0.068       0.9460
 GYD72L3    1    -0.01713777   0.02362498       -0.725       0.4694
 GYD72L4    1     0.05010149   0.01573309        3.184       0.0018

                 PARAMETER     STANDARD      T FOR H0:
 VARIABLE  DF    ESTIMATE      ERROR         PARAMETER=0   PROB > |T|
 INTERCEP   1    66.15803761   0.78586731       84.185       0.0001
 GYD72      1     0.09189381   0.01495668        6.144       0.0001
 GYD72L1    1     0.01675422   0.02301415        0.728       0.4678
 GYD72L2    1    -0.01061389   0.02297260       -0.462       0.6448
 GYD72L3    1    -0.008377491  0.02330072       -0.360       0.7197
 GYD72L4    1    -0.000826189  0.02396660       -0.034       0.9725
 GYD72L5    1     0.04296552   0.01551164        2.770       0.0064

                 PARAMETER     STANDARD      T FOR H0:
 VARIABLE  DF    ESTIMATE      ERROR         PARAMETER=0   PROB > |T|
 INTERCEP   1    66.22787177   0.77701222       85.234       0.0001
 GYD72      1     0.08495948   0.01513456        5.614       0.0001
 GYD72L1    1     0.02081719   0.02281536        0.912       0.3631
 GYD72L2    1     0.001067395  0.02335633        0.046       0.9636
 GYD72L3    1    -0.01567316   0.02327465       -0.673       0.5018
 GYD72L4    1     0.003008501  0.02374452        0.127       0.8994
 GYD72L5    1     0.004766535  0.02369258        0.201       0.8408
 GYD72L6    1     0.03304355   0.01563169        2.114       0.0363

t-test. The probability of incorrectly rejecting at least one of the two tests is therefore 0.05 + 0.05 − 0.05 · 0.05 = 0.1 − 0.0025 = 0.0975. For 1% it is (for two tests) 0.01 + 0.01 − 0.01 · 0.01 = 0.0199, but three tests will be necessary!

• c. 3 points Which common problem of an estimation with lagged explanatory variables is apparent from this printout? What would be possible remedies for this problem?

Answer. The explanatory variables are highly multicollinear; therefore use Almon lags or something similar. Another type of problem is the increase of type I errors with an increasing number of steps: start with small significance levels!
                 PARAMETER     STANDARD      T FOR H0:
 VARIABLE  DF    ESTIMATE      ERROR         PARAMETER=0   PROB > |T|
 INTERCEP   1    66.29560292   0.77062598       86.028       0.0001
 GYD72      1     0.08686483   0.01502757        5.780       0.0001
 GYD72L1    1     0.01258635   0.02301437        0.547       0.5853
 GYD72L2    1     0.004612589  0.02321459        0.199       0.8428
 GYD72L3    1    -0.005511693  0.02366979       -0.233       0.8162
 GYD72L4    1    -0.002789862  0.02372100       -0.118       0.9065
 GYD72L5    1     0.008280160  0.02354535        0.352       0.7256
 GYD72L6    1    -0.001408690  0.02383478       -0.059       0.9530
 GYD72L7    1     0.02951031   0.01551907        1.902       0.0593

Secondly: what to do about multicollinearity? Prior information tells you that the true lag coefficients probably do not go in zigzag, but follow a smooth curve. This information can be incorporated into the model by pre-selecting a family of possible lag contours, from which the one that fits best is chosen, i.e., by doing constrained least squares. The simplest such assumption is that the lag coefficients lie on a polynomial of degree d (polynomial distributed lags, often called Almon lags). Since linear combinations of polynomials are again polynomials, this restricts the β vectors one has to choose from to a subspace of k-dimensional space.

                 PARAMETER     STANDARD      T FOR H0:
 VARIABLE  DF    ESTIMATE      ERROR         PARAMETER=0   PROB > |T|
 INTERCEP   1    66.36142439   0.77075066       86.100       0.0001
 GYD72      1     0.08619326   0.01500496        5.744       0.0001
 GYD72L1    1     0.01541463   0.02307449        0.668       0.5052
 GYD72L2    1    -0.002721499  0.02388376       -0.114       0.9094
 GYD72L3    1    -0.001837498  0.02379826       -0.077       0.9386
 GYD72L4    1     0.003802403  0.02424060        0.157       0.8756
 GYD72L5    1     0.004328457  0.02370310        0.183       0.8554
 GYD72L6    1     0.000718960  0.02384368        0.030       0.9760
 GYD72L7    1     0.006305240  0.02404827        0.262       0.7936
 GYD72L8    1     0.02002826   0.01587971        1.261       0.2094

Usually this is done by the imposition of linear constraints.
One might explicitly write these as linear constraints of the form Rβ = o, since polynomials of dth order are characterized by the fact that the dth differences of their coefficients are constant, or their (d+1)st differences zero. (This gives one linear constraint for every position in β for which the (d+1)st difference can be computed.) But here it is more convenient to incorporate these restrictions into the regression equation, and in this way end up with a regression with fewer explanatory variables.

Any β with a polynomial lag structure has the form β = Hα for a (d+1)×1 vector α, where the columns of H simply are polynomials, here with N = 4 and d = 3:

(49.0.3)   [ β_0 ]     [ 1   0    0    0 ]
           [ β_1 ]     [ 1   1    1    1 ]   [ α_0 ]
           [ β_2 ]  =  [ 1   2    4    8 ]   [ α_1 ]
           [ β_3 ]     [ 1   3    9   27 ]   [ α_2 ]
           [ β_4 ]     [ 1   4   16   64 ]   [ α_3 ]

More examples of such H-matrices are in [JHG+88, p. 730]. Then the specification y = Xβ + ε becomes y = XHα + ε. I.e., one estimates the coefficients α by an ordinary regression again, and even in the presence of polynomial distributed lags one can use the ordinary F-test, impose other linear constraints, do "GLS" in the usual way, etc. (SAS allows for an autoregressive error structure in addition to the lags.)

The pdlreg procedure in SAS also uses an H whose first column contains a zero order polynomial, the second a first order polynomial, etc. But it does not use the exact polynomials shown above; it chooses the polynomials in such a way that they are orthogonal to each other. The elements of α are called X**0 (coefficient of the zero order polynomial), X**1, etc.

[...]
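As a sketch of how the reparametrization y = XHα can be carried out directly (the function almon_fit and its argument names are illustrative; the notes themselves rely on SAS's pdlreg, which additionally orthogonalizes the polynomials):

```python
import numpy as np

# Polynomial distributed lags: restrict beta = H alpha with H[i, j] = i**j,
# regress y on [1, X H], then recover beta-hat = H alpha-hat.
def almon_fit(y, x, N, d):
    n = len(y)
    # lag matrix: column j holds x_{t-j} for t = N, ..., n-1
    X = np.column_stack([x[N - j : n - j] for j in range(N + 1)])
    H = np.vander(np.arange(N + 1), d + 1, increasing=True)  # rows 1, i, i^2, ...
    Z = np.column_stack([np.ones(n - N), X @ H])             # regressors [1, XH]
    theta = np.linalg.lstsq(Z, y[N:], rcond=None)[0]
    return theta[0], H @ theta[1:]                           # intercept, beta-hat

# check: lag coefficients that lie exactly on a quadratic are recovered
rng = np.random.default_rng(0)
x = rng.normal(size=200)
beta = np.array([1.0 + 0.5 * i - 0.1 * i**2 for i in range(5)])   # degree 2
y = np.zeros(200)
y[4:] = 2.0 + np.column_stack([x[4 - j : 200 - j] for j in range(5)]) @ beta
a_hat, b_hat = almon_fit(y, x, N=4, d=2)
```

Because the true β lies in the column space of H and the data are noiseless, the restricted regression reproduces the intercept and all five lag coefficients up to numerical precision.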
[...] inventory valuation adjustment + noncorporate income + noncorporate inventory valuation adjustment + government subsidies + net interest. Denominator: capital stock + inventories (the inventories come in part from the NIPAs, in part from the census). It also has the prime rate (short term lending interest rate), the 10-year treasury note interest rate, and the consumer price index. Note that the interest [...]

[...] 23: Apparel and Related Products           33: Primary Metal Industries
     24: Lumber and Products                     34: Fabricated Metal Products
     25: Furniture and Fixtures∗                 35: Machinery (Except Electrical)
     26: Paper and Allied Products               36: Electrical Equipment
     27: Printing and Publishing Industries∗∗    37: Transportation Equipment
     28: Chemicals and Allied [...]              38: Instruments and Related Products

[...] their actual and desired capital stock. In other words, although Jorgenson claims to be modelling a very neoclassical forward-looking optimizing behavior, he ends up estimating an equation in which firms start with the situation they are in and go from there. The investment orders placed at time t are assumed to be K∗_t − K∗_{t−1} = ∆K∗_t. Now Jorgenson makes the following assumptions [...]

[...] 29: Petroleum and Coal Products; 39: Miscellaneous Manufactures. The main data are collected in two datafiles: ec781.invcur has data about fixed nonresidential private investment, capital stock (net of capital consumption allowance), and gross national product by industry in current dollars. The file ec781.invcon has the corresponding data in constant 1982 dollars (it has missing values for some data for industry [...]
[...] and β by applying OLS to the equation

(49.2.13)   y_t = α_0 + β_0 x_t + λ y_{t−1} + η_t

and then setting

(49.2.14)   α̂ = α̂_0 / (1 − λ̂)   and   β̂ = β̂_0 / (1 − λ̂).

Answer. OLS is inconsistent because y_{t−1} and ε_{t−1}, and therefore also y_{t−1} and η_t, are correlated. (It is also true that η_{t−1} and η_t are correlated, but this is not the reason for the inconsistency.)

• d. 1 point In order to get an [...]

[...] both. This may prevent, for instance, the last lagged coefficient from becoming negative if all the others are positive. But experience shows that in many cases such endpoint restrictions are not a good idea. Alternative specifications of the lag coefficients: Shiller lag: In 1973, long before smoothing splines became popular, Shiller in [Shi73] proposed a joint minimization of SSE and k times the squared sum of (d+1)st differences [...]

[...] the investment function along the lines suggested by Jorgenson.

50.3 Investment Function Project

We will work with annual data for the 2-digit SIC manufacturing industries, which are the following:

(50.3.1)   20: Food and Kindred Products    30: Rubber Products
           21: Tobacco Manufactures∗        31: Leather and Leather Products
           22: [...]                        32: Stone, Clay, and [...]

[...] stock of your industry against value added, and also plot the first differences against each other. Interpret your results. [...]

Now the flexible accelerator has the following two basic equations:

(50.1.2)   K∗_t = a Q_t
(50.1.3)   K_t − K_{t−1} = (1 − γ)(K∗_t − K_{t−1})

This can either be used to generate a relation between capital stock and output, or a relation between investment and output [...]
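Substituting (50.1.2) into (50.1.3) gives the recursion K_t = K_{t−1} + (1−γ)(aQ_t − K_{t−1}), which is easy to simulate; a minimal sketch, with illustrative function and parameter names:

```python
# Flexible accelerator: desired stock K*_t = a*Q_t, and the actual stock
# closes the fraction (1 - gamma) of the gap K*_t - K_{t-1} each period.
def flexible_accelerator(Q, a, gamma, K0):
    K, prev = [], K0
    for q in Q:
        prev = prev + (1 - gamma) * (a * q - prev)   # partial adjustment
        K.append(prev)
    return K

# with constant output the capital stock converges to the desired a*Q
path = flexible_accelerator([10.0] * 50, a=2.0, gamma=0.5, K0=0.0)
```

With γ = 0.5 the remaining gap halves every period, so after 50 periods path[-1] is essentially the desired stock a·Q = 20.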
[...] procedure for all values of λ along a grid from 0 to 1, and then pick the value of λ which gives the best SSE. Zellner and Geisel did this in [ZG70], and their regression can be reproduced in R with the commands data(geizel) and then plot((1:99)/100, geizel.regression(geizel$c, geizel$y, 99), xlab="lambda", ylab="sse"). They got two local minima for λ, and the local minimum which was smaller corresponded to a β [...]

[...] 27, that is why industry 27 has a double star.) Furthermore, the dataset ec781.invmisc has additional data which might be interesting. Among those are capacity utilization data for the industries 20, 22, 26, 28, 29, 30, 32, 33, 34, 35, 36, 37, and 38 (all industries for which there are no capacity utilization data have at least one star) and profit rates for all industries. The profit [...]
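The same grid search can be sketched in Python rather than through the R call quoted above; treating the presample values of the geometric lag as zero is an assumption of this sketch, and the names are illustrative:

```python
import numpy as np

# For each lambda on a grid, build the geometric distributed-lag regressor
# z_t = sum_j lambda**j x_{t-j}, regress y on [1, z], keep the smallest SSE.
def grid_search_lambda(y, x, grid):
    best_lam, best_sse = None, np.inf
    for lam in grid:
        z = np.empty_like(x)
        z[0] = x[0]                       # presample terms truncated
        for t in range(1, len(x)):
            z[t] = x[t] + lam * z[t - 1]  # z_t = x_t + lam * z_{t-1}
        Z = np.column_stack([np.ones(len(y)), z])
        coef = np.linalg.lstsq(Z, y, rcond=None)[0]
        sse = float(np.sum((y - Z @ coef) ** 2))
        if sse < best_sse:
            best_lam, best_sse = lam, sse
    return best_lam, best_sse

# check on noiseless data generated with lambda = 0.6
rng = np.random.default_rng(1)
x = rng.normal(size=300)
z_true = np.empty(300); z_true[0] = x[0]
for t in range(1, 300):
    z_true[t] = x[t] + 0.6 * z_true[t - 1]
y = 1.0 + 2.0 * z_true
lam, sse = grid_search_lambda(y, x, [i / 100 for i in range(1, 100)])
```

On these noiseless data the SSE collapses at the true value, so the grid search returns λ = 0.6; on real data one would instead see an SSE profile over the grid, possibly with several local minima as Zellner and Geisel found.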
