
Table I

Common Instrumental Variables

This table summarizes variables used in the literature to predict stock returns. The first column indicates the published study. The second column denotes the lagged instrument. The next two columns give the sample (Period) and the number of observations (Obs) on the stock returns. Columns 5 and 6 report the autocorrelation ($\rho_Z$) and the standard deviation ($\sigma_Z$) of the instrument, respectively. The next three columns report regression results for the S&P 500 excess return on a lagged instrument. The slope coefficient is b, the t-statistic is t, and the coefficient of determination is $R^2$. The last column (HAC) reports the method used in computing the standard errors of the slopes. The method of Newey-West (1987) is used with the number of lags given in parentheses. MA( ) refers to the number of moving average terms used in the covariance matrix. The abbreviations in the table are as follows. TB1y is the yield on the one-month Treasury bill. Two-one, Six-one, and Lag(two)-one are computed as the spreads on the returns of the two- and one-month bills, six- and one-month bills, and the lagged value of the two-month and current one-month bill. The yield on all corporate bonds is denoted as ALLy. The yield on AAA rated corporate bonds is AAAy, and UBAAy is the yield on corporate bonds with a below BAA rating. The variable Cay is the linear function of consumption, asset wealth, and labor income. The book-to-market ratios for the Dow Jones Industrial Average and the S&P 500 are, respectively, DJBM and SPBM.

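| (1) Reference | (2) Predictor | (3) Period | (4) Obs | (5) $\rho_Z$ | (6) $\sigma_Z$ | (7) b | (8) t | (9) $R^2$ | (10) HAC |
|---|---|---|---|---|---|---|---|---|---|
| Breen, Glosten, & Jagannathan (1989) | | | | | | -2.49 | -3.58 | | |
| Campbell (1987) | Lag(two)-one | | | | | | | | |
| Fama (1990) | | | | | | | | | |
| Fama & French (1988a) | Dividend yield | | | | | | | | |
| Fama & French (1989) | | | | | | | | | |
| Keim & Stambaugh (1986) | | | | | | | | | |
| Kothari & Shanken (1997) | | | | | | | | | |
| Lettau & Ludvigson (2001) | | | | | | | | | |
| Pontiff & Schall (1998) | | | | | | | | | |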

To incorporate data mining, we compile a randomly selected sample of 500 potential instruments, through which our simulated analyst sifts to mine the data for predictor variables. We select the 500 series randomly from a much larger sample of 10,866 potential variables. The specifics are described in the Appendix. Essentially, the procedure is to generate uniformly distributed random numbers, order the series from 1 to 10,866 and randomly extract 500 series. The 500 series are randomly ordered, and permanently assigned numbers between 1 and 500. When a data miner in our simulations searches through, say 50 series, we use the sampling properties of the 50 series to calibrate the parameters in the simulations.
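The selection step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `draw_instrument_ids`, the seed, and the use of Python's `random` module are ours, and the exact procedure is the one given in the Appendix, which is not reproduced here.

```python
import random

def draw_instrument_ids(n_total=10_866, n_keep=500, seed=0):
    """Pick n_keep distinct series ids out of 1..n_total in a random order.

    Returns a dict mapping the permanent number (1..n_keep) to the
    original series id. The seed is an illustrative assumption.
    """
    rng = random.Random(seed)
    # Attach a uniform random number to every series and sort on it.
    keyed = sorted((rng.random(), series_id)
                   for series_id in range(1, n_total + 1))
    chosen = [series_id for _, series_id in keyed[:n_keep]]
    # The extraction order is itself random, so numbering it 1..n_keep
    # gives the permanent random ordering of the 500 series.
    return {number: sid for number, sid in enumerate(chosen, start=1)}

ids = draw_instrument_ids()
# A data miner who searches through 50 series sees numbers 1 to 50.
first_50 = [ids[k] for k in range(1, 51)]
```

Fixing the numbering once means that a search over 50 series is always a subset of a search over 100, which keeps results comparable as the number of mined instruments grows.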

We also use our sample of potential instruments to calibrate the parameters that govern the amount of persistence in the true expected returns in the model. On the one hand, if the instruments we see in the literature, summarized in Table I, arise from a spurious mining process, they are likely to be more highly autocorrelated than the underlying true expected stock return. On the other hand, if the instruments in the literature are a realistic representation of expected stock returns, the autocorrelations in Table I may be a good proxy for the persistence of the true expected returns.2 The mean autocorrelation of our 500 series is 15 percent and the median is 2 percent. Eleven of the 13 sample autocorrelations in Table I are higher than 15 percent, and the median value is 95 percent. We consider a range of values for the true autocorrelation based on these figures, as described below.

II. The Models

Consider a situation in which an analyst runs a time-series regression for the future stock return, $r_{t+1}$, on a lagged predictor variable:

$$ r_{t+1} = \alpha + \delta Z_t + v_{t+1}. \tag{1} $$

The data are actually generated by an unobserved latent variable, $Z_t^*$, as

$$ r_{t+1} = \mu + Z_t^* + u_{t+1}, \tag{2} $$

where $u_{t+1}$ is white noise with variance $\sigma_u^2$. We interpret the latent variable $Z_t^*$ as the deviation of the conditional mean return from the unconditional mean, $\mu$, where the expectations are conditioned on an unobserved market information set at time $t$. The predictor variables follow an autoregressive process:

$$ \begin{pmatrix} Z_t^* \\ Z_t \end{pmatrix} = \begin{pmatrix} \rho^* & 0 \\ 0 & \rho \end{pmatrix} \begin{pmatrix} Z_{t-1}^* \\ Z_{t-1} \end{pmatrix} + \begin{pmatrix} \varepsilon_t^* \\ \varepsilon_t \end{pmatrix}. \tag{3} $$

2 There are good reasons to think that expected stock returns may be persistent. Asset pricing models like the consumption model of Lucas (1978) describe expected stock returns as functions of expected economic growth rates. Merton (1973) and Cox, Ingersoll, and Ross (1985) propose real interest rates as candidate state variables, driving expected returns in intertemporal models. Such variables are likely to be highly persistent. Empirical models for stock return dynamics frequently involve persistent, autoregressive expected returns (e.g., Conrad and Kaul (1988), Fama and French (1988b), Lo and MacKinlay (1988), or Huberman and Kandel (1990)).

The assumption that the true expected return is autoregressive follows previous studies such as Conrad and Kaul (1988), Fama and French (1988b), Lo and MacKinlay (1988), and Huberman and Kandel (1990).

To generate the artificial data, the errors $(\varepsilon_t^*, \varepsilon_t)$ are drawn randomly as a normal vector with mean zero and covariance matrix $\Sigma$. We build up the time series of $Z$ and $Z^*$ through the vector autoregression equation (3), where the initial values are drawn from a normal distribution with mean zero and variances $\mathrm{Var}(Z)$ and $\mathrm{Var}(Z^*)$. The other parameters that calibrate the simulations are $\{\mu, \sigma_u^2, \rho, \rho^*, \Sigma\}$.

We have a situation in which the true returns may be predictable, if $Z^*$ could be observed. This is captured by the true R-squared, $\mathrm{Var}(Z^*)/[\mathrm{Var}(Z^*) + \sigma_u^2]$. We set $\mathrm{Var}(Z^*)$ to equal the sample variance of the S&P 500 return, in excess of a one-month Treasury bill return, multiplied by 0.10. When the true R-squared of the simulation is 10 percent, the unconditional variance of the $r_{t+1}$ that we generate is equal to the sample variance of the S&P 500 return, and the first-order autocorrelation is similar to that of the actual data. When we choose other values for the true R-squared, these determine the values for the parameter $\sigma_u^2$. We set $\mu$ to equal the sample mean excess return of the S&P 500 over the 1926 through 1998 period, or 0.71 percent per month.
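As a worked illustration of this calibration, the definition of the true R-squared can be solved for the noise variance. The rearrangement and the label $s_r^2$ for the sample variance of the S&P 500 excess return are ours, introduced only for this sketch.

$$
R^2_{\text{true}} = \frac{\mathrm{Var}(Z^*)}{\mathrm{Var}(Z^*) + \sigma_u^2}
\quad\Longrightarrow\quad
\sigma_u^2 = \mathrm{Var}(Z^*)\,\frac{1 - R^2_{\text{true}}}{R^2_{\text{true}}}.
$$

With $\mathrm{Var}(Z^*) = 0.10\, s_r^2$ and $R^2_{\text{true}} = 0.10$,

$$
\sigma_u^2 = 0.10\, s_r^2 \times \frac{0.90}{0.10} = 0.90\, s_r^2,
\qquad
\mathrm{Var}(r_{t+1}) = \mathrm{Var}(Z^*) + \sigma_u^2 = s_r^2,
$$

so the simulated return variance matches the sample variance exactly in the 10 percent case, as the text states.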

The extent of the spurious regression bias depends on the parameters $\rho$ and $\rho^*$, which control the persistence of the measured and the true regressor. These values are determined by reference to Table I and from our sample of 500 potential instruments. The specifics differ across the special cases, as described below.

While the stock return could be predicted if $Z_t^*$ could be observed, the analyst uses the measured instrument $Z_t$. If the covariance matrix $\Sigma$ is diagonal, $Z_t$ and $Z_t^*$ are independent, and the true value of $\delta$ in the regression (1) is zero.

A. Pure Spurious Regression

To focus on spurious regression in isolation, we specialize equation (3) as follows. The covariance matrix $\Sigma$ is a $2 \times 2$ diagonal matrix with variances $(\sigma^{*2}, \sigma^2)$. For a given value of $\rho^*$, the value of $\sigma^{*2}$ is determined as $\sigma^{*2} = (1 - \rho^{*2})\mathrm{Var}(Z^*)$. The measured regressor has $\mathrm{Var}(Z) = \mathrm{Var}(Z^*)$. The autocorrelation parameters, $\rho^* = \rho$, are allowed to vary over a range of values. (We also allow $\rho$ and $\rho^*$ to differ from one another, as described below.)
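A minimal sketch of the data-generating process for this pure spurious regression case follows. It assumes a diagonal covariance matrix and equal unconditional variances for the true and measured regressors, as described above; the function name `simulate_sample`, the seed, and the unit-variance scaling in the demo call are our illustrative choices, not values taken from the paper.

```python
import math
import random

def simulate_sample(T, rho_star, rho, var_z, sigma_u2, mu, rng):
    """Simulate T pairs (r_{t+1}, Z_t) with independent Z* and Z.

    Var(Z) = Var(Z*) = var_z, and each innovation variance is
    (1 - rho^2) * var_z so that the AR(1) processes are stationary.
    """
    sd_eps_star = math.sqrt((1.0 - rho_star ** 2) * var_z)
    sd_eps = math.sqrt((1.0 - rho ** 2) * var_z)
    sd_u = math.sqrt(sigma_u2)
    # Initial values come from the unconditional distributions.
    z_star = rng.gauss(0.0, math.sqrt(var_z))
    z = rng.gauss(0.0, math.sqrt(var_z))
    returns, predictors = [], []
    for _ in range(T):
        predictors.append(z)
        # Return equation: mean plus latent expected return plus noise.
        returns.append(mu + z_star + rng.gauss(0.0, sd_u))
        # Autoregressive updates for the true and measured regressors.
        z_star = rho_star * z_star + rng.gauss(0.0, sd_eps_star)
        z = rho * z + rng.gauss(0.0, sd_eps)
    return returns, predictors

# Demo: var_z = 1 and sigma_u2 = 9 give a true R-squared of 0.10.
rng = random.Random(42)
r, z = simulate_sample(T=824, rho_star=0.98, rho=0.98,
                       var_z=1.0, sigma_u2=9.0, mu=0.71, rng=rng)
# Degenerate check: with zero variances every return equals mu.
r0, z0 = simulate_sample(10, 0.5, 0.5, 0.0, 0.0, 0.71, random.Random(0))
```

Because the two innovation streams are drawn independently, the measured regressor carries no information about returns, so any apparent significance in a regression of `r` on `z` is spurious by construction.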

Following Granger and Newbold (1974), we interpret a spurious regression as one in which the t-ratios in regression (1) are likely to indicate a significant relation when the variables are really independent. The problem may come from the numerator or the denominator of the t-ratio: The coefficient or its standard error may be biased. As in Granger and Newbold, the problem lies with the standard errors.3 The reason is simple to understand. When the null hypothesis that the regression slope $\delta = 0$ is true, the error term $v_{t+1}$ of regression equation (1) contains the persistent component $Z_t^*$ and so inherits autocorrelation from the dependent variable. Assuming stationarity, the slope coefficient is consistent, but standard errors that do not account for the serial dependence correctly are biased.

3 While Granger and Newbold (1974) do not study the slopes and standard errors to identify the separate effects, our simulations, designed to mimic their setting (not reported in the tables), confirm that their slopes are well behaved, while the standard errors are biased. Granger and Newbold use OLS standard errors, while we focus on the heteroskedasticity and autocorrelation-consistent standard errors that are more common in recent studies.

Because the spurious regression problem is driven by biased estimates of the standard error, the choice of standard error estimator is crucial. In our simulation exercises, it is possible to find an efficient unbiased estimator, since we know the true model that describes the regression error. Of course, this will not be known in practice. To mimic the practical reality, the analyst in our simulations uses the popular heteroskedasticity- and autocorrelation-consistent (HAC) standard errors from Newey and West (1987), with an automatic lag selection procedure. The number of lags is chosen by computing the autocorrelations of the estimated residuals and truncating the lag length when the sample autocorrelations become insignificant at longer lags. (The exact procedure is described in Footnote 1, and modifications to this procedure are discussed below.)
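The following sketch shows one way such a Newey-West t-ratio with data-driven lag choice could be computed for a univariate regression. The lag rule used here (keep the longest lag, up to 12, whose residual autocorrelation exceeds $2/\sqrt{T}$ in absolute value) is an assumption made for illustration, since the exact rule of Footnote 1 is not part of this section; the function name `ols_newey_west` and the demo data are likewise hypothetical.

```python
import math
import random

def ols_newey_west(y, x, max_lags=12):
    """OLS of y on a constant and x.

    Returns (slope, Newey-West t-ratio, R-squared, lags used).
    """
    T = len(y)
    xbar, ybar = sum(x) / T, sum(y) / T
    xd = [xi - xbar for xi in x]
    sxx = sum(v * v for v in xd)
    slope = sum(v * (yi - ybar) for v, yi in zip(xd, y)) / sxx
    resid = [(yi - ybar) - slope * v for v, yi in zip(xd, y)]
    sse = sum(e * e for e in resid)
    sst = sum((yi - ybar) ** 2 for yi in y)
    r2 = 1.0 - sse / sst
    # Assumed lag rule: longest lag whose residual autocorrelation is
    # more than two standard errors (2 / sqrt(T)) away from zero.
    lags = 0
    for j in range(1, max_lags + 1):
        acf_j = sum(resid[t] * resid[t - j] for t in range(j, T)) / sse
        if abs(acf_j) > 2.0 / math.sqrt(T):
            lags = j
    # Long-run variance of the score xd_t * e_t with Bartlett weights.
    score = [v * e for v, e in zip(xd, resid)]
    lrv = sum(s * s for s in score)
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1.0)
        lrv += 2.0 * weight * sum(score[t] * score[t - j]
                                  for t in range(j, T))
    se = math.sqrt(lrv) / sxx
    return slope, slope / se, r2, lags

# Demo on artificial data with a genuine slope of 0.5.
demo_rng = random.Random(0)
x_demo = [demo_rng.gauss(0.0, 1.0) for _ in range(200)]
y_demo = [1.0 + 0.5 * xi + demo_rng.gauss(0.0, 1.0) for xi in x_demo]
slope, t_ratio, r2, lags = ols_newey_west(y_demo, x_demo)
```

Demeaning the regressor before forming the score is equivalent to taking the slope element of the usual sandwich covariance matrix for a regression with an intercept.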

This setting is related to Phillips (1986) and Stambaugh (1999). Phillips derives asymptotic distributions for the OLS estimators of the regression (1), in the case where $\rho = 1$, $u_{t+1} = 0$, and $\{\varepsilon_t^*, \varepsilon_t\}$ are general independent mean zero processes. We allow a nonzero variance of $u_{t+1}$ to accommodate the large noise component of stock returns. We assume $\rho < 1$ to focus on stationary, but possibly highly autocorrelated, regressors.

Stambaugh (1999) studies a case where the errors $\{\varepsilon_t^*, \varepsilon_t\}$ are perfectly correlated, or equivalently, the analyst observes and uses the correct lagged stochastic regressor. A bias arises when the correlation between $u_{t+1}$ and $\varepsilon_{t+1}^*$ is not zero, related to the well-known small sample bias of the autocorrelation coefficient (e.g., Kendall (1954)). In the pure spurious regression case studied here, the observed regressor $Z_t$ is independent of the true regressor $Z_t^*$, and $u_{t+1}$ is independent of $\varepsilon_{t+1}^*$. The Stambaugh bias is zero in this case. The point is that there remains a problem in predictive regressions, in the absence of the bias studied by Stambaugh, because of spurious regression.

B. Spurious Regression and Data Mining

We consider the interaction between spurious regression and data mining, where the instruments to be mined are independent as in Foster et al. (1997). There are $L$ measured instruments over which the analyst searches for the best predictor, based on the R-squares of univariate regressions. In Equation (3), $Z_t$ becomes a vector of length $L$, where $L$ is the number of instruments through which the analyst sifts. The error terms $(\varepsilon_t^*, \varepsilon_t)$ become an $L + 1$ vector with a diagonal covariance matrix; thus, $\varepsilon_t^*$ is independent of $\varepsilon_t$.

The persistence parameters in Equation (3) become an $(L + 1)$-square, diagonal matrix, with the autocorrelation of the true predictor equal to $\rho^*$. The value of $\rho^*$ is either the average from our sample of 500 potential instruments, 15 percent, or the median value from Table I, 95 percent. The remaining autocorrelations, denoted by the $L$-vector $\rho$, are set equal to the autocorrelations of the first $L$ instruments in our sample of 500 potential instruments, when $\rho^* = 15$ percent. When $\rho^* = 95$ percent, we rescale the autocorrelations to center the distribution at 0.95 while preserving the range in the original data.5 The simulations match the unconditional variances of the instruments, $\mathrm{Var}(Z)$, to the data. The first element of the covariance matrix $\Sigma$ is equal to $\sigma^{*2}$. For a typical $i$th diagonal element of $\Sigma$, denoted by $\sigma_i^2$, the elements of $\rho(Z_i)$ and $\mathrm{Var}(Z_i)$ are given by the data, and we set $\sigma_i^2 = [1 - \rho(Z_i)^2]\mathrm{Var}(Z_i)$.
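The search step of the data miner, who keeps the instrument with the highest univariate R-squared, can be sketched as below. The helper names `r_squared` and `mine_best_instrument` and the toy numbers are hypothetical; each instrument is assumed to be already lagged so that it lines up with the return it is meant to predict.

```python
def r_squared(y, x):
    """R-squared of a univariate OLS regression of y on a constant and x."""
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    # In a one-regressor model, R-squared is the squared correlation.
    return (sxy * sxy) / (sxx * syy)

def mine_best_instrument(returns, instruments):
    """Return (index, R-squared) of the best of the L instruments."""
    scores = [r_squared(returns, z) for z in instruments]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

# Toy example with L = 3 candidate instruments.
returns = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = [
    [5.0, 3.0, 4.0, 1.0, 2.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],   # exact linear function of returns
    [1.0, 0.0, 1.0, 0.0, 1.0],
]
best_index, best_r2 = mine_best_instrument(returns, candidates)
```

In a simulation, the t-ratio and R-squared of the selected instrument are the statistics whose empirical distributions are tabulated, which is why the critical values rise with $L$.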

III. Simulation Results

We first consider spurious regression in isolation. Then we study spurious regression with data mining.

A. Pure Spurious Regression

Table II summarizes the results for the case of pure spurious regression. We record the estimated slope coefficient in regression (1), its Newey-West t-ratio, and the coefficient of determination at each trial and summarize their empirical distributions. The experiments are run for two sample sizes, based on the extremes in Table I. These are $T = 66$ and $T = 824$ in Panels A and B, respectively. In Panel C, we match the sample sizes to the studies in Table I. In each case, 10,000 trials of the simulation are run; 50,000 trials produce similar results.

The rows of Table II refer to different values for the true R-squares. The smallest value is 0.1 percent, where the stock return is essentially unpredictable, and the largest value is 15 percent. The columns of Table II correspond to different values of $\rho^*$, the autocorrelation of the true expected return, which runs from 0.00 to 0.99. In these experiments, we set $\rho = \rho^*$. The subpanels labeled Critical t-statistic and Critical estimated $R^2$ report empirical critical values from the 10,000 simulated trials, so that 2.5 percent of the t-statistics or five percent of the R-squares lie above these values.
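Reading an empirical critical value off a set of simulated statistics amounts to taking an upper percentile. The sketch below uses a simple order-statistic convention (the value that leaves the stated share of trials strictly above it when there are no ties); that convention and the name `empirical_critical_value` are our assumptions, and the stand-in data are not simulation output.

```python
def empirical_critical_value(stats, upper_tail):
    """Value that leaves a fraction `upper_tail` of the statistics above it."""
    ordered = sorted(stats)
    n_above = int(round(upper_tail * len(ordered)))
    index = max(len(ordered) - n_above - 1, 0)
    return ordered[index]

# Stand-in for 10,000 simulated statistics: the integers 1..10000.
fake_stats = list(range(1, 10001))
crit_t = empirical_critical_value(fake_stats, 0.025)   # 97.5th percentile
crit_r2 = empirical_critical_value(fake_stats, 0.05)   # 95th percentile
```

With 10,000 trials, the 2.5 percent tail holds 250 draws and the 5 percent tail holds 500, so both critical values rest on a reasonable number of tail observations.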

The subpanels labeled Mean $\delta$ report the average slope coefficients over the 10,000 trials. The mean estimated values are always small, and very close to the true value of zero at the larger sample size. This confirms that the slope coefficient estimators are well behaved, so that bias due to spurious regression comes from the standard errors.

4 We calibrate the true autocorrelations in the simulations to the sample autocorrelations, adjusted for first-order finite-sample bias as $\hat{\rho} + (1 + 3\hat{\rho})/T$, where $\hat{\rho}$ is the OLS estimate of the autocorrelation and $T$ is the sample size.
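A small sketch of this bias adjustment is given below. The function names are hypothetical, and estimating the autocorrelation as the OLS slope from regressing the series on a constant and its own first lag is our reading of "the OLS estimate".

```python
def ols_autocorrelation(series):
    """OLS slope from regressing z_t on a constant and z_{t-1}."""
    lagged, current = series[:-1], series[1:]
    n = len(current)
    lbar, cbar = sum(lagged) / n, sum(current) / n
    num = sum((l - lbar) * (c - cbar) for l, c in zip(lagged, current))
    den = sum((l - lbar) ** 2 for l in lagged)
    return num / den

def bias_adjusted_autocorrelation(rho_hat, T):
    """First-order correction: rho_hat + (1 + 3 * rho_hat) / T."""
    return rho_hat + (1.0 + 3.0 * rho_hat) / T

# An estimate of 0.90 from 100 observations is adjusted upward.
adjusted = bias_adjusted_autocorrelation(0.90, 100)
# A perfectly alternating series has an estimated autocorrelation of -1.
rho_alternating = ols_autocorrelation([0.0, 1.0, 0.0, 1.0, 0.0])
```

The correction matters most for short samples: at $T = 66$ an estimate of 0.90 moves up by about 0.056, while at $T = 824$ the shift is under 0.005.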

5 The transformation is as follows. In the 500 instruments, the minimum bias-adjusted autocorrelation is -0.571, the maximum is 0.999, and the median is 0.02. We center the transformed distribution about the median in Table I, which is 0.95. If the original autocorrelation $\rho$ is less than the median, we transform it to

$$ 0.95 + (\rho - 0.02)\,\frac{0.95 + 0.571}{0.02 + 0.571}. $$

If the value is above the median, we transform it to

$$ 0.95 + (\rho - 0.02)\,\frac{0.999 - 0.95}{0.999 - 0.02}. $$
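The two-piece rescaling in this footnote can be written as a single function. The name `recenter` and its keyword arguments are ours; the default numbers are the minimum, median, maximum, and target quoted in the footnote.

```python
def recenter(rho, lo=-0.571, med=0.02, hi=0.999, target=0.95):
    """Map an autocorrelation so the median moves to `target`
    while the minimum and maximum of the original data stay fixed."""
    if rho < med:
        # Stretch the lower half: [lo, med] is mapped onto [lo, target].
        return target + (rho - med) * (target - lo) / (med - lo)
    # Compress the upper half: [med, hi] is mapped onto [target, hi].
    return target + (rho - med) * (hi - target) / (hi - med)

endpoints = [recenter(-0.571), recenter(0.02), recenter(0.999)]
```

Both pieces are linear and meet at the median, so the ordering of the instruments by persistence is unchanged by the transformation.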

When $\rho^* = 0$, and there is no persistence in the true expected return, the spurious regression phenomenon is not a concern. This is true even when the measured regressor is highly persistent. (We confirm this with additional simulations, not reported in the tables, where we set $\rho^* = 0$ and vary $\rho$.) The logic is that when the slope in Equation (1) is zero and $\rho^* = 0$, the regression error has no persistence, so the standard errors are well behaved. This implies that spurious regression is not a problem from the perspective of testing the null hypothesis that expected stock returns are unpredictable, even if a highly autocorrelated regressor is used.

Table II shows that spurious regression bias does not arise to any serious degree, provided $\rho^*$ is 0.90 or less, and the true $R^2$ is one percent or less. For these parameters, the empirical critical values for the t-ratios are 2.48 ($T = 66$, Panel A), and 2.07 ($T = 824$, Panel B). The empirical critical R-squares are close to their theoretical values. For example, for a five percent test with $T = 66$ (824) the F distribution implies critical R-squared values of 5.9 percent (0.5 percent). The values in Table II when $\rho^* = 0.90$ and true $R^2 = 1$ percent are 6.2 percent (0.5 percent); thus, the empirical distributions do not depart far from the standard rules of thumb.
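The theoretical critical R-squared values can be reproduced from the standard link between the F-statistic and the R-squared in a regression with one slope and an intercept. The degrees of freedom $(1, T - 2)$ and the rounded F quantiles below are our assumptions for this sketch; the text does not spell them out.

$$
F = \frac{R^2}{1 - R^2}\,(T - 2)
\quad\Longrightarrow\quad
R^2_{\text{crit}} = \frac{F_{0.95}(1, T-2)}{F_{0.95}(1, T-2) + (T - 2)}.
$$

$$
T = 66:\; F_{0.95}(1, 64) \approx 3.99,\quad R^2_{\text{crit}} \approx \frac{3.99}{3.99 + 64} \approx 0.059;
\qquad
T = 824:\; F_{0.95}(1, 822) \approx 3.85,\quad R^2_{\text{crit}} \approx \frac{3.85}{3.85 + 822} \approx 0.0047.
$$

These agree with the 5.9 percent and 0.5 percent quoted in the text after rounding.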

Variables like short-term interest rates and dividend yields typically have first-order sample autocorrelations in excess of 0.95, as we saw in Table I. We find substantial biases when the regressors are highly persistent. Consider the plausible scenario with a sample of $T = 824$ observations where $\rho = 0.98$ and true $R^2 = 10$ percent. In view of the spurious regression phenomenon, an analyst who was not sure that the correct instrument is being used and who wanted to conduct a 5 percent, two-tailed t-test for the significance of the measured instrument would have to use a t-ratio of 3.6. The coefficient of determination would have to exceed 2.2 percent to be significant at the 5 percent level. These cutoffs are substantially more stringent than the usual rules of thumb.

Panel C of Table II revisits the evidence from the literature in Table I. The critical values for the t-ratios and R-squares are reported, along with the theoretical critical values for the R-squares implied by the F distribution. We set the true R-squared value equal to 10 percent and $\rho^* = \rho$ in each case. We find that 7 of the 17 statistics in Table I that would be considered significant using the traditional standards are no longer significant in view of the spurious regression bias.

While Panels A and B of Table II show that spurious regression can be a problem in stock return regressions, Panel C finds that accounting for spurious regression changes the inferences about specific regressors that were found to be significant in previous studies. In particular, we question the significance of the term spread in Fama and French (1989), on the basis of either the t-ratio or the R-squared of the regression. Similarly, the book-to-market ratio of the Dow Jones index, studied by Pontiff and Schall (1998), fails to be significant with either statistic. Several other variables are marginal, failing on the basis of one but not both statistics. These include the short-term interest rate (Fama and Schwert (1977), using the more recent sample of Breen, Glosten, and Jagannathan (1989)), the dividend yield (Fama and French (1988a)), and the quality-related yield spread (Keim and Stambaugh (1986)). All of these regressors would be considered significant using the standard cutoffs.

It is interesting to note that the biases documented in Table II do not always diminish with larger sample sizes; in fact, the critical t-ratios are larger in the lower right corner of the panels when $T = 824$ than when $T = 66$. The mean values of the slope coefficients are closer to zero at the larger sample size, so the larger critical values are driven by the standard errors. A sample as large as $T = 824$ is not by itself a cure for the spurious regression bias. This is typical of spurious regression with a unit root, as discussed by Phillips (1986) for infinite sample sizes and nonstationary data.6 It is interesting to observe similar patterns, even with stationary data and finite samples.

Phillips (1986) shows that the sample autocorrelation in the regression studied by Granger and Newbold (1974) converges in the limit to 1.0. However, we find only mildly inflated residual autocorrelations (not reported in the tables) for stock return samples as large as $T = 2000$, even when we assume values of the true $R^2$ as large as 40 percent. Even in these extreme cases, none of the empirical critical values for the residual autocorrelations are larger than 0.5. Since $u_{t+1} = 0$ in the cases studied by Phillips, we expect to see explosive autocorrelations only when the true $R^2$ is very large. When $R^2$ is small, the white noise component of the returns serves to dampen the residual autocorrelation. Thus, we are not likely to see large residual autocorrelations in asset pricing models, even where spurious regression is a problem. The residuals-based diagnostics for spurious regression, such as the Durbin-Watson tests suggested by Granger and Newbold, are not likely to be very powerful in asset pricing regressions. For the same reason, naive application of the Newey-West procedure, where the number of lags is selected by examining the residual autocorrelations, is not likely to resolve the spurious regression problem.
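The dampening effect can be made concrete with a short population calculation. It is our own derivation under the model's assumptions (true slope of zero, measured regressor independent of the true one), not a result reported in the paper. Under the null, the regression error is $v_{t+1} = (\mu - \alpha) + Z_t^* + u_{t+1}$, so its autocorrelation at lag $k$ is

$$
\mathrm{Corr}(v_{t+1}, v_{t+1-k})
= \frac{\rho^{*k}\,\mathrm{Var}(Z^*)}{\mathrm{Var}(Z^*) + \sigma_u^2}
= \rho^{*k}\, R^2_{\text{true}}.
$$

With $\rho^* = 0.98$ and $R^2_{\text{true}} = 0.10$, the first-order value is about $0.098$, and even with $R^2_{\text{true}} = 0.40$ it is about $0.39$. Autocorrelations this small are hard to detect at each individual lag, yet they decay slowly and so accumulate across many lags, which is what distorts the standard errors.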

Newey and West (1987) show that their procedure is consistent when the number of lags used grows without bound as the sample size $T$ increases, provided that the number of lags grows no faster than $T^{1/4}$. The lag selection procedure in Table II examines 12 lags. Even though no more than nine lags are selected for the actual data in Table I, more lags would sometimes be selected in the simulations, and an inconsistency results from truncating the lag length.7 However, in finite samples, an increase in the number of lags can make things worse. When too many lags are used, the standard error estimates become excessively noisy, which thickens the tails of the sampling distribution of the t-ratios. This occurs for the experiments in Table II. For example, letting the procedure examine 36 autocorrelations to determine the lag length (the largest number we find mentioned in published studies), the critical t-ratio in Panel A, for true $R^2 = 10$ percent and $\rho^* = 0.98$, increases from 2.9 to 4.8. Nine of the 17 statistics from Table I that are significant by the usual rules of thumb now become insignificant. The results calling these studies into question are even stronger than before. Thus, simply increasing the number of lags in the Newey-West procedure does not resolve the finite sample, spurious regression bias.8

6 Phillips derives asymptotic distributions for the OLS estimators of equation (1), in the case where $\rho = 1$, $u_{t+1} = 0$. He shows that the t-ratio for $\delta$ diverges for large $T$, while $t(\delta)/\sqrt{T}$, $\delta$, and the coefficient of determination converge to well-defined random variables. Marmol (1998) extends these results to multiple regressions with partially integrated processes, and provides references to more recent theoretical literature. Phillips (1998) reviews analytical tools for asymptotic analysis when nonstationary series are involved.

7 At very large sample sizes, a huge number of lags can control the bias. We verify this by examining samples as large as $T = 5000$, letting the number of lags grow to 240. With 240 lags, the critical t-ratio when the true $R^2 = 10$ percent and $\rho = 0.98$ falls from 3.6 in Panel B of Table II to a reasonably well-behaved value of 2.23.

8 We conduct several experiments letting the number of lags examined be 24, 36, or 48, when $T = 66$ and $T = 824$. When $T = 66$, the critical t-ratios are always larger than the values in Table II. When $T = 824$, the effects are small and of mixed sign. The most extreme reduction in a critical t-ratio, relative to Table II, is with 48 lags, true $R^2 = 15$ percent, and $\rho^* = 0.99$, where the critical value falls from 4.92 to 4.23.

Table II

The Monte Carlo Simulation Results for Regressions with a Lagged Predictor Variable

The table reports the 97.5th percentile of the Monte Carlo distribution of 10,000 Newey-West t-statistics, the 95th percentile for the estimated coefficients of determination, and the average estimated slopes from the regression

$$ r_{t+1} = \alpha + \delta Z_t + v_{t+1}, $$

where $r_{t+1}$ is the excess return, $Z_t$ is the predictor variable, and $t = 1, \ldots, T$. The parameter $\rho^*$ is the autocorrelation coefficient of the predictors, $Z_t^*$ and $Z_t$. The $R^2$ is the coefficient of determination from the regression of excess returns $r_{t+1}$ on the unobserved, true instrument $Z_t^*$. Panel A depicts the results for $T = 66$ and Panel B for $T = 824$. Panel C gives the simulation results for the number of observations and the autocorrelations in Table I. In Panel C, the true $R^2$ is set to 0.1. The theoretical critical $R^2$ is from the F-distribution.

Panel A: 66 Observations

Mean $\delta$

| $R^2$ / $\rho^*$ | 0.00 | … | 0.90 | 0.95 | 0.98 | 0.99 |
|---|---|---|---|---|---|---|
| 0.001 | -0.0480 | -0.0554 | -0.0154 | -0.0179 | -0.0312 | -0.0463 |
| 0.005 | -0.0207 | -0.0246 | -0.0074 | -0.0088 | -0.0137 | -0.0193 |
| 0.010 | -0.0142 | -0.0173 | -0.0055 | -0.0066 | -0.0096 | -0.0129 |
| 0.050 | -0.0055 | -0.0075 | -0.0029 | -0.0037 | -0.0040 | -0.0042 |
| 0.100 | -0.0033 | -0.0051 | -0.0023 | -0.0030 | -0.0026 | -0.0021 |
| 0.150 | -0.0024 | -0.0040 | -0.0020 | -0.0026 | -0.0020 | -0.0012 |

Critical t-statistic

| $R^2$ / $\rho^*$ | 0.00 | … | 0.90 | 0.95 | 0.98 | 0.99 |
|---|---|---|---|---|---|---|
| 0.001 | 2.1951 | 2.3073 | 2.4502 | 2.4879 | 2.4746 | 2.4630 |
| 0.005 | 2.2033 | 2.3076 | 2.4532 | 2.5007 | 2.5302 | 2.5003 |
| 0.010 | 2.2121 | 2.3123 | 2.4828 | 2.5369 | 2.5460 | 2.5214 |
| 0.050 | 2.2609 | 2.3335 | 2.6403 | 2.7113 | 2.7116 | 2.6359 |
| 0.100 | 2.2847 | 2.3702 | 2.8408 | 2.9329 | 2.9043 | 2.7843 |
| 0.150 | 2.2750 | 2.3959 | 3.0046 | 3.1232 | 3.0930 | 2.9417 |

Critical estimated $R^2$

| $R^2$ / $\rho^*$ | 0.00 | … | 0.90 | 0.95 | 0.98 | 0.99 |
|---|---|---|---|---|---|---|
| 0.001 | 0.0593 | 0.0575 | | 0.0599 | 0.0610 | 0.0600 |
| 0.005 | 0.0590 | 0.0578 | 0.0608 | 0.0607 | 0.0616 | 0.0604 |
| 0.010 | 0.0590 | 0.0579 | 0.0619 | 0.0623 | 0.0630 | 0.0612 |
| 0.050 | 0.0593 | 0.0593 | 0.0715 | 0.0737 | 0.0703 | 0.0673 |
| 0.100 | 0.0600 | 0.0622 | 0.0847 | 0.0882 | 0.0823 | 0.0766 |
| 0.150 | 0.0600 | 0.0649 | 0.0994 | 0.1032 | 0.0942 | 0.0850 |

Panel B: 824 Observations

Mean $\delta$

| $R^2$ / $\rho^*$ | 0.00 | … | 0.90 | 0.95 | 0.98 | 0.99 |
|---|---|---|---|---|---|---|
| 0.001 | 0.0150 | 0.0106 | 0.0141 | 0.0115 | 0.0053 | -0.0007 |
| 0.005 | 0.0067 | 0.0049 | 0.0069 | 0.0055 | 0.0021 | -0.0011 |
| 0.010 | 0.0048 | 0.0035 | 0.0052 | 0.0040 | 0.0014 | -0.0012 |
| 0.050 | 0.0021 | 0.0017 | 0.0029 | 0.0021 | 0.0003 | -0.0014 |
| 0.100 | 0.0015 | 0.0013 | 0.0023 | 0.0016 | 0.0001 | -0.0014 |
| 0.150 | 0.0012 | 0.0011 | 0.0021 | 0.0014 | 0.0000 | 0.0014 |

Critical t-statistic

| $R^2$ / $\rho^*$ | 0.90 | 0.95 |
|---|---|---|
| 0.001 | 2.0362 | 2.0454 |
| 0.005 | 2.0429 | 2.1123 |
| 0.010 | 2.0655 | 2.1479 |
| 0.050 | 2.2587 | 2.5685 |
| 0.100 | 2.3758 | 2.7342 |
| 0.150 | 2.4164 | 2.8555 |

Critical estimated $R^2$

| $R^2$ / $\rho^*$ | 0.90 | 0.95 |
|---|---|---|
| 0.001 | 0.0047 | 0.0047 |
| 0.005 | 0.0048 | 0.0051 |
| 0.010 | 0.0050 | 0.0054 |
| 0.050 | 0.0066 | 0.0085 |
| 0.100 | 0.0084 | 0.0125 |
| 0.150 | 0.0104 | 0.0166 |

Panel C: Table I simulation

Columns: $\rho^*$; Critical Theoretical $R^2$; Critical t-statistic; Critical Estimated $R^2$.

We draw several conclusions about spurious regression in stock return regressions. Given persistent expected returns, spurious regression can be a serious concern well outside the classic setting of Yule (1926) and Granger and Newbold (1974). Stock returns, as the dependent variable, are much less persistent than the levels of most economic time series. Yet, when the expected returns are persistent, there is a risk of spurious regression bias. The regression residuals may not be highly autocorrelated, even when spurious regression bias is severe. Given inconsistent standard errors, spurious regression bias is not avoided with large samples. Accounting for spurious regression bias, we find that 7 of the 17 t-statistics and regression R-squares from previous studies that would be significant by standard criteria are no longer significant.

B. Spurious Regression and Data Mining

We now consider the interaction between spurious regression and data mining. Table III summarizes the results. The columns of Panels A through D correspond to different numbers of potential instruments, through which the analyst sifts to find the regression that delivers the highest sample R-squared. The rows refer to the different values of the true R-squared.

The cases with true $R^2 = 0$ refer to data mining only, similar to Foster et al. (1997). The columns where $L = 1$ correspond to pure spurious regression bias. We hold fixed the persistence parameter for the true expected return, $\rho^*$, while allowing $\rho$ to vary depending on the measured instrument. When $L = 1$, we set $\rho = 15$ percent. We consider two values for $\rho^*$, 15 percent or 95 percent.

Panels A and B of Table III show that when $L = 1$ and $\rho^* = 15$ percent, there is no data mining, and, consistent with Table II, there is no spurious regression problem. The empirical critical values for the t-ratios and R-squared statistics are close to their theoretical values under normality. For larger values of $L$ and $\rho^* = 15$ percent, there is data mining, and the critical values are close to the values reported by Foster et al. (1997) for similar sample sizes.9 There is little difference in the results for the various true R-squares. Thus, with little persistence, there is no spurious regression problem, and no interaction with data mining.

Panels C and D of Table III tell a different story. When the underlying expected return is persistent ($\rho^* = 0.95$), there is a spurious regression bias. When $L = 1$, we have spurious regression only. The critical t-ratio in Panel C increases from 2.3 to 2.8 as the true R-squared goes from 0 to 15 percent. The bias is less pronounced here than in Table II, with $\rho = \rho^* = 0.95$, which illustrates that for a given value of $\rho^*$, spurious regression is worse for larger values of $\rho$.

Spurious regression bias interacts with data mining. Consider the extreme corners of Panel C. Whereas with $L = 1$, the critical t-ratio increases from 2.3 to 2.8 as the true R-squared goes from 0 to 15 percent, with $L = 250$, the critical t-ratio increases from 5.2 to 6.3 as the true R-squared is increased. Thus, data mining magnifies the effects of the spurious regression bias. When more instruments

9 Our sample sizes, T, are not the same as in Foster et al. (1997). When we run the experiments for their sample sizes, we closely approximate the critical values that they report.
