
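Summary: Empirical research often has to control for unobserved heterogeneity, such as industry-level differences or industry-level shocks (typically by adding a set of industry dummies to the model). This paper examines two approaches widely used in the literature. The first demeans the dependent variable within group, that is, subtracts the industry mean from it before running the regression; the second adds the industry mean of the dependent variable as a control. Both approaches yield inconsistent estimates and can therefore distort statistical inference. **By contrast, the fixed effects estimator is consistent and should be used in empirical work.** The paper also explains how to estimate the fixed effects model efficiently when conventional methods are computationally infeasible.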

Abstract: Controlling for unobserved heterogeneity (or “common errors”), such as industry-specific shocks, is a fundamental challenge in empirical research. This paper discusses the limitations of two approaches widely used in corporate finance and asset pricing research: demeaning the dependent variable with respect to the group (e.g., “industry-adjusting”) and adding the mean of the group’s dependent variable as a control. We show that these methods produce inconsistent estimates and can distort inference. In contrast, the fixed effects estimator is consistent and should be used instead. We also explain how to estimate the fixed effects model when traditional methods are computationally infeasible. (JEL G12, G2, G3, C01, C13)

Source: Gormley, T. A., and D. A. Matsa, 2014, Common errors: How to (and not to) control for unobserved heterogeneity, Review of Financial Studies, 27 (2): 617-661. Original PDF, Stata implementation


Our purpose in writing Gormley and Matsa (2014) was to examine the econometrics underlying the ad hoc estimation methods commonly used to account for unobserved heterogeneity in the finance literature. Our investigation found that these methods should not be used because they typically provide inconsistent estimates. For example, *Adj*Y estimation transforms only the dependent variable and does not remove problematic correlations from the independent variables. Fixed effects (FE) estimation, on the other hand, is consistent and should be used in place of these other estimators. But it is not always obvious how to implement fixed effects.

This website provides examples and corresponding code to illustrate how to implement fixed effects in these cases. We also provide suggestions on how to overcome computational hurdles that arise when estimating models with multiple high-dimensional fixed effects. The code we provide is for Stata and SAS. If you want to suggest ways to handle these issues in other languages, we are happy to post links.

If you use this information or code, please cite Gormley and Matsa (*RFS*, 2014). Our paper, which provides deeper analysis of these ideas, is available here. Lecture slides used by Gormley to teach these methods to PhD students are available here.

A FE estimator correctly transforms both the dependent and independent variables and should be used in place of *Adj*Y and *Avg*E estimators. Commands for implementing the FE estimator in Stata are in bold and the variable names, which the user must specify, are in italics. Here are two examples: (1) industry-adjusting and (2) characteristically-adjusting stock returns.

Industry-adjusting, an *Adj*Y estimator, can take many forms. A common form is to demean the dependent variable with respect to the industry mean (or median) before estimating the model with OLS. However, this estimate is inconsistent whenever there are within-industry correlations among independent variables. Instead, a researcher should estimate a model with industry FE. Any of the following three sets of estimation commands can be used:

```
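*-Stata
* Option 1: OLS with industry indicator variables
. reg dependent_variable independent_variables i.industry
* Option 2: absorb the industry fixed effects
. areg dependent_variable independent_variables, a(industry)
* Option 3: declare industry as the panel variable, then use the FE estimator
. xtset industry
. xtreg dependent_variable independent_variables, fe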
```
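The contrast between industry-adjusting and industry FE can be seen in a small simulation. The sketch below is only illustrative: the data-generating process, the seed, and every variable name (`industry`, `f`, `x`, `y`, `y_adj`) are assumptions made here, not taken from the paper. The regressor `x` is built to be correlated with an unobserved industry effect `f`, and the true coefficient on `x` is 1.

```stata
* Simulated data: 200 industries with 20 firms each (all names are hypothetical)
clear
set seed 12345
set obs 4000
gen industry = ceil(_n / 20)

* Unobserved industry effect f: draw for every row, then copy the first draw
* within each industry so that f is constant inside an industry
gen f = rnormal()
bysort industry: replace f = f[1]

* x is correlated with f; the true coefficient on x is 1
gen x = f + rnormal()
gen y = 1 * x + f + rnormal()

* AdjY: demean only the dependent variable by industry, then run OLS on raw x
egen y_bar = mean(y), by(industry)
gen y_adj = y - y_bar
quietly regress y_adj x
scalar b_adjy = _b[x]

* FE: the within-industry transformation is applied to both y and x
quietly areg y x, absorb(industry)
scalar b_fe = _b[x]

display "AdjY estimate: " b_adjy "    FE estimate: " b_fe
```

Under these assumptions the FE estimate should sit close to 1, while the *Adj*Y estimate should fall well below it, because the industry component of `x` is left in the regressor after only `y` has been demeaned.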

**Note #1:** Unless you are interested in the individual group means, `areg` or `xtreg` is typically preferable because of shorter computation times.

**Note #2:** While these various methods yield identical coefficients, the standard errors may differ when Stata’s `cluster` option is used. When clustering, `areg` reports cluster-robust standard errors that reduce the degrees of freedom by the number of fixed effects swept away in the within-group transformation; `xtreg` reports smaller cluster-robust standard errors because it does not make such an adjustment. `xtreg`’s approach of not adjusting the degrees of freedom is appropriate when the fixed effects swept away by the within-group transformation are nested within clusters (meaning all the observations for any given group are in the same cluster), as is commonly the case (e.g., firm fixed effects are nested within firm, industry, or state clusters). See Wooldridge (2010, Chapter 20).

`xtreg`-clustered standard errors can be recovered from `areg` as follows:

1. Run the `areg` command *without* clustering
2. Then, construct two variables using the following code:

```
. gen df_areg = e(N) - e(rank) - e(df_a)
. gen df_xtreg = e(N) - e(rank)
```

3. Run the `areg` command again *with* clustering
4. Multiply the reported cluster-robust standard errors by `sqrt(df_areg / df_xtreg)`
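
The four steps can be put together in one short script. This is a minimal sketch under stated assumptions: it uses Stata's bundled `auto` dataset, with `rep78` standing in for the industry identifier and cluster variable and `price` and `weight` standing in for the dependent and independent variables, none of which come from the original text. It also stores the degrees of freedom in scalars rather than in variables created with `gen`, which is a substitution made here for convenience. Results may differ across Stata versions.

```stata
* Example data shipped with Stata; rep78 is a stand-in for the industry identifier
sysuse auto, clear
drop if missing(rep78)

* Steps 1 and 2: areg without clustering, then store both degrees of freedom
quietly areg price weight, absorb(rep78)
scalar df_areg  = e(N) - e(rank) - e(df_a)
scalar df_xtreg = e(N) - e(rank)

* Step 3: areg again, now with cluster-robust standard errors
quietly areg price weight, absorb(rep78) vce(cluster rep78)
scalar se_areg = _se[weight]

* Step 4: rescale the areg standard error
scalar se_rescaled = se_areg * sqrt(df_areg / df_xtreg)

* For comparison: the standard error reported by xtreg, fe with clustering
quietly xtset rep78
quietly xtreg price weight, fe vce(cluster rep78)
scalar se_xtreg = _se[weight]

display "areg: " se_areg "   rescaled areg: " se_rescaled "   xtreg: " se_xtreg
```

Because `df_areg` is smaller than `df_xtreg` whenever fixed effects are absorbed, the rescaled standard error is always smaller than the one `areg` reports; the final `display` line lets you compare it with the `xtreg` value directly.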

If the desired industry-adjusting is on a yearly basis, then instead of using the mean or median of observations in the same industry-year to adjust the dependent variable, estimate a model with *industry×year* fixed effects:

```
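* Option 1: OLS with industry-by-year indicator variables
. reg dependent_variable independent_variables i.industry#i.year
* Create the industry-year group identifier once; Options 2 and 3 both use it
. egen industry_year = group(industry year)
* Option 2: absorb the industry-year fixed effects
. areg dependent_variable independent_variables, a(industry_year)
* Option 3: declare industry-year as the panel variable, then use the FE estimator
. xtset industry_year
. xtreg dependent_variable independent_variables, fe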
```

If you are interested in combining industry-year FE with another fixed effect, like firm FE, then absorb the fixed effect of highest dimension and control for the other(s) using indicator variables:

```
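* Option 1: absorb the firm FE; control for industry-year with indicator variables
. areg dependent_variable independent_variables i.industry#i.year, a(firm)
* Option 2: declare firm as the panel variable, then use the FE estimator
. xtset firm
. xtreg dependent_variable independent_variables i.industry#i.year, fe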
```

**Note:** The above specification may be computationally difficult to estimate if the number of industry-year indicator variables is large. To resolve this, please see the discussion below about Stata programs that can be used to estimate models with multiple high-dimensional FE.

Although there are many ways to construct characteristically-adjusted stock returns, the basic idea is the same. Before analyzing stock returns, you first construct a set of benchmark portfolios based on various firm characteristics, and then “characteristically-adjust” the individual stock returns by subtracting the equal- or value-weighted average return of their corresponding benchmark portfolio for each period. For example, construct 25 size and value portfolios each period by first dividing stocks into quintiles based on their size and then further subdividing them into quintiles based on their market-to-book ratios. A firm’s size and market-to-book ratio in a give