倍分法DID详解 (一):传统 DID

发布时间:2019-12-09 阅读 516

作者:王昆仑 (天津大学)
E-mail: shawn0513@163.com

Stata连享会   主页 || 视频 || 推文

连享会 - Stata 暑期班

线上直播 9 天:2020.7.28-8.7
主讲嘉宾:连玉君 (中山大学) | 江艇 (中国人民大学)
课程主页https://gitee.com/arlionn/PX | 微信版


目录


导入

本文来自 连享会专题 - 倍分法 DID

双重差分模型 (Difference-Differences, DID)是政策评估的非实验方法中最为常用的一种方法。本系列推文的目的是,利用模拟的方式直观的展示 Standard DID,Time-varying DID 以及其于Event Study Approach (ESA) 的结合应用。

1. 引言

关于 DID 方法的推文已经不少了,仅连享会就推送过多篇专题文章,还有利用 margins 命令来计算 DID 模型的边际效果的推文等等,详见 连享会专题 - 倍分法 DID

另外,李春涛老师的爬虫俱乐部公众号也用两期的篇幅来对 DID 模型和其平行趋势的检验做了简要的介绍。对于 DID 方法可能还不熟悉的同学,可以结合 「陈强老师的推文」和参考资料里提到的内容一起学习。

与上述的文章相比,连享会此次推出的 倍分法系列推文 有以下三个特色:

  • 第一,本系列将上述的 DID 模型的设置,平行趋势的检验和图形表示以及边际效果等一般化的流程尽可能地在一篇推文中展示出来,为读者作为参考;
  • 第二,本系列利用同一套的模拟数据,将 Standard DID 以及 Time-varying DID 结合起来, 分析其方程设定的异同;
  • 第三,在实际的操作过程中,Time-varying DID 利用 ESA 方法对平行趋势进行检验时,由于可能存在不是所有个体都最终接受了政策干预,因此,这两种情况下,ESA 方程的设定和代码写作存在差异,这也是本系列推文所着重关心的问题。

2. 模拟数据的生成

我们生成了一份模拟数据,结构为 60个体*10年,共 600 个观测值的平衡面板数据。我们在数据中设定 2005 年为政策发生当年,id 取值在 30-60 范围内的个体为处理组,剩余的个体为控制组。我们将基础的文件保存为 DID_Basic_Simu.dta (Stata 格式的数据文件),方便随后调用。

  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
///设定60个观测值,设定随机数种子
clear allset obs 60 set seed 10101gen id =_n
/// 每一个数值的数量扩大11倍,再减去前六十个观测值,即60*11-60 = 600,为60个体10年的面板数据
expand 11drop in 1/60count
///以id分组生成时间标识
bys id: gen time = _n+1999xtset id time
///生成协变量x1, x2
gen x1 = rnormal(1,7)gen x2 = rnormal(2,5)
///生成个体固定效应和时间固定效应
sort time idby time: gen ind = _nsort id timeby id: gen T = _n
//生成treat和post变量,以2005年为接受政策干预的时点,id为30-60的个体为处理组,其余为控制组
gen D = 0 replace D = 1 if id > 29gen post = 0replace post = 1 if time >= 2005
///将基础数据结构保存成dta文件,命名为DID_Basic_Simu.dta, 默认保存在当前的 working directory 路径下
save "DID_Basic_Simu.dta",replace

3. 政策效果不随时间而变的 Standard DID 的模拟

  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
///调用本文第二部分生成的基础数据结构clearuse "DID_Basic_Simu.dta"
///生成两种潜在结果,并且合成最终的结果变量,令政策的真实效果为10,且不随时间发生变化,直到最后一期
bysort id: gen y0 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal()bysort id: gen y1 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal() if time < 2005bysort id: replace y1 = 10 + 5 * x1 + 3 * x2 + 10 + T + ind + rnormal() if time >= 2005
gen y = y0 + D * (y1 - y0)
///去除协变量和个体效应对y的影响,画出剩余残差的图像
xtreg y x1 x2 , fe rpredict e,uebinscatter e time, line(connect) by(D)
///输出生成的图片,令格式为800*600graph export "article1_1.png",as(png) replace width(800) height(600)
图1
图1

利用双向固定效应 (two-way fixed effects) 的方法估计交互项的系数,其中用到 reg, xtreg, areg, reghdfe 等多个 Stata 命令,四个命令的比较放在了下方的表格中。在本文中,主要展示 reghdfe 命令的输出结果,该命令的具体介绍可以参考 「Stata: reghdfe-多维固定效应」

reg xtreg areg reghdfe
个体固定效应 i.id xtreg,fe areg,absorb(id) absorb(id time)
时间固定效应 i.time i.time i.time absorb(id time)
估计方法 OLS 组内去平均后OLS OLS OLS
优点 命令熟悉,逻辑清晰 固定效应模型的官方命令 官方命令,可以提高组别不随样本规模增加的估计效率 高维固定效应模型,可以极大提到估计效率,且选项多样,如支持多维聚类
缺点 运行速度慢,结果呈现过多不太需要的固定效应的结果 需要手动添加时间固定效应 需要手动添加时间固定效应
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
reghdfe y c.D#c.post x1 x2, absorb(id time) vce(robust)+--------------------------------------------------------------------------------+(converged in 3 iterations)
HDFE Linear regression Number of obs = 600Absorbing 2 HDFE groups F( 3, 528) = 262423.36Statistics robust to heteroskedasticity Prob > F = 0.0000 R-squared = 0.9995 Adj R-squared = 0.9995 Within R-sq. = 0.9993 Root MSE = 1.0154+-------------------------------------------------------------------------------+ | Robust y | Coef. Std. Err. t P>t [95% Conf. Interval]+-------------------------------------------------------------------------------+ c.D#c.post | 9.868932 .1654078 59.66 0.000 9.543994 10.19387 x1 | 4.998123 .0060976 819.69 0.000 4.986144 5.010101 x2 | 3.003366 .0084909 353.71 0.000 2.986685 3.020046+-------------------------------------------------------------------------------+ Absorbed degrees of freedom:+--------------------------------------------------------+ Absorbed FE | Num. Coefs. = Categories Redundant +--------------------------------------------------------+ id | 60 60 0 time | 9 10 1 +--------------------------------------------------------+

将回归结果存储,并且放在同一张表格中输出。

  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
///四组回归结果输出如下reg y c.D#c.post x1 x2 i.time i.id, reststo regxtreg y c.D#c.post x1 x2 i.time, r feeststo xtreg_feareg y c.D#c.post x1 x2 i.time, absorb(id) robusteststo aregreghdfe y c.D#c.post x1 x2, absorb(id time) vce(robust)eststo reghdfe
estout *, title("The Comparison of Actual Parameter Values") /// cells(b(star fmt(%9.3f)) se(par)) /// stats(N N_g, fmt(%9.0f %9.0g) label(N Groups)) /// legend collabels(none) varlabels(_cons Constant) keep(x1 x2 c.D#c.post)
+--------------------------------------------------------------------------------+ | The Comparison of Actual Parameter Values+--------------------------------------------------------------------------------+ | reg xtreg_fe areg reghdfe +--------------------------------------------------------------------------------+ c.D#c.post | 9.869*** 9.869*** 9.869*** 9.869*** | (0.166) (0.189) (0.166) (0.166) x1 | 4.998*** 4.998*** 4.998*** 4.998*** | (0.006) (0.006) (0.006) (0.006) x2 | 3.003*** 3.003*** 3.003*** 3.003*** | (0.008) (0.008) (0.008) (0.008) +--------------------------------------------------------------------------------+ N | 600 600 600 600 Groups | 60 +--------------------------------------------------------------------------------+ * p<0.05, ** p<0.01, *** p<0.001+--------------------------------------------------------------------------------+

四组回归结果中交互项和协变量的估计系数和标准误几乎完全相同(唯一的区别在于 xtreg_fe 估计交互项系数的标准差),并且都十分接近于模拟生成时的设定值,说明方程设定和使用的估计方法是合适的。从上面四个回归结果可以知道,交互项的估计系数均为9.86,其95%置信区间明显包含政策的真实效果10。因此,上述四个命令在估计双向固定效应上作用是等价的。

(补充:在本文的模拟中主要使用的是固定效应模型,那么如果使用随机效应模型来估计,方程的形式和结果是否会有变化?这一问题留给感兴趣的读者去思考。PS:仔细观察一下本文的数据生成过程(Data Generating Process, DGP))。

连享会 - 效率分析专题

已上线:可随时购买学习+全套课件,课程主页 已经放置板书和 FAQs
主讲嘉宾:连玉君 | 鲁晓东 | 张宁
课程主页微信版https://gitee.com/arlionn/TE

连享会-效率分析专题视频
连享会-效率分析专题视频

4. 政策效果随时间而变的 Standard DID 的模拟

  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
///调用本文第二部分生成的基础数据结构clearuse "DID_Basic_Simu.dta"
///生成两种潜在结果,并且合成最终的结果变量,令政策的真实效果随时间发生变化,即(5*T-T),由于从2005年开始接受干预,因此,每年的政策效果应为24,28,32,36,40.
bysort id: gen y0 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal()bysort id: gen y1 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal() if time < 2005
bysort id: replace y1 = 10 + 5 * x1 + 3 * x2 + T * 5 + ind + rnormal() if time >= 2005
gen y = y0 + D * (y1 - y0)
///去除协变量和个体效应对y的影响,画出剩余残差的图像
xtreg y x1 x2 , fe rpredict e,uebinscatter e time, line(connect) by(D)
图2
图2

利用双向固定效应(two-way fixed effects)的方法估计交互项的系数,其中用到 reg, xtreg, areg, reghdfe 等多个 Stata 命令。在本文中,主要展示命令'reghdfe'的输出结果。

  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
. reghdfe y c.D#c.post x1 x2, absorb(id time)    vce(robust)+--------------------------------------------------------------------------------+(converged in 3 iterations)
HDFE Linear regression Number of obs = 600Absorbing 2 HDFE groups F( 3, 528) = 44552.05Statistics robust to heteroskedasticity Prob > F = 0.0000 R-squared = 0.9978 Adj R-squared = 0.9975 Within R-sq. = 0.9964 Root MSE = 2.4029+--------------------------------------------------------------------------------+ | Robust y | Coef. Std. Err. t P>t [95% Conf. Interval]+--------------------------------------------------------------------------------+ c.D#c.post | 31.84782 .3931208 81.01 0.000 31.07555 32.6201 x1 | 5.014706 .0154247 325.11 0.000 4.984405 5.045008 x2 | 2.973101 .0202111 147.10 0.000 2.933397 3.012805+--------------------------------------------------------------------------------+ Absorbed degrees of freedom:+-------------------------------------------------------------+ Absorbed FE| Num. Coefs. = Categories Redundant +-------------------------------------------------------------+ id | 60 60 0 time | 9 10 1 +-------------------------------------------------------------+

将回归结果存储,并且放在同一张表格中输出。

  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
///四组回归结果输出如下reg y c.D#c.post x1 x2 i.time i.id, reststo regxtreg y c.D#c.post x1 x2 i.time, r feeststo xtreg_feareg y c.D#c.post x1 x2 i.time, absorb(id) robusteststo aregreghdfe y c.D#c.post x1 x2, absorb(id time) vce(robust)eststo reghdfe
estout *, title("The Comparison of Actual Parameter Values") /// cells(b(star fmt(%9.3f)) se(par)) /// stats(N N_g, fmt(%9.0f %9.0g) label(N Groups)) /// legend collabels(none) varlabels(_cons Constant) keep(x1 x2 c.D#c.post)
+--------------------------------------------------------------------------------+ | The Comparison of Actual Parameter Values+--------------------------------------------------------------------------------+ | reg xtreg_fe areg reghdfe +--------------------------------------------------------------------------------+ c.D#c.post | 31.848*** 31.848*** 31.848*** 31.848*** | (0.394) (0.194) (0.394) (0.394) x1 | 5.015*** 5.015*** 5.015*** 5.015*** | (0.015) (0.016) (0.015) (0.015) x2 | 2.973*** 2.973*** 2.973*** 2.973*** | (0.020) (0.019) (0.020) (0.020) +--------------------------------------------------------------------------------+ N | 600 600 600 600 Groups | 60 +--------------------------------------------------------------------------------+ * p<0.05, ** p<0.01, *** p<0.001+--------------------------------------------------------------------------------+

四组回归结果中交互项和协变量的估计系数和标准误几乎完全相同(唯一的区别在于 xtreg_fe 估计项系数的标准差),并且都十分接近于模拟生成时的设定值,说明方程设定和使用的估计方法是合适的。从上面四个回归结果可以知道,交互项的估计系数均为31.848。再次验证了上述四个命令在估计双向固定效应(two-way fixed effects)上作用是等价的。

(补充:在本文的模拟中主要使用的是固定效应模型,那么如果使用随机效应模型来估计,方程的形式和结果是否会有变化?这一问题留给感兴趣的读者去思考。PS:仔细观察一下本文的数据生成过程(Data Generating Process, DGP))。

5. Standard DID 和 Event Study Approach 的结合

本部分介绍政策效果随时间发生变化和不随时间发生变化两种情况下,如何与时间研究法(Event Study Approach)相结合。 DID 应用 ESA 方法有两个目的:一是可以利用回归方法对双重差分法中最重要的平行趋势假设进行检验,与上面的图示法相比,好处在于可以控制协变量的影响,方程形式也更加灵活;二是可以更加清楚地得到政策效果在时间维度上地变化。因此,这部分检验在论文中也往往被称为政策的动态效果(Dynamic Effects) 或者灵活的 DID (Flexible DID Estimates).具体实现结果如下:

5.1 灵活的 DID :政策效果不随时间发生变化

Standard DID 的动态效果也有多种语法命令的形式,本文提供两种,主要应用 StataFactor Variables 语法。

  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
///调用本文第二部分生成的基础数据结构clearuse "DID_Basic_Simu.dta"
///生成两种潜在结果,并且合成最终的结果变量,令政策的真实效果为10,且不随时间发生变化,直到最后一期
bysort id: gen y0 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal()bysort id: gen y1 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal() if time < 2005bysort id: replace y1 = 10 + 5 * x1 + 3 * x2 + 10 + T + ind + rnormal() if time >= 2005
gen y = y0 + D * (y1 - y0)
///预先生成年度虚拟变量tab time, gen(year)

方法1如下:

  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
. reghdfe y i.D#i.time x1 x2, vce(robust) absorb(id time)+--------------------------------------------------------------------------------+(converged in 3 iterations)
HDFE Linear regression Number of obs = 600Absorbing 2 HDFE groups F( 11, 520) = 71307.67Statistics robust to heteroskedasticity Prob > F = 0.0000 R-squared = 0.9995 Adj R-squared = 0.9995 Within R-sq. = 0.9993 Root MSE = 1.0147
+--------------------------------------------------------------------------------+ | Robust y | Coef. Std. Err. t P>t [95% Conf. Interval]+--------------------------------------------------------------------------------+D#time |1 2001 | .4381916 .3796627 1.15 0.249 -.3076697 1.1840531 2002 | .6093975 .3984095 1.53 0.127 -.1732924 1.3920871 2003 | .4808495 .3948783 1.22 0.224 -.2949033 1.2566021 2004 | .1168801 .4088713 0.29 0.775 -.6863626 .92012271 2005 | 9.810181 .3870237 25.35 0.000 9.049859 10.57051 2006 | 10.48194 .3664986 28.60 0.000 9.761937 11.201941 2007 | 9.999201 .3978656 25.13 0.000 9.217579 10.780821 2008 | 10.2474 .4087051 25.07 0.000 9.444481 11.050311 2009 | 10.45248 .3979999 26.26 0.000 9.670592 11.23436 x1 | 4.996797 .0061877 807.54 0.000 4.984641 5.008953 x2 | 3.004127 .0087679 342.63 0.000 2.986902 3.021352+--------------------------------------------------------------------------------+ Absorbed degrees of freedom:+-------------------------------------------------------------+ Absorbed FE| Num. Coefs. = Categories - Redundant +-------------------------------------------------------------+ id | 60 60 0 time | 9 10 1 +-------------------------------------------------------------+

方法2如下:

  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
. reghdfe y c.D#(c.year2-year10) x1 x2, absorb(id time) vce(robust)+--------------------------------------------------------------------------------+(converged in 3 iterations)
HDFE Linear regression Number of obs = 600Absorbing 2 HDFE groups F( 11, 520) = 71307.67Statistics robust to heteroskedasticity Prob > F = 0.0000 R-squared = 0.9995 Adj R-squared = 0.9995 Within R-sq. = 0.9993 Root MSE = 1.0147+--------------------------------------------------------------------------------+ | Robust y | Coef. Std. Err. t P>t [95% Conf. Interval]+--------------------------------------------------------------------------------+c.D#c.year2 | .4381916 .3796627 1.15 0.249 -.3076697 1.184053c.D#c.year3 | .6093975 .3984095 1.53 0.127 -.1732924 1.392087c.D#c.year4 | .4808495 .3948783 1.22 0.224 -.2949033 1.256602c.D#c.year5 | .1168801 .4088713 0.29 0.775 -.6863626 .9201227c.D#c.year6 | 9.810181 .3870237 25.35 0.000 9.049859 10.5705c.D#c.year7 | 10.48194 .3664986 28.60 0.000 9.761937 11.20194c.D#c.year8 | 9.999201 .3978656 25.13 0.000 9.217579 10.78082c.D#c.year9 | 10.2474 .4087051 25.07 0.000 9.444481 11.05031c.D#c.year10| 10.45248 .3979999 26.26 0.000 9.670592 11.23436 x1| 4.996797 .0061877 807.54 0.000 4.984641 5.008953 x2| 3.004127 .0087679 342.63 0.000 2.986902 3.021352+--------------------------------------------------------------------------------+ Absorbed degrees of freedom:+-------------------------------------------------------------+ Absorbed FE| Num. Coefs. = Categories - Redundant +-------------------------------------------------------------+ id | 60 60 0 time | 9 10 1 +-------------------------------------------------------------+

方法1和方法2的主要差异在于:方法2通过手动生成时间虚拟变量,构造出交互项,因此可以指定某一个组别作为参照组,在本文中将样本中的第一期作为基准组,另外在论文中常用的还有讲政策实施的前一期和当期作为基准参照组。除此之外,两种方程设定所得到的估计系数完全一致,以第一期为基期,第二期和第五期的系数的T值远小于2,因此,可以认为它们与0没有显著差异,从而验证了平行趋势假设。第六期到第十期的系数分别是9.81,10.48,9.999,10.25,10.45,其95%CI都包含真实效应10,因此,该方程设定是可靠的。系数的图形化表达如下:

  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
coefplot, ///   keep(c.D#c.year2 c.D#c.year3 c.D#c.year4 c.D#c.year5 c.D#c.year6 c.D#c.year7 c.D#c.year8 c.D#c.year9 c.D#c.year10)  ///   coeflabels(c.D#c.year2 = "-4"   ///   c.D#c.year3 = "-3"           ///   c.D#c.year4 = "-2"           ///   c.D#c.year5 = "-1"           ///   c.D#c.year6  = "0"             ///   c.D#c.year7  = "1"              ///   c.D#c.year8  = "2"              ///   c.D#c.year9  = "3"              ///   c.D#c.year10 = "4")            ///   vertical                             ///   yline(0)                             ///   ytitle("Coef")                 ///   xtitle("Time passage relative to year of adoption of implied contract exception") ///   addplot(line @b @at)                 ///   ciopts(recast(rcap))                 ///   scheme(s1mono)
图3
图3

5.2 灵活的 DID : 政策效果随时间发生变化

  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
///调用本文第二部分生成的基础数据结构clearuse "DID_Basic_Simu.dta"
///生成两种潜在结果,并且合成最终的结果变量,令政策的真实效果随时间发生变化,即(5*T-T),由于从2005年开始接受干预,因此,每年的政策效果应为24,28,32,36,40.
bysort id: gen y0 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal()bysort id: gen y1 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal() if time < 2005
bysort id: replace y1 = 10 + 5 * x1 + 3 * x2 + T * 5 + ind + rnormal() if time >= 2005
gen y = y0 + D * (y1 - y0)
///预先生成年度虚拟变量tab time, gen(year)

方法1如下:

  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
reghdfe y i.D#i.time x1 x2, vce(robust) absorb(id time)----------------------------------------------------------------------(converged in 3 iterations)
HDFE Linear regression Number of obs = 600Absorbing 2 HDFE groups F( 11, 520) = 75233.93Statistics robust to heteroskedasticity Prob > F = 0.0000 R-squared = 0.9996 Adj R-squared = 0.9996 Within R-sq. = 0.9994 Root MSE = 1.0147+--------------------------------------------------------------------- | Robust y | Coef. Std. Err. t P>t [95% Conf. Interval]+--------------------------------------------------------------------- D#time |1 2001 | .4381916 .3796627 1.15 0.249 -.3076697 1.1840531 2002 | .6093975 .3984095 1.53 0.127 -.1732924 1.3920871 2003 | .4808495 .3948783 1.22 0.224 -.2949033 1.2566021 2004 | .1168801 .4088713 0.29 0.775 -.6863626 .92012271 2005 | 23.81018 .3870237 61.52 0.000 23.04986 24.57051 2006 | 28.48194 .3664986 77.71 0.000 27.76194 29.201941 2007 | 31.9992 .3978656 80.43 0.000 31.21758 32.780821 2008 | 36.2474 .4087051 88.69 0.000 35.44448 37.050311 2009 | 40.45248 .3979999 101.64 0.000 39.67059 41.23436+--------------------------------------------------------------------- x1 | 4.996797 .0061877 807.54 0.000 4.984641 5.008953 x2 | 3.004127 .0087679 342.63 0.000 2.986902 3.021352+---------------------------------------------------------------------
Absorbed degrees of freedom:+-------------------------------------------------------------+ Absorbed FE| Num. Coefs. = Categories Redundant +-------------------------------------------------------------+ id | 60 60 0 time | 9 10 1 +-------------------------------------------------------------+
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
reghdfe y c.D#(c.year2-year10) x1 x2 , absorb(id time) vce(robust)+--------------------------------------------------------------------------------+        (converged in 3 iterations)
HDFE Linear regression Number of obs = 600Absorbing 2 HDFE groups F( 11, 520) = 75233.93Statistics robust to heteroskedasticity Prob > F = 0.0000 R-squared = 0.9996 Adj R-squared = 0.9996 Within R-sq. = 0.9994 Root MSE = 1.0147+---------------------------------------------------------------------------- | Robust y | Coef. Std. Err. t P>t [95% Conf. Interval] +---------------------------------------------------------------------------- c.D#c.year2 | .4381916 .3796627 1.15 0.249 -.3076697 1.184053c.D#c.year3 | .6093975 .3984095 1.53 0.127 -.1732924 1.392087c.D#c.year4 | .4808495 .3948783 1.22 0.224 -.2949033 1.256602c.D#c.year5 | .1168801 .4088713 0.29 0.775 -.6863626 .9201227c.D#c.year6 | 23.81018 .3870237 61.52 0.000 23.04986 24.5705c.D#c.year7 | 28.48194 .3664986 77.71 0.000 27.76194 29.20194c.D#c.year8 | 31.9992 .3978656 80.43 0.000 31.21758 32.78082c.D#c.year9 | 36.2474 .4087051 88.69 0.000 35.44448 37.05031c.D#c.year10| 40.45248 .3979999 101.64 0.000 39.67059 41.23436 x1| 4.996797 .0061877 807.54 0.000 4.984641 5.008953 x2| 3.004127 .0087679 342.63 0.000 2.986902 3.021352+----------------------------------------------------------------------------
Absorbed degrees of freedom:+-------------------------------------------------------------+ Absorbed FE| Num. Coefs. = Categories Redundant +-------------------------------------------------------------+ id | 60 60 0 time | 9 10 1 +-------------------------------------------------------------+

上面两种方法的估计结果完全一致,前五期的T值远小于2,因此可以认为这些估计系数与0无显著差异,而且从第六期开始,估计系数分别为23.81,28.48,32.00,36.25,40.45,这与模拟程序设定的真实值十分接近。下一步是将回归方程的结果通过图形更加直观的呈现出来,具体如下:

  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
coefplot, ///   keep(c.D#c.year2 c.D#c.year3 c.D#c.year4 c.D#c.year5 c.D#c.year6 c.D#c.year7 c.D#c.year8 c.D#c.year9 c.D#c.year10)  ///   coeflabels(c.D#c.year2 = "-4"   ///   c.D#c.year3 = "-3"          ///   c.D#c.year4 = "-2"          ///   c.D#c.year5 = "-1"          ///   c.D#c.year6  = "0"          ///   c.D#c.year7  = "1"          ///   c.D#c.year8  = "2"          ///   c.D#c.year9  = "3"          ///   c.D#c.year10 = "4")         ///   vertical                    ///   yline(0)                    ///   ytitle("Coef")              ///   xtitle("Time passage relative to year of adoption of implied contract exception") ///   addplot(line @b @at)        ///   ciopts(recast(rcap))        ///   scheme(s1mono)
图4
图4

6. 总结和拓展

本文是本系列推文的第一篇,其主要内容是介绍了政策效果不随时间而变和政策效果随时间而变的 DID 模型的模拟,结合 Event Study Approach 对 DID 最重要的平行趋势假设进行了检验,另外利用 coefplot 命令对 DID 方法所估计的政策动态效果进行了图形化展示。

本系列的下一篇推文将介绍 Time-varying DID 的模拟,平行趋势检验在不同情形下的模型设定的差异以及政策动态效果的图形化表达。

7. 参考资料

连享会 - 文本分析与爬虫 - 专题视频

主讲嘉宾:司继春 || 游万海

连享会-文本分析与爬虫-专题视频教程
连享会-文本分析与爬虫-专题视频教程

附:全部代码

基础数据结构的生成和保存

  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
///设定60个观测值,设定随机数种子
clear allset obs 60 set seed 10101gen id =_n
/// 每一个数值的数量扩大11倍,再减去前六十个观测值,即60*11-60 = 600,为60个体10年的面板数据
expand 11drop in 1/60count
///以id分组生成时间标识
bys id: gen time = _n+1999xtset id time
///生成协变量x1, x2
gen x1 = rnormal(1,7)gen x2 = rnormal(2,5)
///生成个体固定效应和时间固定效应
sort time idby time: gen ind = _nsort id timeby id: gen T = _n
///生成treat和post变量,以2005年为接受政策干预的时点,id为30-60的个体为处理组,其余为控制组
gen D = 0 replace D = 1 if id > 29gen post = 0replace post = 1 if time >= 2005
///将基础数据结构保存成dta文件,命名为DID_Basic_Simu.dta,默认保存在当前的 working directory 路径下
save "DID_Basic_Simu.dta",replace

政策效果不随时间而变DID模拟的代码

  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
///调用本文第二部分生成的基础数据结构clearuse "DID_Basic_Simu.dta"
///生成两种潜在结果,并且合成最终的结果变量,令政策的真实效果为10
bysort id: gen y0 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal()bysort id: gen y1 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal() if time < 2005
bysort id: replace y1 = 10 + 5 * x1 + 3 * x2 + 10 + T + ind + rnormal() if time >= 2005
gen y = y0 + D * (y1 - y0)
///去除协变量和个体效应对y的影响,画出剩余残差的图像
xtreg y x1 x2 , fe rpredict e,uebinscatter e time, line(connect) by(D)
///输出生成的图片,令格式为800*600graph export "article1_1.png",as(png) replace width(800) height(600)
///多种回归形式reg y c.D#c.post x1 x2 i.time i.id, reststo regxtreg y c.D#c.post x1 x2 i.time, r feeststo xtreg_feareg y c.D#c.post x1 x2 i.time, absorb(id) robusteststo aregreghdfe y c.D#c.post x1 x2, absorb(id time) vce(robust)eststo reghdfe
estout *, title("The Comparison of Actual Paramerer Values") /// cells(b(star fmt(%9.3f)) se(par)) /// stats(N N_g, fmt(%9.0f %9.0g) label(N Groups)) /// legend collabels(none) varlabels(_cons Constant) keep(x1 x2 c.D#c.post)
///ESA及图示法///预先生成年度虚拟变量tab time, gen(year)reghdfe y i.D#i.time x1 x2, vce(robust) absorb(id time)reghdfe y c.D#(c.year2-year10) x1 x2, absorb(id time) vce(robust)coefplot, /// keep(c.D#c.year2 c.D#c.year3 c.D#c.year4 c.D#c.year5 c.D#c.year6 c.D#c.year7 c.D#c.year8 c.D#c.year9 c.D#c.year10) /// coeflabels(c.D#c.year2 = "-4" /// c.D#c.year3 = "-3" /// c.D#c.year4 = "-2" /// c.D#c.year5 = "-1" /// c.D#c.year6 = "0" /// c.D#c.year7 = "1" /// c.D#c.year8 = "2" /// c.D#c.year9 = "3" /// c.D#c.year10 = "4") /// vertical /// yline(0) /// ytitle("Coef") /// xtitle("Time passage relative to year of adoption of implied contract exception") /// addplot(line @b @at) /// ciopts(recast(rcap)) /// scheme(s1mono)
///输出生成的图片,令格式为800*600 graph export "article1_3.png",as(png) replace width(800) height(600)

政策效果随时间而变DID模拟的代码

  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
  • ounter(line
///调用本文第二部分生成的基础数据结构clearuse "DID_Basic_Simu.dta"
///生成两种潜在结果,并且合成最终的结果变量,令政策的真实效果随时间发生变化,即(5*T-T),由于从2005年开始接受干预,因此,每年的政策效果应为24,28,32,36,40.
bysort id: gen y0 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal()bysort id: gen y1 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal() if time < 2005
bysort id: replace y1 = 10 + 5 * x1 + 3 * x2 + T * 5 + ind + rnormal() if time >= 2005
gen y = y0 + D * (y1 - y0)
///去除协变量和个体效应对y的影响,画出剩余残差的图像
xtreg y x1 x2 , fe rpredict e,uebinscatter e time, line(connect) by(D)
///输出生成的图片,令格式为800*600graph export "article1_2.png",as(png) replace width(800) height(600)
///多种回归形式reg y c.D#c.post x1 x2 i.time i.id, reststo regxtreg y c.D#c.post x1 x2 i.time, r feeststo xtreg_feareg y c.D#c.post x1 x2 i.time, absorb(id) robusteststo aregreghdfe y c.D#c.post x1 x2, absorb(id time) vce(robust)eststo reghdfe
estout *, title("The Comparison of Actual Paramerer Values") /// cells(b(star fmt(%9.3f)) se(par)) /// stats(N N_g, fmt(%9.0f %9.0g) label(N Groups)) /// legend collabels(none) varlabels(_cons Constant) keep(x1 x2 c.D#c.post)
///ESA及图示法///预先生成年度虚拟变量tab time, gen(year)reghdfe y i.D#i.time x1 x2, vce(robust) absorb(id time)reghdfe y c.D#(c.year2-year10) x1 x2, absorb(id time) vce(robust)coefplot, /// keep(c.D#c.year2 c.D#c.year3 c.D#c.year4 c.D#c.year5 c.D#c.year6 c.D#c.year7 c.D#c.year8 c.D#c.year9 c.D#c.year10) /// coeflabels(c.D#c.year2 = "-4" /// c.D#c.year3 = "-3" /// c.D#c.year4 = "-2" /// c.D#c.year5 = "-1" /// c.D#c.year6 = "0" /// c.D#c.year7 = "1" /// c.D#c.year8 = "2" /// c.D#c.year9 = "3" /// c.D#c.year10 = "4") /// vertical /// yline(0) /// ytitle("Coef") /// xtitle("Time passage relative to year of adoption of implied contract exception") /// addplot(line @b @at) /// ciopts(recast(rcap)) /// scheme(s1mono)
///输出生成的图片,令格式为800*600 graph export "article1_4.png",as(png) replace width(800) height(600)

相关课程

连享会-直播课 上线了!
http://lianxh.duanshu.com

免费公开课:


课程一览

支持回看,所有课程可以随时购买观看。

专题 嘉宾 直播/回看视频
Stata暑期班 连玉君
江艇
线上直播 9 天
2020.7.28-8.7
效率分析-专题 连玉君
鲁晓东
张 宁
视频-TFP-SFA-DEA
已上线,3天
文本分析/爬虫 游万海
司继春
视频-文本分析与爬虫
已上线,4天
空间计量系列 范巧 空间全局模型, 空间权重矩阵
空间动态面板, 空间DID
研究设计 连玉君 我的特斯拉-实证研究设计-幻灯片-
面板模型 连玉君 动态面板模型-幻灯片-
直击面板数据模型 [免费公开课,2小时]

Note: 部分课程的资料,PPT 等可以前往 连享会-直播课 主页查看,下载。


关于我们

  • Stata连享会 由中山大学连玉君老师团队创办,定期分享实证分析经验。直播间 有很多视频课程,可以随时观看。
  • 连享会-主页知乎专栏,300+ 推文,实证分析不再抓狂。
  • 公众号推文分类: 计量专题 | 分类推文 | 资源工具。推文分成 内生性 | 空间计量 | 时序面板 | 结果输出 | 交乘调节 五类,主流方法介绍一目了然:DID, RDD, IV, GMM, FE, Probit 等。
  • 公众号关键词搜索/回复 功能已经上线。大家可以在公众号左下角点击键盘图标,输入简要关键词,以便快速呈现历史推文,获取工具软件和数据下载。常见关键词:
    • 课程, 直播, 视频, 客服, 模型设定, 研究设计, 暑期班
    • stata, plus,Profile, 手册, SJ, 外部命令, profile, mata, 绘图, 编程, 数据, 可视化
    • DID,RDD, PSM,IV,DID, DDD, 合成控制法,内生性, 事件研究, 交乘, 平方项, 缺失值, 离群值, 缩尾, R2, 乱码, 结果
    • Probit, Logit, tobit, MLE, GMM, DEA, Bootstrap, bs, MC, TFP, 面板, 直击面板数据, 动态面板, VAR, 生存分析, 分位数
    • 空间, 空间计量, 连老师, 直播, 爬虫, 文本, 正则, python
    • Markdown, Markdown幻灯片, marp, 工具, 软件, Sai2, gInk, Annotator, 手写批注, 盈余管理, 特斯拉, 甲壳虫, 论文重现, 易懂教程, 码云, 教程, 知乎

连享会主页  lianxh.cn
连享会主页 lianxh.cn

连享会小程序:扫一扫,看推文,看视频……


扫码加入连享会微信群,提问交流更方便

✏ 连享会学习群-常见问题解答汇总:
https://gitee.com/arlionn/WD