温馨提示: 定期 清理浏览器缓存,可以获得最佳浏览体验。
New!
lianxh
命令发布了:
随时搜索推文、Stata 资源。安装命令如下:
. ssc install lianxh
详情参见帮助文件 (有惊喜):
. help lianxh
⛳ Stata 系列推文:
作者:王昆仑 (天津大学)
E-mail: shawn0513@163.com
目录
本文来自 连享会专题 - 倍分法 DID
双重差分模型 (Difference-Differences, DID)是政策评估的非实验方法中最为常用的一种方法。本系列推文的目的是,利用模拟的方式直观的展示 Standard DID,Time-varying DID 以及其于Event Study Approach (ESA) 的结合应用。
关于 DID 方法的推文已经不少了,仅连享会就推送过多篇专题文章,还有利用 margins 命令来计算 DID 模型的边际效果的推文等等,详见 连享会专题 - 倍分法 DID。
另外,李春涛老师的爬虫俱乐部公众号也用两期的篇幅来对 DID 模型和其平行趋势的检验做了简要的介绍。对于 DID 方法可能还不熟悉的同学,可以结合 「陈强老师的推文」和参考资料里提到的内容一起学习。
与上述的文章相比,连享会此次推出的 倍分法系列推文 有以下三个特色:
我们生成了一份模拟数据,结构为 60个体*10年,共 600 个观测值的平衡面板数据。我们在数据中设定 2005 年为政策发生当年,id 取值在 30-60 范围内的个体为处理组,剩余的个体为控制组。我们将基础的文件保存为 DID_Basic_Simu.dta (Stata 格式的数据文件),方便随后调用。
///设定60个观测值,设定随机数种子
clear all
set obs 60
set seed 10101
gen id =_n
/// 每一个数值的数量扩大11倍,再减去前六十个观测值,即60*11-60 = 600,为60个体10年的面板数据
expand 11
drop in 1/60
count
///以id分组生成时间标识
bys id: gen time = _n+1999
xtset id time
///生成协变量x1, x2
gen x1 = rnormal(1,7)
gen x2 = rnormal(2,5)
///生成个体固定效应和时间固定效应
sort time id
by time: gen ind = _n
sort id time
by id: gen T = _n
//生成treat和post变量,以2005年为接受政策干预的时点,id为30-60的个体为处理组,其余为控制组
gen D = 0
replace D = 1 if id > 29
gen post = 0
replace post = 1 if time >= 2005
///将基础数据结构保存成dta文件,命名为DID_Basic_Simu.dta, 默认保存在当前的 working directory 路径下
save "DID_Basic_Simu.dta",replace
///调用本文第二部分生成的基础数据结构
clear
use "DID_Basic_Simu.dta"
///生成两种潜在结果,并且合成最终的结果变量,令政策的真实效果为10,且不随时间发生变化,直到最后一期
bysort id: gen y0 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal()
bysort id: gen y1 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal() if time < 2005
bysort id: replace y1 = 10 + 5 * x1 + 3 * x2 + 10 + T + ind + rnormal() if time >= 2005
gen y = y0 + D * (y1 - y0)
///去除协变量和个体效应对y的影响,画出剩余残差的图像
xtreg y x1 x2 , fe r
predict e,ue
binscatter e time, line(connect) by(D)
///输出生成的图片,令格式为800*600
graph export "article1_1.png",as(png) replace width(800) height(600)
利用双向固定效应 (two-way fixed effects) 的方法估计交互项的系数,其中用到 reg
, xtreg
, areg
, reghdfe
等多个 Stata 命令,四个命令的比较放在了下方的表格中。在本文中,主要展示 reghdfe
命令的输出结果,该命令的具体介绍可以参考 「Stata: reghdfe-多维固定效应」。
reg | xtreg | areg | reghdfe | |
---|---|---|---|---|
个体固定效应 | i.id | xtreg,fe | areg,absorb(id) | absorb(id time) |
时间固定效应 | i.time | i.time | i.time | absorb(id time) |
估计方法 | OLS | 组内去平均后OLS | OLS | OLS |
优点 | 命令熟悉,逻辑清晰 | 固定效应模型的官方命令 | 官方命令,可以提高组别不随样本规模增加的估计效率 | 高维固定效应模型,可以极大提到估计效率,且选项多样,如支持多维聚类 |
缺点 | 运行速度慢,结果呈现过多不太需要的固定效应的结果 | 需要手动添加时间固定效应 | 需要手动添加时间固定效应 | 无 |
reghdfe y c.D#c.post x1 x2, absorb(id time) vce(robust)
+--------------------------------------------------------------------------------+
(converged in 3 iterations)
HDFE Linear regression Number of obs = 600
Absorbing 2 HDFE groups F( 3, 528) = 262423.36
Statistics robust to heteroskedasticity Prob > F = 0.0000
R-squared = 0.9995
Adj R-squared = 0.9995
Within R-sq. = 0.9993
Root MSE = 1.0154
+-------------------------------------------------------------------------------+
| Robust
y | Coef. Std. Err. t P>t [95% Conf. Interval]
+-------------------------------------------------------------------------------+
c.D#c.post | 9.868932 .1654078 59.66 0.000 9.543994 10.19387
x1 | 4.998123 .0060976 819.69 0.000 4.986144 5.010101
x2 | 3.003366 .0084909 353.71 0.000 2.986685 3.020046
+-------------------------------------------------------------------------------+
Absorbed degrees of freedom:
+--------------------------------------------------------+
Absorbed FE | Num. Coefs. = Categories Redundant
+--------------------------------------------------------+
id | 60 60 0
time | 9 10 1
+--------------------------------------------------------+
将回归结果存储,并且放在同一张表格中输出。
///四组回归结果输出如下
reg y c.D#c.post x1 x2 i.time i.id, r
eststo reg
xtreg y c.D#c.post x1 x2 i.time, r fe
eststo xtreg_fe
areg y c.D#c.post x1 x2 i.time, absorb(id) robust
eststo areg
reghdfe y c.D#c.post x1 x2, absorb(id time) vce(robust)
eststo reghdfe
estout *, title("The Comparison of Actual Parameter Values") ///
cells(b(star fmt(%9.3f)) se(par)) ///
stats(N N_g, fmt(%9.0f %9.0g) label(N Groups)) ///
legend collabels(none) varlabels(_cons Constant) keep(x1 x2 c.D#c.post)
+--------------------------------------------------------------------------------+
| The Comparison of Actual Parameter Values
+--------------------------------------------------------------------------------+
| reg xtreg_fe areg reghdfe
+--------------------------------------------------------------------------------+
c.D#c.post | 9.869*** 9.869*** 9.869*** 9.869***
| (0.166) (0.189) (0.166) (0.166)
x1 | 4.998*** 4.998*** 4.998*** 4.998***
| (0.006) (0.006) (0.006) (0.006)
x2 | 3.003*** 3.003*** 3.003*** 3.003***
| (0.008) (0.008) (0.008) (0.008)
+--------------------------------------------------------------------------------+
N | 600 600 600 600
Groups | 60
+--------------------------------------------------------------------------------+
* p<0.05, ** p<0.01, *** p<0.001
+--------------------------------------------------------------------------------+
四组回归结果中交互项和协变量的估计系数和标准误几乎完全相同(唯一的区别在于 xtreg_fe 估计交互项系数的标准差),并且都十分接近于模拟生成时的设定值,说明方程设定和使用的估计方法是合适的。从上面四个回归结果可以知道,交互项的估计系数均为9.86,其95%置信区间明显包含政策的真实效果10。因此,上述四个命令在估计双向固定效应上作用是等价的。
(补充:在本文的模拟中主要使用的是固定效应模型,那么如果使用随机效应模型来估计,方程的形式和结果是否会有变化?这一问题留给感兴趣的读者去思考。PS:仔细观察一下本文的数据生成过程(Data Generating Process, DGP))。
///调用本文第二部分生成的基础数据结构
clear
use "DID_Basic_Simu.dta"
///生成两种潜在结果,并且合成最终的结果变量,令政策的真实效果随时间发生变化,即(5*T-T),由于从2005年开始接受干预,因此,每年的政策效果应为24,28,32,36,40.
bysort id: gen y0 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal()
bysort id: gen y1 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal() if time < 2005
bysort id: replace y1 = 10 + 5 * x1 + 3 * x2 + T * 5 + ind + rnormal() if time >= 2005
gen y = y0 + D * (y1 - y0)
///去除协变量和个体效应对y的影响,画出剩余残差的图像
xtreg y x1 x2 , fe r
predict e,ue
binscatter e time, line(connect) by(D)
利用双向固定效应(two-way fixed effects)的方法估计交互项的系数,其中用到 reg
, xtreg
, areg
, reghdfe
等多个 Stata
命令。在本文中,主要展示命令'reghdfe'的输出结果。
. reghdfe y c.D#c.post x1 x2, absorb(id time) vce(robust)
+--------------------------------------------------------------------------------+
(converged in 3 iterations)
HDFE Linear regression Number of obs = 600
Absorbing 2 HDFE groups F( 3, 528) = 44552.05
Statistics robust to heteroskedasticity Prob > F = 0.0000
R-squared = 0.9978
Adj R-squared = 0.9975
Within R-sq. = 0.9964
Root MSE = 2.4029
+--------------------------------------------------------------------------------+
| Robust
y | Coef. Std. Err. t P>t [95% Conf. Interval]
+--------------------------------------------------------------------------------+
c.D#c.post | 31.84782 .3931208 81.01 0.000 31.07555 32.6201
x1 | 5.014706 .0154247 325.11 0.000 4.984405 5.045008
x2 | 2.973101 .0202111 147.10 0.000 2.933397 3.012805
+--------------------------------------------------------------------------------+
Absorbed degrees of freedom:
+-------------------------------------------------------------+
Absorbed FE| Num. Coefs. = Categories Redundant
+-------------------------------------------------------------+
id | 60 60 0
time | 9 10 1
+-------------------------------------------------------------+
将回归结果存储,并且放在同一张表格中输出。
///四组回归结果输出如下
reg y c.D#c.post x1 x2 i.time i.id, r
eststo reg
xtreg y c.D#c.post x1 x2 i.time, r fe
eststo xtreg_fe
areg y c.D#c.post x1 x2 i.time, absorb(id) robust
eststo areg
reghdfe y c.D#c.post x1 x2, absorb(id time) vce(robust)
eststo reghdfe
estout *, title("The Comparison of Actual Parameter Values") ///
cells(b(star fmt(%9.3f)) se(par)) ///
stats(N N_g, fmt(%9.0f %9.0g) label(N Groups)) ///
legend collabels(none) varlabels(_cons Constant) keep(x1 x2 c.D#c.post)
+--------------------------------------------------------------------------------+
| The Comparison of Actual Parameter Values
+--------------------------------------------------------------------------------+
| reg xtreg_fe areg reghdfe
+--------------------------------------------------------------------------------+
c.D#c.post | 31.848*** 31.848*** 31.848*** 31.848***
| (0.394) (0.194) (0.394) (0.394)
x1 | 5.015*** 5.015*** 5.015*** 5.015***
| (0.015) (0.016) (0.015) (0.015)
x2 | 2.973*** 2.973*** 2.973*** 2.973***
| (0.020) (0.019) (0.020) (0.020)
+--------------------------------------------------------------------------------+
N | 600 600 600 600
Groups | 60
+--------------------------------------------------------------------------------+
* p<0.05, ** p<0.01, *** p<0.001
+--------------------------------------------------------------------------------+
四组回归结果中交互项和协变量的估计系数和标准误几乎完全相同(唯一的区别在于 xtreg_fe 估计项系数的标准差),并且都十分接近于模拟生成时的设定值,说明方程设定和使用的估计方法是合适的。从上面四个回归结果可以知道,交互项的估计系数均为31.848。再次验证了上述四个命令在估计双向固定效应(two-way fixed effects)上作用是等价的。
(补充:在本文的模拟中主要使用的是固定效应模型,那么如果使用随机效应模型来估计,方程的形式和结果是否会有变化?这一问题留给感兴趣的读者去思考。PS:仔细观察一下本文的数据生成过程(Data Generating Process, DGP))。
本部分介绍政策效果随时间发生变化和不随时间发生变化两种情况下,如何与时间研究法(Event Study Approach)相结合。 DID 应用 ESA 方法有两个目的:一是可以利用回归方法对双重差分法中最重要的平行趋势假设进行检验,与上面的图示法相比,好处在于可以控制协变量的影响,方程形式也更加灵活;二是可以更加清楚地得到政策效果在时间维度上地变化。因此,这部分检验在论文中也往往被称为政策的动态效果(Dynamic Effects) 或者灵活的 DID (Flexible DID Estimates).具体实现结果如下:
Standard DID 的动态效果也有多种语法命令的形式,本文提供两种,主要应用 Stata
的 Factor Variables
语法。
///调用本文第二部分生成的基础数据结构
clear
use "DID_Basic_Simu.dta"
///生成两种潜在结果,并且合成最终的结果变量,令政策的真实效果为10,且不随时间发生变化,直到最后一期
bysort id: gen y0 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal()
bysort id: gen y1 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal() if time < 2005
bysort id: replace y1 = 10 + 5 * x1 + 3 * x2 + 10 + T + ind + rnormal() if time >= 2005
gen y = y0 + D * (y1 - y0)
///预先生成年度虚拟变量
tab time, gen(year)
方法1如下:
. reghdfe y i.D#i.time x1 x2, vce(robust) absorb(id time)
+--------------------------------------------------------------------------------+
(converged in 3 iterations)
HDFE Linear regression Number of obs = 600
Absorbing 2 HDFE groups F( 11, 520) = 71307.67
Statistics robust to heteroskedasticity Prob > F = 0.0000
R-squared = 0.9995
Adj R-squared = 0.9995
Within R-sq. = 0.9993
Root MSE = 1.0147
+--------------------------------------------------------------------------------+
| Robust
y | Coef. Std. Err. t P>t [95% Conf. Interval]
+--------------------------------------------------------------------------------+
D#time |
1 2001 | .4381916 .3796627 1.15 0.249 -.3076697 1.184053
1 2002 | .6093975 .3984095 1.53 0.127 -.1732924 1.392087
1 2003 | .4808495 .3948783 1.22 0.224 -.2949033 1.256602
1 2004 | .1168801 .4088713 0.29 0.775 -.6863626 .9201227
1 2005 | 9.810181 .3870237 25.35 0.000 9.049859 10.5705
1 2006 | 10.48194 .3664986 28.60 0.000 9.761937 11.20194
1 2007 | 9.999201 .3978656 25.13 0.000 9.217579 10.78082
1 2008 | 10.2474 .4087051 25.07 0.000 9.444481 11.05031
1 2009 | 10.45248 .3979999 26.26 0.000 9.670592 11.23436
x1 | 4.996797 .0061877 807.54 0.000 4.984641 5.008953
x2 | 3.004127 .0087679 342.63 0.000 2.986902 3.021352
+--------------------------------------------------------------------------------+
Absorbed degrees of freedom:
+-------------------------------------------------------------+
Absorbed FE| Num. Coefs. = Categories - Redundant
+-------------------------------------------------------------+
id | 60 60 0
time | 9 10 1
+-------------------------------------------------------------+
方法2如下:
. reghdfe y c.D#(c.year2-year10) x1 x2, absorb(id time) vce(robust)
+--------------------------------------------------------------------------------+
(converged in 3 iterations)
HDFE Linear regression Number of obs = 600
Absorbing 2 HDFE groups F( 11, 520) = 71307.67
Statistics robust to heteroskedasticity Prob > F = 0.0000
R-squared = 0.9995
Adj R-squared = 0.9995
Within R-sq. = 0.9993
Root MSE = 1.0147
+--------------------------------------------------------------------------------+
| Robust
y | Coef. Std. Err. t P>t [95% Conf. Interval]
+--------------------------------------------------------------------------------+
c.D#c.year2 | .4381916 .3796627 1.15 0.249 -.3076697 1.184053
c.D#c.year3 | .6093975 .3984095 1.53 0.127 -.1732924 1.392087
c.D#c.year4 | .4808495 .3948783 1.22 0.224 -.2949033 1.256602
c.D#c.year5 | .1168801 .4088713 0.29 0.775 -.6863626 .9201227
c.D#c.year6 | 9.810181 .3870237 25.35 0.000 9.049859 10.5705
c.D#c.year7 | 10.48194 .3664986 28.60 0.000 9.761937 11.20194
c.D#c.year8 | 9.999201 .3978656 25.13 0.000 9.217579 10.78082
c.D#c.year9 | 10.2474 .4087051 25.07 0.000 9.444481 11.05031
c.D#c.year10| 10.45248 .3979999 26.26 0.000 9.670592 11.23436
x1| 4.996797 .0061877 807.54 0.000 4.984641 5.008953
x2| 3.004127 .0087679 342.63 0.000 2.986902 3.021352
+--------------------------------------------------------------------------------+
Absorbed degrees of freedom:
+-------------------------------------------------------------+
Absorbed FE| Num. Coefs. = Categories - Redundant
+-------------------------------------------------------------+
id | 60 60 0
time | 9 10 1
+-------------------------------------------------------------+
方法1和方法2的主要差异在于:方法2通过手动生成时间虚拟变量,构造出交互项,因此可以指定某一个组别作为参照组,在本文中将样本中的第一期作为基准组,另外在论文中常用的还有讲政策实施的前一期和当期作为基准参照组。除此之外,两种方程设定所得到的估计系数完全一致,以第一期为基期,第二期和第五期的系数的T值远小于2,因此,可以认为它们与0没有显著差异,从而验证了平行趋势假设。第六期到第十期的系数分别是9.81,10.48,9.999,10.25,10.45
,其95%CI都包含真实效应10,因此,该方程设定是可靠的。系数的图形化表达如下:
coefplot, ///
keep(c.D#c.year2 c.D#c.year3 c.D#c.year4 c.D#c.year5 c.D#c.year6 c.D#c.year7 c.D#c.year8 c.D#c.year9 c.D#c.year10) ///
coeflabels(c.D#c.year2 = "-4" ///
c.D#c.year3 = "-3" ///
c.D#c.year4 = "-2" ///
c.D#c.year5 = "-1" ///
c.D#c.year6 = "0" ///
c.D#c.year7 = "1" ///
c.D#c.year8 = "2" ///
c.D#c.year9 = "3" ///
c.D#c.year10 = "4") ///
vertical ///
yline(0) ///
ytitle("Coef") ///
xtitle("Time passage relative to year of adoption of implied contract exception") ///
addplot(line @b @at) ///
ciopts(recast(rcap)) ///
scheme(s1mono)
///调用本文第二部分生成的基础数据结构
clear
use "DID_Basic_Simu.dta"
///生成两种潜在结果,并且合成最终的结果变量,令政策的真实效果随时间发生变化,即(5*T-T),由于从2005年开始接受干预,因此,每年的政策效果应为24,28,32,36,40.
bysort id: gen y0 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal()
bysort id: gen y1 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal() if time < 2005
bysort id: replace y1 = 10 + 5 * x1 + 3 * x2 + T * 5 + ind + rnormal() if time >= 2005
gen y = y0 + D * (y1 - y0)
///预先生成年度虚拟变量
tab time, gen(year)
方法1如下:
reghdfe y i.D#i.time x1 x2, vce(robust) absorb(id time)
----------------------------------------------------------------------
(converged in 3 iterations)
HDFE Linear regression Number of obs = 600
Absorbing 2 HDFE groups F( 11, 520) = 75233.93
Statistics robust to heteroskedasticity Prob > F = 0.0000
R-squared = 0.9996
Adj R-squared = 0.9996
Within R-sq. = 0.9994
Root MSE = 1.0147
+---------------------------------------------------------------------
| Robust
y | Coef. Std. Err. t P>t [95% Conf. Interval]
+---------------------------------------------------------------------
D#time |
1 2001 | .4381916 .3796627 1.15 0.249 -.3076697 1.184053
1 2002 | .6093975 .3984095 1.53 0.127 -.1732924 1.392087
1 2003 | .4808495 .3948783 1.22 0.224 -.2949033 1.256602
1 2004 | .1168801 .4088713 0.29 0.775 -.6863626 .9201227
1 2005 | 23.81018 .3870237 61.52 0.000 23.04986 24.5705
1 2006 | 28.48194 .3664986 77.71 0.000 27.76194 29.20194
1 2007 | 31.9992 .3978656 80.43 0.000 31.21758 32.78082
1 2008 | 36.2474 .4087051 88.69 0.000 35.44448 37.05031
1 2009 | 40.45248 .3979999 101.64 0.000 39.67059 41.23436
+---------------------------------------------------------------------
x1 | 4.996797 .0061877 807.54 0.000 4.984641 5.008953
x2 | 3.004127 .0087679 342.63 0.000 2.986902 3.021352
+---------------------------------------------------------------------
Absorbed degrees of freedom:
+-------------------------------------------------------------+
Absorbed FE| Num. Coefs. = Categories Redundant
+-------------------------------------------------------------+
id | 60 60 0
time | 9 10 1
+-------------------------------------------------------------+
reghdfe y c.D#(c.year2-year10) x1 x2 , absorb(id time) vce(robust)
+--------------------------------------------------------------------------------+
(converged in 3 iterations)
HDFE Linear regression Number of obs = 600
Absorbing 2 HDFE groups F( 11, 520) = 75233.93
Statistics robust to heteroskedasticity Prob > F = 0.0000
R-squared = 0.9996
Adj R-squared = 0.9996
Within R-sq. = 0.9994
Root MSE = 1.0147
+----------------------------------------------------------------------------
| Robust
y | Coef. Std. Err. t P>t [95% Conf. Interval]
+----------------------------------------------------------------------------
c.D#c.year2 | .4381916 .3796627 1.15 0.249 -.3076697 1.184053
c.D#c.year3 | .6093975 .3984095 1.53 0.127 -.1732924 1.392087
c.D#c.year4 | .4808495 .3948783 1.22 0.224 -.2949033 1.256602
c.D#c.year5 | .1168801 .4088713 0.29 0.775 -.6863626 .9201227
c.D#c.year6 | 23.81018 .3870237 61.52 0.000 23.04986 24.5705
c.D#c.year7 | 28.48194 .3664986 77.71 0.000 27.76194 29.20194
c.D#c.year8 | 31.9992 .3978656 80.43 0.000 31.21758 32.78082
c.D#c.year9 | 36.2474 .4087051 88.69 0.000 35.44448 37.05031
c.D#c.year10| 40.45248 .3979999 101.64 0.000 39.67059 41.23436
x1| 4.996797 .0061877 807.54 0.000 4.984641 5.008953
x2| 3.004127 .0087679 342.63 0.000 2.986902 3.021352
+----------------------------------------------------------------------------
Absorbed degrees of freedom:
+-------------------------------------------------------------+
Absorbed FE| Num. Coefs. = Categories Redundant
+-------------------------------------------------------------+
id | 60 60 0
time | 9 10 1
+-------------------------------------------------------------+
上面两种方法的估计结果完全一致,前五期的T值远小于2,因此可以认为这些估计系数与0无显著差异,而且从第六期开始,估计系数分别为23.81,28.48,32.00,36.25,40.45
,这与模拟程序设定的真实值十分接近。下一步是将回归方程的结果通过图形更加直观的呈现出来,具体如下:
coefplot, ///
keep(c.D#c.year2 c.D#c.year3 c.D#c.year4 c.D#c.year5 c.D#c.year6 c.D#c.year7 c.D#c.year8 c.D#c.year9 c.D#c.year10) ///
coeflabels(c.D#c.year2 = "-4" ///
c.D#c.year3 = "-3" ///
c.D#c.year4 = "-2" ///
c.D#c.year5 = "-1" ///
c.D#c.year6 = "0" ///
c.D#c.year7 = "1" ///
c.D#c.year8 = "2" ///
c.D#c.year9 = "3" ///
c.D#c.year10 = "4") ///
vertical ///
yline(0) ///
ytitle("Coef") ///
xtitle("Time passage relative to year of adoption of implied contract exception") ///
addplot(line @b @at) ///
ciopts(recast(rcap)) ///
scheme(s1mono)
本文是本系列推文的第一篇,其主要内容是介绍了政策效果不随时间而变和政策效果随时间而变的 DID 模型的模拟,结合 Event Study Approach 对 DID 最重要的平行趋势假设进行了检验,另外利用 coefplot 命令对 DID 方法所估计的政策动态效果进行了图形化展示。
本系列的下一篇推文将介绍 Time-varying DID 的模拟,平行趋势检验在不同情形下的模型设定的差异以及政策动态效果的图形化表达。
///设定60个观测值,设定随机数种子
clear all
set obs 60
set seed 10101
gen id =_n
/// 每一个数值的数量扩大11倍,再减去前六十个观测值,即60*11-60 = 600,为60个体10年的面板数据
expand 11
drop in 1/60
count
///以id分组生成时间标识
bys id: gen time = _n+1999
xtset id time
///生成协变量x1, x2
gen x1 = rnormal(1,7)
gen x2 = rnormal(2,5)
///生成个体固定效应和时间固定效应
sort time id
by time: gen ind = _n
sort id time
by id: gen T = _n
///生成treat和post变量,以2005年为接受政策干预的时点,id为30-60的个体为处理组,其余为控制组
gen D = 0
replace D = 1 if id > 29
gen post = 0
replace post = 1 if time >= 2005
///将基础数据结构保存成dta文件,命名为DID_Basic_Simu.dta,默认保存在当前的 working directory 路径下
save "DID_Basic_Simu.dta",replace
///调用本文第二部分生成的基础数据结构
clear
use "DID_Basic_Simu.dta"
///生成两种潜在结果,并且合成最终的结果变量,令政策的真实效果为10
bysort id: gen y0 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal()
bysort id: gen y1 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal() if time < 2005
bysort id: replace y1 = 10 + 5 * x1 + 3 * x2 + 10 + T + ind + rnormal() if time >= 2005
gen y = y0 + D * (y1 - y0)
///去除协变量和个体效应对y的影响,画出剩余残差的图像
xtreg y x1 x2 , fe r
predict e,ue
binscatter e time, line(connect) by(D)
///输出生成的图片,令格式为800*600
graph export "article1_1.png",as(png) replace width(800) height(600)
///多种回归形式
reg y c.D#c.post x1 x2 i.time i.id, r
eststo reg
xtreg y c.D#c.post x1 x2 i.time, r fe
eststo xtreg_fe
areg y c.D#c.post x1 x2 i.time, absorb(id) robust
eststo areg
reghdfe y c.D#c.post x1 x2, absorb(id time) vce(robust)
eststo reghdfe
estout *, title("The Comparison of Actual Paramerer Values") ///
cells(b(star fmt(%9.3f)) se(par)) ///
stats(N N_g, fmt(%9.0f %9.0g) label(N Groups)) ///
legend collabels(none) varlabels(_cons Constant) keep(x1 x2 c.D#c.post)
///ESA及图示法
///预先生成年度虚拟变量
tab time, gen(year)
reghdfe y i.D#i.time x1 x2, vce(robust) absorb(id time)
reghdfe y c.D#(c.year2-year10) x1 x2, absorb(id time) vce(robust)
coefplot, ///
keep(c.D#c.year2 c.D#c.year3 c.D#c.year4 c.D#c.year5 c.D#c.year6 c.D#c.year7 c.D#c.year8 c.D#c.year9 c.D#c.year10) ///
coeflabels(c.D#c.year2 = "-4" ///
c.D#c.year3 = "-3" ///
c.D#c.year4 = "-2" ///
c.D#c.year5 = "-1" ///
c.D#c.year6 = "0" ///
c.D#c.year7 = "1" ///
c.D#c.year8 = "2" ///
c.D#c.year9 = "3" ///
c.D#c.year10 = "4") ///
vertical ///
yline(0) ///
ytitle("Coef") ///
xtitle("Time passage relative to year of adoption of implied contract exception") ///
addplot(line @b @at) ///
ciopts(recast(rcap)) ///
scheme(s1mono)
///输出生成的图片,令格式为800*600
graph export "article1_3.png",as(png) replace width(800) height(600)
///调用本文第二部分生成的基础数据结构
clear
use "DID_Basic_Simu.dta"
///生成两种潜在结果,并且合成最终的结果变量,令政策的真实效果随时间发生变化,即(5*T-T),由于从2005年开始接受干预,因此,每年的政策效果应为24,28,32,36,40.
bysort id: gen y0 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal()
bysort id: gen y1 = 10 + 5 * x1 + 3 * x2 + T + ind + rnormal() if time < 2005
bysort id: replace y1 = 10 + 5 * x1 + 3 * x2 + T * 5 + ind + rnormal() if time >= 2005
gen y = y0 + D * (y1 - y0)
///去除协变量和个体效应对y的影响,画出剩余残差的图像
xtreg y x1 x2 , fe r
predict e,ue
binscatter e time, line(connect) by(D)
///输出生成的图片,令格式为800*600
graph export "article1_2.png",as(png) replace width(800) height(600)
///多种回归形式
reg y c.D#c.post x1 x2 i.time i.id, r
eststo reg
xtreg y c.D#c.post x1 x2 i.time, r fe
eststo xtreg_fe
areg y c.D#c.post x1 x2 i.time, absorb(id) robust
eststo areg
reghdfe y c.D#c.post x1 x2, absorb(id time) vce(robust)
eststo reghdfe
estout *, title("The Comparison of Actual Paramerer Values") ///
cells(b(star fmt(%9.3f)) se(par)) ///
stats(N N_g, fmt(%9.0f %9.0g) label(N Groups)) ///
legend collabels(none) varlabels(_cons Constant) keep(x1 x2 c.D#c.post)
///ESA及图示法
///预先生成年度虚拟变量
tab time, gen(year)
reghdfe y i.D#i.time x1 x2, vce(robust) absorb(id time)
reghdfe y c.D#(c.year2-year10) x1 x2, absorb(id time) vce(robust)
coefplot, ///
keep(c.D#c.year2 c.D#c.year3 c.D#c.year4 c.D#c.year5 c.D#c.year6 c.D#c.year7 c.D#c.year8 c.D#c.year9 c.D#c.year10) ///
coeflabels(c.D#c.year2 = "-4" ///
c.D#c.year3 = "-3" ///
c.D#c.year4 = "-2" ///
c.D#c.year5 = "-1" ///
c.D#c.year6 = "0" ///
c.D#c.year7 = "1" ///
c.D#c.year8 = "2" ///
c.D#c.year9 = "3" ///
c.D#c.year10 = "4") ///
vertical ///
yline(0) ///
ytitle("Coef") ///
xtitle("Time passage relative to year of adoption of implied contract exception") ///
addplot(line @b @at) ///
ciopts(recast(rcap)) ///
scheme(s1mono)
///输出生成的图片,令格式为800*600
graph export "article1_4.png",as(png) replace width(800) height(600)
连享会-直播课 上线了!
http://lianxh.duanshu.com
免费公开课:
直击面板数据模型 - 连玉君,时长:1小时40分钟 Stata 33 讲 - 连玉君, 每讲 15 分钟. 部分直播课 课程资料下载 (PPT,dofiles等)
支持回看,所有课程可以随时购买观看。
专题 | 嘉宾 | 直播/回看视频 |
---|---|---|
⭐ 最新专题 ⭐ | DSGE, 因果推断, 空间计量等 | |
⭕ Stata数据清洗 | 游万海 | 直播, 2 小时,已上线 |
研究设计 | 连玉君 | 我的特斯拉-实证研究设计,-幻灯片- |
面板模型 | 连玉君 | 动态面板模型,-幻灯片- |
面板模型 | 连玉君 | 直击面板数据模型 [免费公开课,2小时] |
Note: 部分课程的资料,PPT 等可以前往 连享会-直播课 主页查看,下载。
关于我们
课程, 直播, 视频, 客服, 模型设定, 研究设计, stata, plus, 绘图, 编程, 面板, 论文重现, 可视化, RDD, DID, PSM, 合成控制法
等
连享会小程序:扫一扫,看推文,看视频……
扫码加入连享会微信群,提问交流更方便
✏ 连享会学习群-常见问题解答汇总:
✨ https://gitee.com/arlionn/WD