Stata连享会
Author: 葛佳敏 (Xiamen University)
Email: gejiamin616@126.com
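The ietoolkit package was developed by the World Bank's Development Impact Evaluation (DIME) department to simplify data management and analysis. This post is the first in a series and gives an overview of the ietoolkit command suite, focusing on its two core commands, iebaltab and ieddtab. Before using them, install the package with: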
ssc install ietoolkit, replace
iebaltab produces balance tables for multiple groups or treatment arms (the name is short for "balance table"). Its syntax is:
iebaltab balancevarlist [if] [in] [weight], grpvar(varname)
    { save(filename) | savetex(filename) | browse }
    [ column_options label_options stats_options ftest_options
      display_options export_options ]
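Here balancevarlist is one or more variables (called balance variables in this post); iebaltab tests whether they differ across the groups defined by the categorical variable in grpvar(varname). For more detail, see help iebaltab. Internally, iebaltab runs regressions equivalent to the following: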
reg balancevarname if groupvar == groupcode
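Here balancevarname is one of the variables in balancevarlist, groupvar is the variable given in grpvar(varname), and groupcode is the value of groupvar that identifies the group. In the stored results, _b[_cons] is the group mean and _se[_cons] is the standard error of the group mean.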
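The group-mean step can be checked by hand. The sketch below is only an illustration: it assumes Stata's built-in auto.dta, with price standing in for a balance variable and foreign for the group variable (neither comes from iebaltab itself). A regression on a constant alone returns the group mean in _b[_cons] and its standard error in _se[_cons].

```stata
* Illustration only: auto.dta, price and foreign are stand-ins
sysuse auto, clear

* Regression on a constant only, restricted to one group (foreign == 0)
regress price if foreign == 0
scalar grp_mean = _b[_cons]     // group mean
scalar grp_se   = _se[_cons]    // standard error of the group mean

* Cross-check against summarize: mean and sd/sqrt(N)
summarize price if foreign == 0
display "regress:   mean = " scalar(grp_mean) "   se = " scalar(grp_se)
display "summarize: mean = " r(mean) "   se = " r(sd)/sqrt(r(N))
```

The two display lines should agree, since the standard error of a constant-only regression is the sample standard deviation divided by the square root of the group size.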
reg balancevarname testgroupdummy
test testgroupdummy
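Here testgroupdummy is a dummy variable equal to 1 for the treatment group and 0 for the control group. The returned r(p) is compared against the thresholds in starlevels() to decide how many significance stars are added to the table.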
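A small sketch of how r(p) could be turned into stars, again on auto.dta with foreign standing in for testgroupdummy. The 10/5/1 percent thresholds are an assumption chosen for illustration; in practice they are whatever starlevels() specifies.

```stata
* Illustration only: auto.dta, price and foreign are stand-ins
sysuse auto, clear

* foreign plays the role of testgroupdummy (1 = "treatment", 0 = "control")
regress price foreign
test foreign
scalar pval = r(p)

* Map the p-value to stars (assumed thresholds: 0.10, 0.05, 0.01)
local stars ""
if scalar(pval) < 0.10 local stars "*"
if scalar(pval) < 0.05 local stars "**"
if scalar(pval) < 0.01 local stars "***"
display "difference = " _b[foreign] "   p = " scalar(pval) "   `stars'"
```

Note that _b[foreign] is the mean of group 1 minus the mean of group 0, whereas the iebaltab output labels its difference column (1)-(2), so the sign may be flipped relative to the balance table.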
reg testgroupdummy balancevarlist
testparm balancevarlist
Here balancevarlist is the full list of balancevars passed to the command.
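The joint test can be reproduced in the same way. This is a sketch on auto.dta, where foreign stands in for testgroupdummy and price, mpg and weight for the balance variables; all of these are illustrative choices.

```stata
* Illustration only: auto.dta and its variables are stand-ins
sysuse auto, clear

* Regress the group dummy on all balance variables, then test them jointly
regress foreign price mpg weight
testparm price mpg weight
scalar joint_F = r(F)
scalar joint_p = r(p)
display "F = " scalar(joint_F) "   p = " scalar(joint_p)
```

Because every regressor is included in testparm, the statistic coincides with the overall model F reported by regress in e(F).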
xi : reg balancevarname testgroupdummy i.fixed
test testgroupdummy
xi : reg testgroupdummy balancevarlist i.fixed
testparm balancevarlist
Here fixed is the variable supplied in the fixedeffect() option.
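On Stata 11 or later, the xi: prefix is not required, because factor-variable notation (i.varname) is understood by regress directly. The sketch below, which assumes auto.dta with rep78 as a stand-in fixed-effect variable, runs the test both ways and should give the same p-value.

```stata
* Illustration only: auto.dta, with rep78 as a stand-in fixed effect
sysuse auto, clear

* Legacy form with the xi: prefix (creates _Irep78_* dummies)
xi: regress price foreign i.rep78
test foreign
scalar p_xi = r(p)

* Same model with factor-variable notation, no prefix needed
regress price foreign i.rep78
test foreign
scalar p_fv = r(p)

display "p (xi) = " scalar(p_xi) "   p (factor variables) = " scalar(p_fv)
```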
reg balancevarname testgroupdummy covariatesvarlist
test testgroupdummy
reg testgroupdummy balancevarlist covariatesvarlist
testparm balancevarlist
Here covariatesvarlist is the list of control variables supplied in the covariates() option.
reg balancevarname testgroupdummy, vce(vcetype)
test testgroupdummy
reg testgroupdummy balancevarlist, vce(vcetype)
testparm balancevarlist
xi : reg balancevarname testgroupdummy i.fixed covariatesvarlist, vce(vcetype)
test testgroupdummy
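The vce(vcetype) variants above change only the standard errors, not the point estimates. A minimal sketch on auto.dta, assuming vce(robust) as the vcetype and foreign as a stand-in for testgroupdummy:

```stata
* Illustration only: auto.dta, price and foreign are stand-ins
sysuse auto, clear

* Default standard errors
regress price foreign
scalar b_default  = _b[foreign]
scalar se_default = _se[foreign]

* Same regression with heteroskedasticity-robust standard errors
regress price foreign, vce(robust)
scalar b_robust  = _b[foreign]
scalar se_robust = _se[foreign]
test foreign

display "coef: " scalar(b_default) " vs " scalar(b_robust)
display "se:   " scalar(se_default) " vs " scalar(se_robust) "   p (robust) = " r(p)
```

The coefficient is identical in both runs; only the standard error, and therefore the p-value used for the stars, differs.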
iebaltab outcome_variable, grpvar(treatment_variable) browse
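Here treatment_variable is the treatment indicator, equal to 1 for the treatment group and 0 otherwise. The command displays the mean and standard error of outcome_variable for the treatment and control groups separately, together with the difference between them and its significance.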
global project_folder "C:\Users\project\baseline\results"
iebaltab outcome_variable, grpvar(treatment_variable) ///
    save("$project_folder\balancetable.xlsx")
Here the table is saved to a file instead of being displayed in the Browse window as in the first example.
iebaltab outcome1 outcome2 outcome3, grpvar(treatment_variable) ///
    save("$project_folder\balancetable.xlsx") rowvarlabels ///
    rowlabels("outcome1 Outcome variable 1 @ outcome2 Second outcome variable")
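Here rowlabels() sets the row titles to Outcome variable 1 and Second outcome variable instead of outcome1 and outcome2. Because outcome3 is not listed in rowlabels(), it gets the default row title; since rowvarlabels is specified, that default is the variable label of outcome3 rather than its name.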
. sysuse census.dta, clear
* Convert counts to rates per 100 population
. replace death = 100 * death / pop
. replace marriage = 100 * marriage / pop
. replace divorce = 100 * divorce / pop
. gen treatment = (runiform()<.5) // randomly generate a treatment dummy
. iebaltab marriage divorce death, grpvar(treatment) browse // view in the Browse window
. iebaltab marriage divorce death, grpvar(treatment) save(table01) // export to an Excel file
The results of the commands above appear in Stata's Data Editor, as shown below. With save(filename) or savetex(filename), the results are saved to an Excel or TeX file instead.
                   (1)                 (2)                t-test
                    0                   1             Difference
Variable         N     Mean/SE       N     Mean/SE     (1)-(2)
marriage        29       1.102      21       1.651      -0.549
                       [0.041]             [0.633]
divorce         29       0.553      21       0.585      -0.032
                       [0.027]             [0.067]
death           29       0.830      21       0.862      -0.031
                       [0.029]             [0.019]
The value displayed for t-tests are the differences
in the means across the groups.
***, **, and * indicate significance
at the 1, 5, and 10 percent critical level.
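Because the treatment dummy is drawn with runiform(), the group sizes and every number in the table change from run to run. A small sketch of making the assignment reproducible; the seed value is arbitrary and chosen only for illustration:

```stata
sysuse census.dta, clear

* Fix the random-number seed before drawing (arbitrary value, an assumption)
set seed 20240101
generate treatment = (runiform() < .5)

* Inspect the resulting group sizes
tabulate treatment
```

Rerunning these lines with the same seed yields the same treatment assignment, so a balance table built on it can be replicated exactly.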
. iebaltab marriage divorce death, grpvar(treatment) save("balancetable.xlsx")
The command above saves the results to a spreadsheet named balancetable; the resulting table is shown in the figure below.
ieddtab runs difference-in-differences regressions and reports the basic results in a single table. Its syntax is:
ieddtab varlist [if] [in] [weight], time(varname) treatment(varname)
[ covariates(varlist) starlevels(numlist) stardrop
errortype(string) rowlabtype(string) rowlabtext(label_string)
format(%fmt) replace savetex(filepath) onerow nonumbers nonotes
addnotes(string) texdocument texcaption(string) texlabel(string)
texnotewidth(numlist) texvspace(string) ]
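Here varlist contains the dependent variables of the difference-in-differences model. For more detail, see help ieddtab. The commands below show what ieddtab does internally, where time is the time dummy and treat is the treatment-group dummy.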
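generate interact = time * treat                 // create the interaction term
regress varA time treat interact                 // DiD regression; the ieddtab table shows the coefficient on interact
generate sample = e(sample)                      // flag the observations used in the regression
regress varA time if treat == 0 & sample == 1    // first difference, control group
regress varA time if treat == 1 & sample == 1    // first difference, treatment group
mean varA if time == 0 & treat == 0 & sample == 1    // baseline mean, control group
mean varA if time == 0 & treat == 1 & sample == 1    // baseline mean, treatment group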
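The steps above can be verified on simulated data. The sketch below is self-contained: the sample size, seed, and data-generating process are assumptions made up for illustration. It checks that the coefficient on the interaction term equals the treatment group's first difference minus the control group's first difference.

```stata
* Simulated data: all numbers here are illustrative assumptions
clear
set seed 12345
set obs 200
generate time  = runiform() < 0.5
generate treat = runiform() < 0.5
generate varA  = 1 + 0.5*time + 0.3*treat + 2*time*treat + rnormal()

* Difference-in-differences regression
generate interact = time * treat
regress varA time treat interact
scalar did = _b[interact]

* First differences within each group
regress varA time if treat == 0
scalar diff_control = _b[time]
regress varA time if treat == 1
scalar diff_treat = _b[time]

display "DiD coefficient:              " scalar(did)
display "treated diff - control diff:  " scalar(diff_treat) - scalar(diff_control)
```

The two displayed numbers match because the model with both dummies and their interaction is saturated, which is why ieddtab can report the two first differences and the second difference side by side.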
sysuse census.dta, clear
* Convert counts to rates per 100 population
replace death = 100 * death / pop
replace marriage = 100 * marriage / pop
replace divorce = 100 * divorce / pop
* Randomly generate the time and treatment dummies
gen time = (runiform()<.5)
gen treatment = (runiform()<.5)
. ieddtab death marriage divorce , t(time) treatment(treatment)
(0 observations deleted)
+------------------------------------------------------------------------------+
|              |        Control        |       Treatment       | Difference-in |
|              | Baseline | Difference | Baseline | Difference |  -difference  |
|              |   Mean   |   Coef.    |   Mean   |   Coef.    |     Coef.     |
|              |   (SE)   |    (SE)    |   (SE)   |    (SE)    |     (SE)      |
|   Variable   |    N     |     N      |    N     |     N      |       N       |
|--------------+----------+------------+----------+------------+---------------|
| death        |   0.81   |    0.04    |   0.86   |   -0.00    |     -0.05     |
|              |  (0.04)  |   (0.06)   |  (0.03)  |   (0.04)   |    (0.08)     |
|              |    13    |     29     |    7     |     21     |      50       |
| marriage     |   1.16   |   -0.10    |   1.04   |    0.91    |     1.01      |
|              |  (0.07)  |   (0.08)   |  (0.07)  |   (1.36)   |    (1.13)     |
|              |    13    |     29     |    7     |     21     |      50       |
| divorce      |   0.54   |    0.03    |   0.53   |    0.09    |     0.06      |
|              |  (0.04)  |   (0.06)   |  (0.05)  |   (0.14)   |    (0.14)     |
|              |    13    |     29     |    7     |     21     |      50       |
+------------------------------------------------------------------------------+
The baseline means only include observations not omitted in the 1st
and 2nd differences. The number of observations in the 1st and 2nd
differences includes both baseline and follow-up observations. ***,
**, and * indicate significance at the .01, .05, and .1 percent critical level.
. ieddtab death marriage divorce , t(time) treatment(treatment) ///
> rowlabtext("death Death Rate @@ divorce Divorce Rate") ///
> rowlabtype("varlab")
(0 observations deleted)
+------------------------------------------------------------------------------+
|              |        Control        |       Treatment       | Difference-in |
|              | Baseline | Difference | Baseline | Difference |  -difference  |
|              |   Mean   |   Coef.    |   Mean   |   Coef.    |     Coef.     |
|              |   (SE)   |    (SE)    |   (SE)   |    (SE)    |     (SE)      |
|   Variable   |    N     |     N      |    N     |     N      |       N       |
|--------------+----------+------------+----------+------------+---------------|
| Death Rate   |   0.81   |    0.04    |   0.86   |   -0.00    |     -0.05     |
|              |  (0.04)  |   (0.06)   |  (0.03)  |   (0.04)   |    (0.08)     |
|              |    13    |     29     |    7     |     21     |      50       |
| Number of    |   1.16   |   -0.10    |   1.04   |    0.91    |     1.01      |
| marriages    |  (0.07)  |   (0.08)   |  (0.07)  |   (1.36)   |    (1.13)     |
|              |    13    |     29     |    7     |     21     |      50       |
| Divorce Rate |   0.54   |    0.03    |   0.53   |    0.09    |     0.06      |
|              |  (0.04)  |   (0.06)   |  (0.05)  |   (0.14)   |    (0.14)     |
|              |    13    |     29     |    7     |     21     |      50       |
+------------------------------------------------------------------------------+
The baseline means only include observations not omitted in the 1st and 2nd
differences. The number of observations in the 1st and 2nd differences
includes both baseline and follow-up observations. ***, **, and * indicate
significance at the .01, .05, and .1 percent critical level.
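The table produced in Example 2 contains the same statistics as Example 1, but the row titles for death and divorce are entered manually, and the row title for marriage is set to its variable label.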
. ieddtab death marriage divorce , t(time) treatment(treatment) ///
> rowlabtype("varlab") savetex("DID table.tex") replace
(0 observations deleted)
+------------------------------------------------------------------------------+
|              |        Control        |       Treatment       | Difference-in |
|              | Baseline | Difference | Baseline | Difference |  -difference  |
|              |   Mean   |   Coef.    |   Mean   |   Coef.    |     Coef.     |
|              |   (SE)   |    (SE)    |   (SE)   |    (SE)    |     (SE)      |
|   Variable   |    N     |     N      |    N     |     N      |       N       |
|--------------+----------+------------+----------+------------+---------------|
| Number of    |   0.81   |    0.04    |   0.86   |   -0.00    |     -0.05     |
| deaths       |  (0.04)  |   (0.06)   |  (0.03)  |   (0.04)   |    (0.08)     |
|              |    13    |     29     |    7     |     21     |      50       |
| Number of    |   1.16   |   -0.10    |   1.04   |    0.91    |     1.01      |
| marriages    |  (0.07)  |   (0.08)   |  (0.07)  |   (1.36)   |    (1.13)     |
|              |    13    |     29     |    7     |     21     |      50       |
| Number of    |   0.54   |    0.03    |   0.53   |    0.09    |     0.06      |
| divorces     |  (0.04)  |   (0.06)   |  (0.05)  |   (0.14)   |    (0.14)     |
|              |    13    |     29     |    7     |     21     |      50       |
+------------------------------------------------------------------------------+
The baseline means only include observations not omitted in the 1st and
2nd differences. The number of observations in the 1st and 2nd differences
includes both baseline and follow-up observations. ***, **, and * indicate
significance at the .01, .05, and .1 percent critical level.
Balance table saved to: DID table.tex
Example 3 contains the same statistics as Examples 1 and 2, with all row titles set to variable labels; the table is saved in the current directory as "DID table.tex".