# Stata by Hand: A Compendium of Matching Methods, Part A (Theory)


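## 1. Univariate matching

### 1.3 k-nearest-neighbor matching

k-nearest-neighbor matching requires the matching variable to be a distance. It selects the k observations with the smallest distance as the control group.

### 1.4 Radius (caliper) matching

Radius matching also requires the matching variable to be a distance. A radius is fixed in advance (the lower and upper radii may differ), and every observation that falls inside that range is used as a control. Clearly, the smaller the radius, the stricter the matching requirement.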
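The selection rule can be sketched in a few lines of Stata. This is only a minimal illustration on a made-up dataset with a single treated unit; the variable names `treated`, `x`, `dist`, and `matched`, the data values, and the choice k = 2 are all assumptions introduced here, not part of the original article.

```stata
* Toy data (hypothetical): unit 1 is treated, units 2-6 are potential controls
clear
input id treated x
1 1 5.0
2 0 4.2
3 0 5.3
4 0 7.9
5 0 4.9
6 0 1.0
end

* Number of neighbors to keep (illustrative choice)
local k = 2
* Value of the matching variable for the treated unit
local x0 = x[1]

* Distance from each control to the treated unit (missing for the treated unit)
gen dist = abs(x - `x0') if treated == 0

* Sort by distance; missing values sort last, so controls come first
sort dist
gen byte matched = (_n <= `k') & !missing(dist)
list id x dist if matched
```

With these values the two closest controls are units 5 and 3. Raising `k` adds controls that are progressively farther away.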
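A radius rule with different lower and upper radii can be sketched as follows. The toy data and the names `r_lo`, `r_hi`, and `matched` are hypothetical and chosen only for illustration.

```stata
* Toy data (hypothetical): unit 1 is treated, units 2-6 are potential controls
clear
input id treated x
1 1 5.0
2 0 4.2
3 0 5.3
4 0 7.9
5 0 4.9
6 0 1.0
end

* Value of the matching variable for the treated unit
local x0 = x[1]
* Lower and upper radius (illustrative values; they need not be equal)
local r_lo = 0.5
local r_hi = 1.0

* Keep every control whose value lies in [x0 - r_lo, x0 + r_hi]
gen byte matched = treated == 0 & inrange(x, `x0' - `r_lo', `x0' + `r_hi')
list id x if matched
```

Here the window is [4.5, 6.0], so units 3 and 5 are retained. Unlike k-nearest-neighbor matching, the number of controls is not fixed in advance: a narrow window may leave a treated unit with no match at all.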

## 3. Code demonstration

### 3.1 Data-generating process

``````
clear
set obs 200
gen id = _n
set seed 10000
gen r0 = rnormal()
set seed 12345
gen r1 = rnormal()
set seed 65432
gen r2 = rnormal()
set seed 10101
gen re = rnormal()

*-Selection variables (covariates)
gen x1 = 2 * (r0 + r1)
gen x2 = 3 + (r0 + r2)

*-Selection mechanism
gen group = 2 * x1 - 3 * x2 + 10 * re > 0
label def group 0 "Control" 1 "Treated"
label val group group

*-Data distribution
keep id x1 x2 group
tab group
correlate x1 x2
twoway (scatter x1 x2 if group == 0, mcolor(black))	///
(scatter x1 x2 if group == 1, mcolor(red)), 	///
title("Data distribution") xlabel(-10(5)10) ///
ylabel(-10(5)10, nogrid) ///
aspect(1) legend(off)
``````

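### 3.2 Euclidean-distance matching results

``````
* ec: observation number of the nearest control; ed: distance to it
gen ec = .
gen ed = .
forvalues i = 1(1)200 {
    if group[`i'] == 1 {
        forvalues j = 1(1)200 {
            if group[`j'] == 0 {
                local d = sqrt((x1[`i'] - x1[`j'])^2 + (x2[`i'] - x2[`j'])^2)
                * Stata treats missing as larger than any number, so the
                * first control always initializes ec and ed
                if `d' < ed[`i'] {
                    qui replace ec = `j' in `i'
                    qui replace ed = `d' in `i'
                }
            }
        }
    }
}
tab group
misstable sum ec ed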
``````

### 3.3 Mahalanobis-distance matching results

``````
* Inverse of the sample covariance matrix of (x1, x2)
corr x1 x2, cov
mat cov = r(C)
mat cov = inv(cov)
mat list cov
scalar a_11 = cov[1, 1]
scalar a_22 = cov[2, 2]
scalar a_21 = cov[2, 1]
* mc: observation number of the nearest control; md: distance to it
gen mc = .
gen md = .
forvalues i = 1(1)200 {
    if group[`i'] == 1 {
        forvalues j = 1(1)200 {
            if group[`j'] == 0 {
                * The cross term multiplies both coordinate differences
                local d = sqrt(a_11*(x1[`i'] - x1[`j'])^2 + a_22*(x2[`i'] - x2[`j'])^2 + 2 * a_21 * (x1[`i'] - x1[`j']) * (x2[`i'] - x2[`j']))
                if `d' < md[`i'] {
                    qui replace mc = `j' in `i'
                    qui replace md = `d' in `i'
                }
            }
        }
    }
}
``````
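For reference, the distance that the inner loop is meant to compute can be written out explicitly. The notation below is introduced here for illustration: $S$ is the sample covariance matrix of $(x_1, x_2)$, $A = S^{-1}$ has entries $a_{11}$, $a_{22}$, $a_{21}$ (the scalars defined in the code), and $\Delta_1 = x_{1i} - x_{1j}$, $\Delta_2 = x_{2i} - x_{2j}$ are the coordinate differences between treated unit $i$ and control unit $j$.

```latex
\[
d_M(i, j)
  = \sqrt{(x_i - x_j)^{\top} S^{-1} (x_i - x_j)}
  = \sqrt{a_{11}\,\Delta_1^{2} + a_{22}\,\Delta_2^{2} + 2\,a_{21}\,\Delta_1 \Delta_2}
\]
```

The factor 2 on the cross term comes from the symmetry $a_{12} = a_{21}$. Setting $A$ to the identity matrix recovers the Euclidean distance used in Section 3.2.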

### 3.4 Propensity-score matching results

``````
* pc: observation number of the nearest control; pd: score distance to it
gen pc = .
gen pd = .
probit group x1 x2
* Linear index x'b, then the propensity score Phi(x'b)
predict score, xb
replace score = normal(score)
forvalues i = 1(1)200 {
    if group[`i'] == 1 {
        forvalues j = 1(1)200 {
            if group[`j'] == 0 {
                local d = abs(score[`i'] - score[`j'])
                if `d' < pd[`i'] {
                    qui replace pc = `j' in `i'
                    qui replace pd = `d' in `i'
                }
            }
        }
    }
}
``````
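The two-step construction of the score (`predict ..., xb` followed by `normal()`) should give the same result as asking `predict` for the fitted probability directly. The following is a small self-contained check of that equivalence. It uses Stata's bundled `auto` dataset rather than the simulated data above, and the variable names `xb_hat`, `p_manual`, and `p_direct` are hypothetical.

```stata
* Self-contained check on the auto dataset shipped with Stata
sysuse auto, clear
quietly probit foreign weight mpg

* Route 1: linear index, then the standard normal CDF
predict double xb_hat, xb
gen double p_manual = normal(xb_hat)

* Route 2: fitted probability directly
predict double p_direct, pr

* The two routes should agree up to rounding error
assert reldif(p_manual, p_direct) < 1e-8
```

Either route can be used to build the propensity score. The direct `pr` option simply saves one step.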

## 4. References and resources

• Mahalanobis Distance: Simple Definition, Examples.
• Mahalanobis, P. C. (1936). On the generalized distance in statistics. National Institute of Science of India.
• Percentiles, Percentile Rank & Percentile Range: Definition & Examples.
• Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41-55.

## Appendix: do-file used in this article

``````
*-3.1 Data-generating process

clear
set obs 200
gen id = _n
set seed 10000
gen r0 = rnormal()
set seed 12345
gen r1 = rnormal()
set seed 65432
gen r2 = rnormal()
set seed 10101
gen re = rnormal()

*-Selection variables (covariates)
gen x1 = 2 * (r0 + r1)
gen x2 = 3 + (r0 + r2)

*-Selection mechanism
gen group = 2 * x1 - 3 * x2 + 10 * re > 0
label def group 0 "Control" 1 "Treated"
label val group group

*-Data distribution
keep id x1 x2 group
tab group
correlate x1 x2
twoway (scatter x1 x2 if group == 0, mcolor(black))	///
(scatter x1 x2 if group == 1, mcolor(red)), 	///
title("Data distribution") xlabel(-10(5)10) ///
ylabel(-10(5)10, nogrid) ///
aspect(1) legend(off)

*-3.2 Euclidean-distance matching results

gen ec = .
gen ed = .
forvalues i = 1(1)200 {
if group[`i'] == 1 {
forvalues j = 1(1)200 {
if group[`j'] == 0 {
local d = sqrt((x1[`i'] - x1[`j'])^2 + (x2[`i'] - x2[`j'])^2)
if `d' < ed[`i'] {
qui replace ec = `j' in `i'
qui replace ed = `d' in `i'
}
}
}
}
}
tab group
misstable sum ec ed

*-3.3 Mahalanobis-distance matching results

corr x1 x2, cov
mat cov = r(C)
mat cov = inv(cov)
mat list cov
scalar a_11 = cov[1, 1]
scalar a_22 = cov[2, 2]
scalar a_21 = cov[2, 1]
gen mc = .
gen md = .
forvalues i = 1(1)200 {
if group[`i'] == 1 {
forvalues j = 1(1)200 {
if group[`j'] == 0 {
local d = sqrt(a_11*(x1[`i'] - x1[`j'])^2 + a_22*(x2[`i'] - x2[`j'])^2 + 2 * a_21 * (x1[`i'] - x1[`j']) * (x2[`i'] - x2[`j']))
if `d' < md[`i'] {
qui replace mc = `j' in `i'
qui replace md = `d' in `i'
}
}
}
}
}

*-3.4 Propensity-score matching results

gen pc = .
gen pd = .
probit group x1 x2
predict score, xb
replace score = normal(score)
forvalues i = 1(1)200 {
if group[`i'] == 1 {
forvalues j = 1(1)200 {
if group[`j'] == 0 {
local d = abs(score[`i'] - score[`j'])
if `d' < pd[`i'] {
qui replace pc = `j' in `i'
qui replace pd = `d' in `i'
}
}
}
}
}
``````
