[Stata] Formatting and Describing Oster Test Results

Robustness: testing for omitted variable bias

1 Basic principle

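  • Set the maximum attainable $R^2$ ($R_{max}$) and the relative importance of unobserved variables ($\delta$), use them to estimate how much omitted variables could affect the result, and judge how stable the estimated treatment effect is.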

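  • Following Oster (2019), we assume that $R_{max}$ is 1.3 times the $R^2$ of the baseline regression and that unobserved variables matter as much as the observed ones, i.e. $\delta = 1$. The estimator $\beta^\star = \beta^\star(R_{max}, \delta)$ then gives a consistent estimate of the true coefficient.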
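
To make $\beta^\star(R_{max}, \delta)$ more concrete, here is the simplified approximation usually quoted from Oster (2019). The notation is an assumption added here, not taken from this post: $\dot\beta, \dot R$ are the coefficient and $R^2$ of the uncontrolled regression, and $\tilde\beta, \tilde R$ are those of the controlled regression.

$$
\beta^\star \approx \tilde\beta - \delta \, \bigl[\dot\beta - \tilde\beta\bigr] \, \frac{R_{max} - \tilde R}{\tilde R - \dot R}
$$

With the numbers reported in section 3 ($\dot\beta = -0.0130$, $\dot R = 0.0013$, $\tilde\beta = 0.0073$, $\tilde R = 0.4049$, $R_{max} = 1.3 \times 0.4049 \approx 0.526$, $\delta = 1$), the ratio is about $0.1215 / 0.4036 \approx 0.301$, so the approximation gives $\beta^\star \approx 0.0073 + 0.0204 \times 0.301 \approx 0.0135$. The 0.0192 reported by `psacalc2` differs from this, presumably because the command solves the full estimator rather than the approximation.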

2 Code

// Uncontrolled regression: X only, no fixed effects absorbed
reghdfe Y X, noa vce(cl ind3#year) level(99.5)

// Controlled regression and its 99.5% confidence interval
reghdfe Y X $controls, a(ind3#year) vce(cl ind3#year) level(99.5)

// Bias-adjusted ("true") beta at delta = 1, and the delta at which beta = 0
local r = e(r2) * 1.3
psacalc2 beta X, delta(1) rmax(`r')
psacalc2 delta X, beta(0) rmax(`r')

3 Results layout

| Dependent variable | Y |
| --- | --- |
| Uncontrolled coefficient | -0.0130 |
| Uncontrolled R² | 0.001 |
| Controlled coefficient | 0.0073*** |
| Controlled R² | 0.405 |
| 99.5% confidence interval | [-0.0005, 0.0152] |
| Control variables | YES |
| Industry-Year fixed effects | YES |
| "True" $\beta$ | 0.0192 |
| $\delta$ for $\beta = 0$ | -0.8627 |

The following numbers have to be copied in by hand, in this order:

  • The coefficient on X and the R-squared from the uncontrolled regression
. reghdfe Y X, noa vce(cl ind3#year) level(99.5)
(MWFE estimator converged in 1 iterations)

HDFE Linear regression Number of obs = 16,752
Absorbing 1 HDFE group F( 1, 847) = 2.47
Statistics robust to heteroskedasticity Prob > F = 0.1167
R-squared = 0.0013
Adj R-squared = 0.0012
Within R-sq. = 0.0013
Number of clusters (ind3#year) = 848 Root MSE = 6.2419

(Std. err. adjusted for 848 clusters in ind3#year)
------------------------------------------------------------------------------
| Robust
Y | Coefficient std. err. t P>|t| [99.5% conf. interval]
-------------+----------------------------------------------------------------
X | -.0130469 .0083091 -1.57 0.117 -.0364321 .0103383
_cons | 8.291464 .2427002 34.16 0.000 7.608406 8.974521
------------------------------------------------------------------------------
  • The coefficient on X, the R-squared, and the confidence interval from the controlled regression
. reghdfe Y X $controls ,a( ind3#year ) vce(cl ind3#year) level(99.5)
(MWFE estimator converged in 1 iterations)

HDFE Linear regression Number of obs = 16,752
Absorbing 1 HDFE group F( 13, 847) = 46.08
Statistics robust to heteroskedasticity Prob > F = 0.0000
R-squared = 0.4049
Adj R-squared = 0.3727
Within R-sq. = 0.0814
Number of clusters (ind3#year) = 848 Root MSE = 4.9466

(Std. err. adjusted for 848 clusters in ind3#year)
------------------------------------------------------------------------------
| Robust
Y | Coefficient std. err. t P>|t| [99.5% conf. interval]
-------------+----------------------------------------------------------------
X | .0073397 .0027781 2.64 0.008 -.000479 .0151585
Lev | -.1551993 .2676344 -0.58 0.562 -.9084319 .5980334
firmage | -.0206417 .1378441 -0.15 0.881 -.4085913 .3673078
Size | 1.397631 .061951 22.56 0.000 1.223276 1.571987
sfee | 1.895896 .5163247 3.67 0.000 .4427476 3.349045
Dual | .1007562 .0789137 1.28 0.202 -.1213394 .3228517
Top1 | -.012556 .0028019 -4.48 0.000 -.0204416 -.0046704
Indep | -.0170681 .0075772 -2.25 0.025 -.0383933 .0042572
occupy | -2.832035 2.223062 -1.27 0.203 -9.08864 3.42457
Cash | 1.180776 .4373728 2.70 0.007 -.0501698 2.411722
Soe | -1.301147 .1252125 -10.39 0.000 -1.653546 -.9487476
Gdp | .6944393 .1217796 5.70 0.000 .3517018 1.037177
People | -.635223 .1217548 -5.22 0.000 -.9778907 -.2925554
_cons | -3.086463 .9588209 -3.22 0.001 -5.784976 -.387949
------------------------------------------------------------------------------
  • The estimated true beta
 psacalc2 beta X, delta(1) rmax(`r')

---- Treatment Effect Estimate ----
| Estimate Sq. difference Bias changes
| from controlled beta direction
-------------+----------------------------------------------------------------
Beta | 0.01920 .000141
Alt. sol. 1 | -2.18561 4.81 Yes
Alt. sol. 2 |
-------------+----------------------------------------------------------------

---- Inputs from Regressions ----
| Coeff. R-Squared
-------------+----------------------------------------------------------------
Uncontrolled | -0.01305 0.001
Controlled | 0.00734 0.405
-------------+----------------------------------------------------------------

---- Other Inputs ----
-------------+----------------------------------------------------------------
R_max | 0.526
Delta | 1.000
Unr. Controls|
-------------+----------------------------------------------------------------
  • The estimated delta
 psacalc2 delta X, beta(0) rmax(`r')

---- Bound Estimate ----
-------------+----------------------------------------------------------------
delta | -0.86273
-------------+----------------------------------------------------------------

---- Inputs from Regressions ----
| Coeff. R-Squared
-------------+----------------------------------------------------------------
Uncontrolled | -0.01305 0.001
Controlled | 0.00734 0.405
-------------+----------------------------------------------------------------

---- Other Inputs ----
-------------+----------------------------------------------------------------
R_max | 0.526
Beta | 0.000000
Unr. Controls|
-------------+----------------------------------------------------------------
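
The regression numbers copied by hand above can also be printed straight from stored results, which avoids transcription slips. This is a minimal sketch, not the post's original workflow: it assumes `reghdfe` leaves `e(r2)` and `e(df_r)` behind and that `_b[X]` and `_se[X]` are available after estimation. The local names `b`, `se`, and `t` are illustrative. The true beta and delta are still read from the `psacalc2` output, since no stored results for that command are assumed here.

```stata
* Uncontrolled regression: coefficient on X and R-squared
reghdfe Y X, noabsorb vce(cluster ind3#year) level(99.5)
display "Uncontrolled coefficient: " %9.4f _b[X]
display "Uncontrolled R-squared:   " %9.3f e(r2)

* Controlled regression: coefficient, R-squared, and 99.5% interval
reghdfe Y X $controls, absorb(ind3#year) vce(cluster ind3#year) level(99.5)
local b  = _b[X]
local se = _se[X]
* Two-sided 99.5% critical value from the t distribution
local t  = invttail(e(df_r), (1 - 0.995) / 2)
display "Controlled coefficient:   " %9.4f `b'
display "Controlled R-squared:     " %9.3f e(r2)
display "99.5% CI: [" %9.4f (`b' - `t' * `se') ", " %9.4f (`b' + `t' * `se') "]"
```

The interval is rebuilt from the coefficient, its standard error, and the t critical value, so it should agree with the bounds shown in the regression table.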

4 Interpreting and describing the results

The Oster test can be judged by several criteria, and meeting any one of them is enough:

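  • The true $\beta$ has the same sign as the baseline coefficient and a larger absolute value
    • This suggests that the true effect of X on Y may be larger than the baseline coefficient, and that the sign does not flip
  • $\delta$ is greater than 1 or less than 0
    • Greater than 1: unobserved variables would have to be $\delta$ times as important as the observed ones to produce a zero treatment effect. That is hard to satisfy, which supports the stability of the coefficient
    • Less than 0: the bias-adjusted coefficient should be larger than the one obtained in the baseline regression
  • The true $\beta$ falls inside the confidence interval
  • The interval formed by the true $\beta$ and the controlled coefficient (the coefficient on X in the baseline regression) lies inside the confidence interval
    • This indicates that no omitted variable as important as the paper's explanatory variables is affecting the conclusion
  • The interval formed by the true $\beta$ and the controlled coefficient (the coefficient on X in the baseline regression) does not contain 0
    • This indicates that omitted variables are unlikely to push the true treatment effect to 0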
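
The criteria above can be checked mechanically once the numbers are in hand. The sketch below hard-codes the values from section 3 as scalars and prints 1 (true) or 0 (false) for each criterion. The scalar names with the `os_` prefix are hypothetical and only used for illustration.

```stata
* Values copied from the Stata output in section 3
scalar os_b_ctrl = 0.0073397
scalar os_b_true = 0.01920
scalar os_delta  = -0.86273
scalar os_ci_lo  = -0.000479
scalar os_ci_hi  = 0.0151585

* Criterion 1: same sign as the baseline coefficient, larger magnitude
display "same sign, larger magnitude: " (sign(os_b_true) == sign(os_b_ctrl) & abs(os_b_true) > abs(os_b_ctrl))

* Criterion 2: delta above 1 or below 0
display "delta > 1 or delta < 0:      " (os_delta > 1 | os_delta < 0)

* Criterion 3: true beta inside the confidence interval
display "true beta inside CI:         " inrange(os_b_true, os_ci_lo, os_ci_hi)

* Criterion 4: the whole [controlled, true] interval inside the confidence interval
display "interval inside CI:          " (inrange(os_b_true, os_ci_lo, os_ci_hi) & inrange(os_b_ctrl, os_ci_lo, os_ci_hi))

* Criterion 5: the [controlled, true] interval excludes zero
display "interval excludes zero:      " (os_b_true * os_b_ctrl > 0)
```

For this example, criteria 1, 2, and 5 hold, while criteria 3 and 4 do not, because the true $\beta$ of 0.0192 lies above the upper bound of 0.0152. Since one criterion is enough, the result still passes.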

Author's note: bookmarking this so I don't forget it again next time…

Author: LxxCandy

Published: 2024-05-20

Updated: 2024-06-14
