Covariates and statistical significance - himaginary’s diary

Marc Bellemare writes the following in Metrics Monday.

Those of us who do applied work for a living will have at some point noticed that, depending on which variables we include in X on the right-hand side (RHS) of an equation like
(1) y = a + bX + cD + e,
the coefficient c on the treatment variable D might go from significant to insignificant or vice versa.
That this is true is the very reason why it is common practice in applied work to present several specifications of equation (1) in the same table, ranging from the most parsimonious (i.e., a regression of y on D alone) to slightly less parsimonious (i.e., a regression of y on D and ever increasing subsets of X) to the least parsimonious (i.e., a regression of y on D and all the controls in X).
...
The issue of what goes on the RHS of equation (1) is getting a lot of attention in the applied literature. Two prominent examples are Emily Oster’s forthcoming JBES article “Unobserved*1 Selection and Coefficient Stability: Theory and Evidence” and Pei, Pischke, and Schwandt’s (2017) NBER working paper titled “Poorly Measured Confounders are More Useful on the Left than on the Right.”
Oster provides a method to assess just how much coefficient (as in coefficient c in equation 1) stability tells us about selection on unobservables. Pei et al. develop a test of identifying assumptions that treats putative additional controls as dependent variables in equation (1).
I expect both methods to become part of the applied econometrician’s toolkit over the next five to 10 years. At the very least, I expect a bare-bone regression of y on D alone to become something that has to be included in a paper, along with a discussion of why the controls that were included on the RHS of equation (1) were retained for analysis.
(My Japanese translation)
応用研究を仕事にしている人は、ある時点で、
　(1) y = a + bX + cD + e
のような方程式の右辺のXにどの変数を入れるか次第で、処置変数Dの係数cが有意から非有意になったり、もしくはその逆になったりすることに気付く。
応用研究において、同じ表で方程式(1)の複数の定式化――最も節約的なもの（＝yのDだけへの回帰）から、次に節約的なもの（＝yのDとXの部分集合への回帰、ただし部分集合は徐々に増やしていく）、最も節約的でないもの（＝yのDとXの全コントロール変数への回帰）に至るまで――を提示するのが一般的な慣行になっているのは、まさにそれが理由である。
・・・
方程式(1)の右辺に何を入れるかという問題は、応用研究分野で多くの注目を集めつつある。2つの代表的な例は、Emily OsterのJBES（Journal of Business & Economic Statistics）掲載予定論文「観測されない選択と係数の安定性：理論と実証」と、Pei＝Pischke＝Schwandtの「測定精度の低い交絡変数は右辺よりも左辺で有用」と題された2017年のNBERワーキングペーパーである。
Osterは、（方程式(1)の係数cのような）係数の安定性が、観測されない変数に基づく選択についてどれほどのことを語ってくれるかを評価する手法を提供している。Peiらは、追加のコントロール変数の候補を方程式(1)の従属変数として扱う、識別の仮定の検定手法を開発した。
今後5年ないし10年のうちに、両手法は応用計量経済学者の道具箱の一部になると私は予想している。少なくとも、yをDだけに回帰する最小限の回帰が、方程式(1)の右辺に含めたコントロール変数を分析に残した理由についての議論とともに、論文に必須のものになると予想する。
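The practice Bellemare describes, presenting equation (1) from the most parsimonious specification to the least, can be sketched roughly as follows. This is only an illustration on simulated data: the helper `ols`, the controls `x1` and `x2`, and all parameter values are assumptions made for this sketch and do not come from the quoted post.

```python
import numpy as np

def ols(y, regressors):
    """OLS with an intercept; returns (coefficients, classical standard errors)."""
    X = np.column_stack([np.ones(len(y))] + list(regressors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se

# Simulated data (hypothetical): x1 drives both treatment and outcome,
# x2 only affects the outcome. The true coefficient c on D is 0.5.
rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
d = (0.8 * x1 + rng.normal(size=n) > 0).astype(float)
y = 1.0 + 0.5 * d + 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)

# Specifications ordered from most to least parsimonious.
specs = {"D only": [], "D + x1": [x1], "D + x1 + x2": [x1, x2]}
results = {}
for name, controls in specs.items():
    beta, se = ols(y, [d] + controls)
    results[name] = (beta[1], se[1])  # index 1 is the coefficient on D
    print(f"{name:<12} c = {beta[1]:.3f}  se = {se[1]:.3f}  t = {beta[1] / se[1]:.2f}")
```

In this setup the estimate of c moves when `x1` enters (a confounder), while adding `x2` mainly shrinks the standard error. A side-by-side table makes both kinds of movement visible to the reader.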

The abstract of Oster's paper follows.

A common approach to evaluating robustness to omitted variable bias is to observe coefficient movements after inclusion of controls. This is informative only if selection on observables is informative about selection on unobservables. Although this link is known in theory in existing literature, very few empirical articles approach this formally. I develop an extension of the theory that connects bias explicitly to coefficient stability. I show that it is necessary to take into account coefficient and R-squared movements. I develop a formal bounding argument. I show two validation exercises and discuss application to the economics literature. Supplementary materials for this article are available online.
(My Japanese translation)
除外変数バイアスに対する頑健性を評価する一般的な方法は、コントロール変数を含めた後の係数の変化を観測することである。これが情報をもたらすのは、観測される変数に基づく選択が観測されない変数に基づく選択について情報を含んでいる場合だけである。既存の研究ではその関係が理論上は知られているが、これを正式に扱った実証論文はほとんど存在しない。本稿では、理論を拡張し、バイアスを明示的に係数の安定性に結び付ける。また、係数と決定係数の双方の動きを考慮する必要があることを示す。さらに、バイアスの範囲を限定する（バウンドを与える）正式な議論を展開する。2つの検証例を示すとともに、経済学研究への応用について論じる。本稿の補足資料は、オンラインで入手可能である。
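To make the abstract's point about coefficient and R-squared movements concrete, here is a small sketch of the simple approximation that is often quoted from Oster's paper: the adjusted coefficient is the controlled estimate minus delta times the coefficient movement, scaled by (Rmax - R_long) / (R_long - R_short). The function name `oster_beta_star` and its argument names are assumptions made here. The approximation holds only under restrictive assumptions, and the paper's full estimator is more involved, so treat this as an illustration rather than the paper's method.

```python
def oster_beta_star(beta_short, r2_short, beta_long, r2_long, r2_max=1.0, delta=1.0):
    """Approximate bias-adjusted coefficient (illustrative sketch only).

    beta_short, r2_short: estimate and R-squared without controls
    beta_long,  r2_long : estimate and R-squared with controls
    r2_max              : assumed R-squared if all confounders were observed
    delta               : assumed ratio of selection on unobservables
                          to selection on observables
    """
    if r2_long <= r2_short:
        raise ValueError("controls must raise R-squared for the scaling to be defined")
    scale = (r2_max - r2_long) / (r2_long - r2_short)
    return beta_long - delta * (beta_short - beta_long) * scale

# Same coefficient movement (1.0 -> 0.8), different R-squared movement.
adjusted_small_r2_gain = oster_beta_star(1.0, 0.20, 0.8, 0.25, r2_max=0.6)
adjusted_large_r2_gain = oster_beta_star(1.0, 0.20, 0.8, 0.55, r2_max=0.6)
print(adjusted_small_r2_gain, adjusted_large_r2_gain)
```

The two calls share the same coefficient movement, yet the adjusted values differ sharply. When the controls explain little of the outcome, a stable coefficient says much less about omitted variable bias.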

The abstract of the paper by Pei et al. follows.

Researchers frequently test identifying assumptions in regression based research designs (which include instrumental variables or difference-in-differences models) by adding additional control variables on the right hand side of the regression. If such additions do not affect the coefficient of interest (much) a study is presumed to be reliable. We caution that such invariance may result from the fact that the observed variables used in such robustness checks are often poor measures of the potential underlying confounders. In this case, a more powerful test of the identifying assumption is to put the variable on the left hand side of the candidate regression. We provide derivations for the estimators and test statistics involved, as well as power calculations, which can help applied researchers interpret their findings. We illustrate these results in the context of various strategies which have been suggested to identify the returns to schooling.
(My Japanese translation)
研究者は、回帰の右辺に追加的なコントロール変数を加えることにより、回帰に基礎を置く研究設計（操作変数モデルや差の差モデルも含む）における識別の仮定を検証することが多い。そうした追加が、研究対象変数の係数に（あまり）影響しなければ、研究は信頼できるものとされる。我々は、その頑健性チェックで使われる観測される変数が、背後に隠れた交絡変数をきちんと表していないことが多い、という事実からもそうした不変性は生じ得る、と警告する。その場合、識別の仮定のより強力な検証は、検討している回帰の左辺にその変数を置くことである。我々は、推定量および関係する検定統計量、ならびに検出力計算の導出を提示する。それらは応用研究者が自らの発見を解釈するのに役立つ。我々は、学校教育のリターンを識別するために提案された様々な戦略を例にとって、以上の結果を説明する。
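The contrast drawn in this abstract, between adding a poorly measured confounder on the right-hand side and putting it on the left-hand side, can be illustrated with a small simulation. Everything here is an assumption made for the sketch (the data-generating process, the noise level of the proxy `w_obs`, and the helper `ols`). It uses a plain t-test on the left-hand-side regression, not the authors' exact estimators or test statistics.

```python
import numpy as np

def ols(y, regressors):
    """OLS with an intercept; returns (coefficients, classical standard errors)."""
    X = np.column_stack([np.ones(len(y))] + list(regressors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se

rng = np.random.default_rng(1)
n = 5000
w = rng.normal(size=n)                       # true confounder, not observed
d = 0.3 * w + rng.normal(size=n)             # treatment depends on the confounder
y = 1.0 * d + 1.0 * w + rng.normal(size=n)   # true effect of d is 1.0
w_obs = w + 3.0 * rng.normal(size=n)         # observed proxy with heavy measurement error

# Right-hand-side check: does adding the noisy proxy move the coefficient on d?
b_short, _ = ols(y, [d])
b_long, _ = ols(y, [d, w_obs])
coef_shift = b_long[1] - b_short[1]

# Left-hand-side check: regress the proxy on the treatment.
b_bal, se_bal = ols(w_obs, [d])
t_balance = b_bal[1] / se_bal[1]

print(f"coefficient on d: {b_short[1]:.3f} -> {b_long[1]:.3f} (shift {coef_shift:.3f})")
print(f"proxy regressed on d: t = {t_balance:.2f}")
```

In this setup the coefficient on `d` barely moves when the noisy proxy is added, even though it remains well above the true value of 1.0, while the regression of the proxy on `d` rejects clearly. That is the sense in which the left-hand-side check is the more powerful one.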

Bellemare also introduces a paper by Gabriel Lenz and Alexander Sahn, “Achieving Statistical Significance with Covariates.” Its abstract follows.

An important and understudied area of hidden researcher discretion is the use of covariates. Researchers choose which covariates to include in statistical models and these choices affect the size and statistical significance of estimates reported in studies. How often does the statistical significance of published findings depend on these discretionary choices? The main hurdle to studying this problem is that researchers never know the true model and can always make a case that their choices are most plausible, closest to the true data generating process, or most likely to rule out alternative explanations. We attempt to surmount this hurdle through a meta-analysis of articles published in the American Journal of Political Science (AJPS). In almost 40% of observational studies, we find that researchers achieve conventional levels of statistical significance through covariate adjustments. Although that discretion may be justified, researchers almost never disclose or justify it.
(My Japanese translation)
研究者の隠れた裁量についての重要かつあまり研究されていない領域は、共変数の使用である。研究者はどの共変数を統計モデルに入れるかを決め、その選択が、研究結果として報告される推定値の大きさと統計的有意性に影響する。出版された結果の統計的有意性は、どの程度の頻度でそうした裁量的な選択に左右されるのだろうか？　この問題を研究する上での最大の障壁は、研究者は決して真のモデルを知ることはできず、自分たちの選択が最も説得力があり、真のデータ生成過程に最も近い、もしくは、他の説明を除外する可能性が最も高い、と主張することが常に可能である、という点にある。我々は、アメリカン・ジャーナル・オブ・ポリティカル・サイエンス（AJPS）に掲載された論文のメタ分析を通じて、この障壁の克服を試みる。観察研究の4割近くで、研究者が共変数の調整を通じて統計的有意性の通常の水準を達成していることを我々は見い出した。そうした裁量は正当化されるのかもしれないが、研究者がそれを開示したり正当化したりすることはほとんどない。

In a follow-up entry, Bellemare notes that Dan Sacks of Indiana University sent him the following comment.

The basic issue is that it seems fine to me if the precision of your coefficient is sensitive to the inclusion of pre-determined covariates, as long as the expected value is not. That is, in such cases it seems fine to emphasize the precisely estimated result.
...the estimated coefficient c on D might or might not be statistically significant, depending on what is included in the control vector X. The usual concern in the applied literature (which of course I share completely) is that if we don’t condition on a sufficient set of confounders, then c is estimated with bias. We all want to avoid bias. Bias is about expected values, though, not statistical significance, and it is not obvious to me that we should be worried about models in which including covariates changes the statistical significance (but not the expected value) of the results. Including pre-determined regressors which are uncorrelated with D but (conditionally) correlated with Y will generally reduce var(e), reducing the standard error of c and possibly leading to statistical significance. The fact that our results are only significant if we control for some set of X’s does not necessarily mean that there is bias – only that we might be underpowered without enough controls.
(My Japanese translation)
基本的な話として、係数の精度が先決の共変数を含めるかどうかに敏感であっても、期待値が敏感でない限り、別に構わないのではないか、と私は思う。つまり、そうした場合に、精度良く推計された結果を強調するのは問題ないように思う。
・・・推計されたDの係数cは統計的に有意かもしれないし、非有意かもしれない。それはコントロールのベクトルXに何が含まれているかによる。応用研究で通常問題になるのは（そしてその問題意識を私は完全に共有しているが）、十分な交絡変数の組で条件付けないと、cがバイアスをもって推計されてしまう、ということである。我々は皆バイアスを避けたいと思っている。しかしバイアスは期待値についての話であり、統計的有意性についての話ではない。共変数を含めることが結果の統計的有意性を変える（が期待値は変えない）モデルについて我々が懸念すべきかどうかは、私には明らかではない。Dと相関を持たないが（条件付きで）Yと相関を持つ先決の説明変数を入れることは、一般にvar(e)を減らし、cの標準誤差を減らして統計的有意性をもたらす可能性がある。ある種のXの組をコントロールした場合にのみ結果が有意になるということは、必ずしもバイアスが存在することを意味しない。単に、十分なコントロール抜きでは検出力が不足しているかもしれない、というだけのことである。
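Sacks's argument, that a pre-determined regressor uncorrelated with D changes the precision of c but not its expected value, can be checked with a small Monte Carlo sketch. The setup is an assumption chosen for illustration: a randomized binary treatment `d`, a single covariate `x` that strongly predicts the outcome, a true effect of 0.3, and 1000 replications of samples of size 200.

```python
import numpy as np

def ols(y, regressors):
    """OLS with an intercept; returns (coefficients, classical standard errors)."""
    X = np.column_stack([np.ones(len(y))] + list(regressors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se

rng = np.random.default_rng(2)
n, reps, effect = 200, 1000, 0.3
draws = {"without x": [], "with x": []}

for _ in range(reps):
    d = rng.integers(0, 2, size=n).astype(float)  # randomized, so independent of x
    x = rng.normal(size=n)                        # pre-determined covariate
    y = effect * d + 2.0 * x + rng.normal(size=n)
    for name, controls in (("without x", []), ("with x", [x])):
        beta, se = ols(y, [d] + controls)
        draws[name].append((beta[1], se[1]))

mean_est, mean_se, reject = {}, {}, {}
for name, values in draws.items():
    arr = np.array(values)
    mean_est[name] = arr[:, 0].mean()                              # average estimate of c
    mean_se[name] = arr[:, 1].mean()                               # average standard error
    reject[name] = (np.abs(arr[:, 0] / arr[:, 1]) > 1.96).mean()   # share significant at 5%
    print(f"{name:<10} mean c = {mean_est[name]:.3f}  "
          f"mean se = {mean_se[name]:.3f}  share significant = {reject[name]:.2f}")
```

Both specifications are centered on the true effect. The one that controls for `x` has a much smaller standard error and rejects the null far more often, which is the underpowered-without-controls case Sacks describes rather than a sign of bias.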

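In response, Bellemare says that the paper by Lenz and Sahn concerns precision, whereas the papers by Oster and by Pei et al. concern expected values, so he should have discussed the two separately. He then stresses that the Lenz and Sahn paper is about data mining for statistical significance, and points out that in the RCTs Sacks used as an example*2, data mining is less of a problem than it is with observational data.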

*1: The correct word is Unobservable.

*2: Omitted from the quotation above.