There are a lot of things that drive me crazy about the current practice of econometrics. People who think over-identification tests validate their identifying assumptions. People who think that if you fail to reject the null at the 0.05 level, it's fine to proceed in your analysis as if the null were true (i.e. people who don't believe in type II error).

But one of the biggest is the practice of thinking we do no harm by using estimators we know to be inappropriate for the data at hand and thinking we somehow fully fix that issue by using robust standard errors.

I annually beat my head against the wall trying to get my students to appreciate these issues (only to often have my work undone by their reading papers/books that make these mistakes), but now on this last point, I have some help!



“Robust standard errors” are used in a vast array of scholarship across all fields of empirical political science and most other social science disciplines. The popularity of this procedure stems from the fact that estimators of certain quantities in some models can be consistently estimated even under particular types of misspecification; and although classical standard errors are inconsistent in these situations, robust standard errors can sometimes be consistent. However, in applications where misspecification is bad enough to make classical and robust standard errors diverge, assuming that misspecification is nevertheless not so bad as to bias everything else requires considerable optimism. And even if the optimism is warranted, we show that settling for a misspecified model (even with robust standard errors) can be a big mistake, in that all but a few quantities of interest will be impossible to estimate (or simulate) from the model without bias. We suggest a different practice: Recognize that differences between robust and classical standard errors are like canaries in the coal mine, providing clear indications that your model is misspecified and your inferences are likely biased. At that point, it is often straightforward to use some of the numerous and venerable model checking diagnostics to locate the source of the problem, and then modern approaches to choosing a better model. With a variety of real examples, we demonstrate that following these procedures can drastically reduce biases, improve statistical inferences, and change substantive conclusions.
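The "canary" comparison described above can be tried on simulated data. What follows is only a minimal sketch, not the procedure from the quoted paper: it assumes a one-regressor linear model, uses the simplest (HC0) form of the heteroskedasticity-robust standard error, and the function name `slope_and_ses` and both simulated data sets are made up for illustration. It needs only the Python standard library.

```python
import math
import random


def slope_and_ses(x, y):
    """OLS slope of y on x (with intercept), plus its classical and
    HC0 heteroskedasticity-robust standard errors."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    dx = [xi - xbar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Classical SE: assumes one common error variance for every observation.
    s2 = sum(e * e for e in resid) / (n - 2)
    classical = math.sqrt(s2 / sxx)
    # HC0 robust SE: weights each squared residual by its squared deviation in x.
    robust = math.sqrt(sum((d * e) ** 2 for d, e in zip(dx, resid))) / sxx
    return slope, classical, robust


rng = random.Random(0)  # fixed seed so the run is repeatable
x = [rng.uniform(0, 10) for _ in range(5000)]

# Hypothetical data set 1: error spread grows with x (the classical assumption fails).
y_het = [1 + 2 * xi + rng.gauss(0, 0.2 * xi ** 2) for xi in x]
# Hypothetical data set 2: constant error spread (the classical assumption holds).
y_hom = [1 + 2 * xi + rng.gauss(0, 3.0) for xi in x]

_, se_c_het, se_r_het = slope_and_ses(x, y_het)
_, se_c_hom, se_r_hom = slope_and_ses(x, y_hom)

ratio_het = se_r_het / se_c_het
ratio_hom = se_r_hom / se_c_hom
print(f"robust/classical SE ratio, heteroskedastic data: {ratio_het:.2f}")
print(f"robust/classical SE ratio, homoskedastic data:   {ratio_hom:.2f}")
```

When the two standard errors roughly agree, the ratio sits near one; a ratio well away from one is the signal the abstract says should send you back to the model rather than on to the results table.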


An earlier generation of econometricians corrected the heteroskedasticity problems with weighted least squares using weights suggested by an explicit heteroskedasticity model. These earlier econometricians understood that reweighting the observations can have dramatic effects on the actual estimates, but they treated the effect on the standard errors as a secondary matter. A “robust standard” error completely turns this around, leaving the estimates the same but changing the size of the confidence interval. Why should one worry about the length of the confidence interval, but not the location? This mistaken advice relies on asymptotic properties of estimators. I call it “White-washing.” Best to remember that no matter how far we travel, we remain always in the Land of the Finite Sample, infinitely far from Asymptopia. Rather than mathematical musings about life in Asymptopia, we should be doing the hard work of modeling the heteroskedasticity and the time dependence to determine if sensible reweighting of the observations materially changes the locations of the estimates of interest as well as the widths of the confidence intervals.
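The reweighting the passage describes can be sketched in a few lines. This is an illustration under loud assumptions, not anything taken from the quoted article: the error standard deviation is assumed to be proportional to x and known, so the weights are 1/x²; the model has a single regressor; and `weighted_fit` and the simulated data are hypothetical. With weights all equal to one the same function gives ordinary least squares.

```python
import random
import statistics


def weighted_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b).
    Passing all weights equal to 1 gives ordinary least squares."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


rng = random.Random(1)  # fixed seed so the run is repeatable
ols_slopes, wls_slopes = [], []

for _ in range(300):  # 300 small simulated samples
    x = [rng.uniform(1, 10) for _ in range(200)]
    # Assumed data-generating process: true slope 2, error sd = 0.5 * x.
    y = [1 + 2 * xi + rng.gauss(0, 0.5 * xi) for xi in x]
    ones = [1.0] * len(x)
    inv_var = [1.0 / xi ** 2 for xi in x]  # weights proportional to 1 / variance
    ols_slopes.append(weighted_fit(x, y, ones)[1])
    wls_slopes.append(weighted_fit(x, y, inv_var)[1])

sd_ols = statistics.stdev(ols_slopes)
sd_wls = statistics.stdev(wls_slopes)
mean_ols = statistics.mean(ols_slopes)
mean_wls = statistics.mean(wls_slopes)
largest_gap = max(abs(a - b) for a, b in zip(ols_slopes, wls_slopes))

print(f"OLS slope: mean {mean_ols:.3f}, sd across samples {sd_ols:.3f}")
print(f"WLS slope: mean {mean_wls:.3f}, sd across samples {sd_wls:.3f}")
print(f"largest OLS-vs-WLS slope gap in one sample: {largest_gap:.3f}")
```

In any single sample the two slopes differ, which is the change in location that a robust standard error leaves untouched, and across samples the reweighted slope is less dispersed. Both averages sit near the true slope here only because the mean function is specified correctly; with a misspecified mean the two estimators need not agree even on average.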
