6. Common Bias Corrections are Unnecessary

If we do choose to work with other quasilinear mean types, logical consistency will dictate changes to several aspects of our analysis. For instance, the discussion in Section 5.4 established that, under the same loss function conditions that justify learning about a quasilinear mean, quasi-unbiasedness constitutes optimal centering for estimators. Straight unbiasedness is standard, but not necessarily the right criterion. In this section, the same logic shows that a well-established bias correction for parameter estimates, currently argued to be appropriate for log-linear models, is actually a counterproductive complication to research.

The seminal work in this literature is Goldberger (1968), but a useful place to start is Halvorsen and Palmquist (1980). The Halvorsen and Palmquist paper is about the interpretation of parameters in models of the general form

$$\ln(Y_i) = a + \sum_m b_m \mathit{Cont}_{m,i} + \sum_n c_n \mathit{Dummy}_{n,i} + \varepsilon_i \tag{10}$$

where the $\mathit{Cont}_m$ are continuous variables, the $\mathit{Dummy}_n$ are dummy variables, and $a$, the $b_m$ and the $c_n$ are parameters of interest. Halvorsen and Palmquist do not explicitly assign a property to the error term, but their primary example – Hanushek and Quigley (1978) – uses OLS. So it is safe to suppose they intend either

$$E[\varepsilon_i \mid X_i] = 0 \quad \text{and/or} \quad E[\varepsilon_i X_i] = 0 \tag{11}$$

By the same reasoning as the gravity case, the parameters then describe the conditional geometric mean of Y (or a geometric approximation if only the second of the error specifications holds). Since the literature is about bias, I will work only with the first of the error specifications, which implies the second.[16]
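To spell the step out, take conditional expectations of Equation (10) under the first error specification (the derivation is mine, though it only restates the definition of a geometric mean):

$$E[\varepsilon_i \mid X_i] = 0 \;\Rightarrow\; E[\ln(Y_i) \mid X_i] = a + \sum_m b_m \mathit{Cont}_{m,i} + \sum_n c_n \mathit{Dummy}_{n,i}$$

$$\Rightarrow\; \exp\big( E[\ln(Y_i) \mid X_i] \big) = \exp\Big( a + \sum_m b_m \mathit{Cont}_{m,i} + \sum_n c_n \mathit{Dummy}_{n,i} \Big)$$

The left-hand side of the last line is precisely the conditional geometric mean of Y, so the fitted model traces out that mean exactly.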

At the time it was widely understood that the correct interpretation of $100 b_m$ is the percentage change in fitted Y associated with a small change in $\mathit{Cont}_m$. That is, it was understood that $1 + b_m$ is the factor change in fitted Y associated with a small change in $\mathit{Cont}_m$. But it was also common for researchers to apply the same respective interpretations to $100 c_n$ and $1 + c_n$. Halvorsen and Palmquist show this is incorrect because the dummy variable is dichotomous; small changes in it are undefined. So $1 + c_n$ is actually equal to $1 + \ln(1 + g_n)$, from the equivalent formulation[17]

$$Y_i = \left( \prod_n (1+g_n)^{\mathit{Dummy}_{n,i}} \right) \exp\left( a + \sum_m b_m \mathit{Cont}_{m,i} + \varepsilon_i \right) \tag{12}$$

Hence the true factor change in fitted Y associated with the change in $\mathit{Dummy}_n$ is

$$1 + g_n = \exp(c_n) \tag{13}$$

which is the object of interest. Using $1 + \hat{c}_n^{OLS}$, Hanushek and Quigley (1978) estimate that black US college graduates earn 1.64 times as much as black college dropouts who are otherwise similar. Using $\exp\big( \hat{c}_n^{OLS} \big)$, Halvorsen and Palmquist (1980) write that the figure should be 1.9 times.
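The two published figures are mutually consistent, which is easy to check by working backwards (the arithmetic here is mine): a naive factor of 1.64 implies

$$1 + \hat{c}_n^{OLS} = 1.64 \;\Rightarrow\; \hat{c}_n^{OLS} = 0.64 \;\Rightarrow\; \exp\big( \hat{c}_n^{OLS} \big) = \exp(0.64) \approx 1.90$$

which matches the Halvorsen and Palmquist figure.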

A response by Kennedy (1981), drawing on Goldberger (1968), then argues that the estimator $\exp\big( \hat{c}_n^{OLS} \big)$ is biased for $\exp(c_n)$, because

$$E\left[ \hat{c}_n^{OLS} \right] = c_n \quad \Rightarrow \quad E\left[ \exp\left( \hat{c}_n^{OLS} \right) \right] \neq \exp(c_n) \tag{14}$$

Kennedy (1981) then suggests a bias-corrected estimator for $\exp(c_n)$, which is lower. Subsequent papers by Giles (1982) and van Garderen and Shah (2002) tried to refine Kennedy's method, but ultimately endorsed his solution. The correction is now common, appearing in, for instance, the international consumer and producer price index manuals (International Labour Office et al 2004, p 118; International Labour Organization et al 2004, p 184). Some research ignores it on the grounds that it is small, although van Garderen and Shah show examples in which it is meaningful.
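For concreteness, here is a minimal sketch of the correction, assuming the usual statement of Kennedy's estimator, $\exp\big( \hat{c} - \hat{V}(\hat{c})/2 \big)$; the simulated data and variable names are illustrative stand-ins, not taken from any of the papers above.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Illustrative data for a log-linear model with one dummy regressor:
# ln(Y) = a + c*Dummy + eps, so the true factor change is exp(c).
n, a, c = 500, 1.0, 0.64
dummy = rng.integers(0, 2, size=n)
log_y = a + c * dummy + rng.normal(0.0, 0.8, size=n)

X = sm.add_constant(dummy.astype(float))
fit = sm.OLS(log_y, X).fit()
c_hat = fit.params[1]
c_hat_var = fit.cov_params()[1, 1]  # estimated sampling variance of c_hat

naive = np.exp(c_hat)                      # Halvorsen-Palmquist estimate of 1 + g_n
kennedy = np.exp(c_hat - 0.5 * c_hat_var)  # Kennedy's (1981) corrected estimate

print(f"naive exp(c_hat):  {naive:.3f}")
print(f"Kennedy-corrected: {kennedy:.3f}")
```

Because the correction scales the naive figure by $\exp\big( -\hat{V}(\hat{c})/2 \big)$, it matters little when $\hat{c}$ is precisely estimated, which is why some research treats it as negligible.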

But why, if we have chosen to learn about the conditional geometric mean of Y, would we subject $\exp\big( \hat{c}_n^{OLS} \big)$ to a test of unbiasedness, a criterion of central tendency that is based on the arithmetic mean? By extension of Proposition 4, logical consistency dictates the use of a geometric criterion, i.e. a different type of quasi-unbiasedness.[18]
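Concretely, for a positive-valued target $\theta$ and estimator $\hat{\theta}$, the two centering criteria are (in my notation):

$$\underbrace{E[\hat{\theta}\,] = \theta}_{\text{arithmetic unbiasedness}} \qquad \text{versus} \qquad \underbrace{\exp\big( E[\ln(\hat{\theta}\,)] \big) = \theta}_{\text{geometric unbiasedness}}$$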

To illustrate, let $\hat{Y}_\chi$ be a prediction for Y, given some $\chi$ representing any possible combination of the right-hand side variables in Equation (10). Using Proposition 4, under the same loss function that warrants targeting the conditional geometric mean of Y, it is optimal that the predictions for Y be geometrically unbiased, i.e. that

$$\exp\left( E\left[ \ln\left( \hat{Y}_\chi \right) \right] \right) = \left( \prod_n (1+g_n)^{\mathit{Dummy}_n} \right) \exp\left( a + \sum_m b_m \mathit{Cont}_m \right) \tag{15}$$

$$\Leftrightarrow \quad E[\hat{a}] + \sum_m E\left[ \hat{b}_m \right] \mathit{Cont}_m + \sum_n E\left[ \hat{c}_n \right] \mathit{Dummy}_n = a + \sum_m b_m \mathit{Cont}_m + \sum_n \ln(1+g_n) \mathit{Dummy}_n \tag{16}$$

This already holds for the naïve OLS method under the stated assumptions (plus some standard regularity conditions). In particular,

$$E\left[ \hat{c}_n^{OLS} \right] = \ln(1+g_n) \tag{17}$$

Hence it is an attractive feature of OLS that

$$\exp\left( E\left[ \ln\left( \exp\left( \hat{c}_n^{OLS} \right) \right) \right] \right) = 1 + g_n = \exp(c_n) \tag{18}$$

In other words, if we take the loss function seriously, we must desire $\exp\big( \hat{c}_n^{OLS} \big)$ to be geometrically unbiased for $\exp(c_n)$, a property the naïve OLS method already has. Arithmetic unbiasedness is neither met nor desired. This finding is similar in spirit to the concept of ‘optimal bias’ in the forecasting literature (see, for instance, Christoffersen and Diebold (1997)).
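A small simulation makes the distinction visible (again a sketch with made-up parameters, not the paper's data): across repeated samples, the arithmetic mean of $\exp\big( \hat{c}_n^{OLS} \big)$ sits above $\exp(c_n)$, while its geometric mean matches it.

```python
import numpy as np

rng = np.random.default_rng(1)

# Repeated sampling from ln(Y) = a + c*Dummy + eps, with true factor exp(c).
n, a, c, reps = 200, 1.0, 0.64, 20_000
exp_c_hats = np.empty(reps)

for r in range(reps):
    dummy = rng.integers(0, 2, size=n)
    log_y = a + c * dummy + rng.normal(0.0, 1.0, size=n)
    # With a single dummy regressor, the OLS slope equals the difference
    # in group means of ln(Y).
    c_hat = log_y[dummy == 1].mean() - log_y[dummy == 0].mean()
    exp_c_hats[r] = np.exp(c_hat)

print(f"true exp(c):                   {np.exp(c):.3f}")
print(f"arithmetic mean of exp(c_hat): {exp_c_hats.mean():.3f}")  # sits above exp(c)
print(f"geometric mean of exp(c_hat):  {np.exp(np.log(exp_c_hats).mean()):.3f}")  # ~ exp(c)
```

The arithmetic bias here shrinks as the sampling variance of $\hat{c}$ shrinks, consistent with the correction often being small in practice.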

Footnotes

[16] When only the second is met, OLS produces consistent estimates of the model parameters, but not unbiased ones.

[17] Halvorsen and Palmquist worked with the form $100 c_n$. For my purposes it will be more helpful to set up the problem with the equivalent $1 + c_n$.

[18] Some primitive versions of these ideas appear in another of my working papers: Gorajek (2018). Updates of that paper will reduce the overlap.