Here’s why that’s complete bollocks, as they say. Let’s say you have a survey that asks people whether or not they have kids (K), asks them to rate their satisfaction with their marriage on a scale from 1-10 (M*), and asks them to rate their overall happiness (H) on the same scale. Of course these survey instruments are more sophisticated than that, but the point is you have two types of questions: ones with a lot of measurement error (which are actually just proxies) and ones with little or no measurement error.
Now, people can report whether or not they have kids with tremendous accuracy and precision. We’re great “kid counting” instruments. We’re lousy at measuring our own satisfaction (M), though. We’re inaccurate at measuring our satisfaction because it’s not clear what scale we should be using. We’re answering “satisfied – 8” on the wrong 1-10 scale when we should be answering “satisfied – 132” on the negative 17 to infinity scale.
Compounding the problem is that having kids probably changes our measurement of our own satisfaction with marriage. By this, I don’t mean that there’s necessarily a correlation between true satisfaction and having kids. I mean our measurement of satisfaction is correlated with having kids.
We’re also not precise instruments of satisfaction measurement. We would answer “132” on the correct satisfaction scale but we can’t really differentiate between a 132 and a 133. In fact, we probably have a whole range of satisfaction scores that we wouldn’t be able to differentiate between.
So what? Well, the problem is that this means the satisfaction with marriage score we get from surveys is a proxy for a true measure of satisfaction and it is correlated with having kids. Proxies correlated with other regressors introduce biases that make it more likely to find kids make us less happy in general even though in reality this isn’t true.
UPDATE: I think my post as previously written was confusing about where proxy/measurement error comes into play. The original study regressed overall happiness on having kids and some other subjective, poorly measured variables (like happiness in marriage). There are two problems with the measurement of those subjective variables. The first is that the measure itself may be correlated with having kids. People with kids may believe kids make marriages happier. The second problem is standard measurement error (i.e. lack of precision and accuracy). I’ve rewritten a little in hopes of improving clarity.
UPDATE 2: Take 2. I replaced “happiness in marriage” with “marriage satisfaction” so as not to confuse that with overall happiness. YouNotSneaky! is so much better at this…
The math below the fold proves these points.
Let the variables be defined as above with overall happiness H. So true happiness in marriage is M, measured happiness in marriage M* and having kids or not K.
Alpha_1 is a measure of the accuracy of the happiness measure and epsilon is a measure of its precision. Alpha_2 is the degree to which the number of kids is correlated with the measure of happiness. None of these things are observable so you have to make guesses at their sign and magnitude.
Beta_1 is the true weight of kids on happiness. The claim in the research Wilkinson cites is that this is negative. Beta_2 is the true weight of the squishy variable (happiness in marriage in my example) on happiness.
Equation (4) is the result of substituting the measured happiness into the actual (and unobserved) relationship. We see (among other things) that the bigger the true effect of marriage happiness, the more negative the effect of kids becomes. This is due to the correlation between kids and the measurement of happiness in marriage.
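The equations themselves are not reproduced in the text here. What follows is a reconstruction that is consistent with the descriptions above, assuming simple linear forms; the intercept beta_0 and the happiness error term u are notation introduced for this sketch and may not match the original. The last line is what the substitution yields under these assumptions.

```latex
\begin{align}
H   &= \beta_0 + \beta_1 K + \beta_2 M + u \\
M^* &= \alpha_1 M + \alpha_2 K + \varepsilon \\
M   &= \frac{M^* - \alpha_2 K - \varepsilon}{\alpha_1} \\
H   &= \beta_0
       + \left(\beta_1 - \frac{\alpha_2 \beta_2}{\alpha_1}\right) K
       + \frac{\beta_2}{\alpha_1}\, M^*
       + \left(u - \frac{\beta_2}{\alpha_1}\,\varepsilon\right)
\end{align}
```

Read this way, the coefficient on K picks up an extra term, minus alpha_2 times beta_2 over alpha_1, which is negative whenever having kids nudges the reported marriage score upward (alpha_2 > 0) and marriage happiness genuinely matters (beta_2 > 0).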
Also, equation (4) shows us that more variation than is due will be assigned to the more accurately measured variable. The alpha_1 in the denominator of the coefficient in front of marriage happiness shows us this. Also, the messed-up error term (it’s smaller than it should be because we subtract a bit off of it) tells us that hypothesis tests won’t reject as often as they should.
On net, we should be wary of results that compare easily measured quantities to less easily measured quantities. It’s possible, via analysis like the above, to show the measurement biases may stack too much in a tested hypothesis’ favor.
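A quick simulation can make the direction of the bias concrete. This is only a sketch with made-up numbers: it assumes linear relationships, normally distributed true marriage happiness and noise, kids assigned independently of true marriage happiness, and a true kid effect (beta_1) of exactly zero. The function name `simulate` and all default parameter values are hypothetical choices for illustration, not anything taken from the study.

```python
import numpy as np

def simulate(n=200_000, beta1=0.0, beta2=1.0,
             alpha1=1.0, alpha2=0.5, noise_sd=0.5, seed=0):
    """Regress H on K and the proxy M*; return [intercept, coef_K, coef_Mstar].

    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    K = rng.integers(0, 2, size=n).astype(float)   # has kids: 0 or 1, measured exactly
    M = rng.normal(size=n)                         # true marriage happiness (unobserved)
    eps = rng.normal(scale=noise_sd, size=n)       # imprecision in the reported score

    # Reported marriage score: a proxy that also shifts with having kids.
    M_star = alpha1 * M + alpha2 * K + eps

    # True relationship: with beta1 = 0, kids have no effect on happiness.
    H = beta1 * K + beta2 * M + rng.normal(size=n)

    # Ordinary least squares of H on a constant, K, and the proxy M*.
    X = np.column_stack([np.ones(n), K, M_star])
    coef, *_ = np.linalg.lstsq(X, H, rcond=None)
    return coef

if __name__ == "__main__":
    _, coef_kids, coef_proxy = simulate()
    print(f"estimated kid effect:   {coef_kids:+.3f} (true value is 0)")
    print(f"estimated proxy weight: {coef_proxy:+.3f} (true beta_2 is 1)")

    # With no kid-related shift in the proxy, the kid estimate goes back toward 0.
    _, coef_kids_clean, _ = simulate(alpha2=0.0)
    print(f"kid effect with alpha_2 = 0: {coef_kids_clean:+.3f}")
```

Under these particular assumptions the regression reports a clearly negative kid effect even though none was built in, and the effect vanishes once alpha_2 is set to zero. Note that the noise term also shrinks the proxy's own coefficient here, so the simulated numbers will not line up exactly with a pure substitution argument.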
BTW, this analysis applies to results on income and happiness, too. Income will be biased negatively, i.e. its importance will be under-weighted. I haven’t read Wolfers, but I bet he’s taken this issue into account.