Angus Deaton recently said that all the attention that natural (and actual) experiments are getting is overblown. He claims experimental data has no special status in a hierarchy of evidence. I agree to the extent that I don’t think we should favor one form of evidence to the exclusion of other types of evidence ((A tenured member of the Cult of Identification told me once that she wouldn’t write a paper about a topic unless there was a clear source of exogenous variation. She proudly told me that she hadn’t used an instrumental variable in years.)). Evidence is evidence.
A readily available form of evidence about the relationship between native employment opportunities and immigration is cross-section data ((In applied micro seminars, you often hear Cult members hiss something to the effect of, “But those estimates are from cross-sectional data,” with grimaces around the table at the mention of the taint.)). These data describe various geographical regions or worker skill groups. For each region or skill group, the analyst assigns an average wage (or other employment outcome) and the percentage of the group made up of immigrants. Then the analyst checks to see if there’s a correlation across the groups between wages and the share of immigrants.
As you can imagine, there’s a lot for the interested analyst to play with. Every country has its own data sources. You can change the definition of skill group. You can look at larger geographic regions like states or smaller ones like cities. And, as always, you can choose from the palette of statistical techniques to calculate your estimated correlation and effect size. Longhi, Nijkamp and Poot did a meta-analysis of 18 papers that reported 348 estimates of this correlation.
As a quick demonstration of what these papers look like, I’ve downloaded some Census 2000 data from IPUMS USA. For each state, I calculated the percent of workers that are foreign born and the average wage for native workers. Here’s the plot:
I’ve drawn the regression line. Surprisingly, the line has an upward slope, suggesting a positive correlation. The slope of the line is about 1.5.
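For readers who want to see the mechanics, here is a minimal Python sketch of this kind of calculation. It is not the code behind the plot: the worker records are invented for illustration (they are not IPUMS data), the helper names `state_summary` and `ols` are hypothetical, and the sketch ignores the sampling weights a real census extract would typically need.

```python
from collections import defaultdict

# Hypothetical worker-level records: (state, foreign_born, wage).
# Invented numbers for illustration only; not IPUMS data.
workers = [
    ("A", False, 20.0), ("A", False, 22.0), ("A", True, 15.0), ("A", True, 17.0),
    ("B", False, 16.0), ("B", False, 18.0), ("B", False, 17.0), ("B", True, 12.0),
    ("C", False, 14.0), ("C", False, 15.0), ("C", False, 16.0), ("C", False, 15.0),
]

def state_summary(records):
    """Return {state: (percent foreign born, mean native wage)}."""
    # Per state: [number of workers, number of immigrants, sum of native wages]
    counts = defaultdict(lambda: [0, 0, 0.0])
    for state, foreign_born, wage in records:
        c = counts[state]
        c[0] += 1
        if foreign_born:
            c[1] += 1
        else:
            c[2] += wage
    return {
        s: (100.0 * imm / n, wage_sum / (n - imm))
        for s, (n, imm, wage_sum) in counts.items()
        if n > imm  # skip states with no native workers
    }

def ols(x, y):
    """Slope and intercept of the least-squares line y = intercept + slope * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, my - slope * mx

summary = state_summary(workers)
pct_foreign = [v[0] for v in summary.values()]
native_wage = [v[1] for v in summary.values()]
slope, intercept = ols(pct_foreign, native_wage)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Each state contributes one point, so the regression weights a small state the same as a large one. That is a design choice in this sketch, not something the plot above is known to share.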
One thing that’s wrong with this plot, besides the fact that I haven’t controlled for a bunch of obvious things, is that this simple correlation conflates the impact of immigration on native wages with the shared economic incentives of natives and immigrants to move to states that have positive wage growth. Both immigrants and natives will want to move to states that have good wage prospects; they select themselves, to use the jargon. We really only care about the first thing, the impact of immigrants on natives, and so we’d like to wash this correlation to get the stain of “selection” out.
A neat regularity among immigrants is that they tend to move to regions that previous immigrants had already called home. We’ll leave it to sociologists to tell us why this might be the case and for the moment just exploit this fact for our statistical purposes. We can predict the percentage of immigrants in a state in the year 2000 by looking at the percentage of immigrants in that state several decades before. Here’s a plot:
The red line is the regression line and the black line is the 45 degree line. As you can see, the percentage of immigrants has uniformly increased over the 40 years between 1960 and 2000, but the red line is positively sloped and the dots cluster pretty well around the regression line. The immigrant ratios in 2000 are predicted pretty well by their ratios in 1960(!).
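The prediction step can be sketched in a few lines of Python. The state shares below are invented for illustration, and the `ols` helper is a hypothetical name; the sketch only shows what "predicting the 2000 ratio from the 1960 ratio" means mechanically, along with the R-squared that summarizes how tightly the dots cluster around the line.

```python
# Hypothetical state-level immigrant shares in percent; invented numbers.
share_1960 = [1.0, 2.0, 4.0, 5.0, 8.0]
share_2000 = [3.0, 5.0, 8.0, 11.0, 18.0]

def ols(x, y):
    """Slope, intercept and R-squared of the least-squares line y = a + b * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    return slope, my - slope * mx, sxy ** 2 / (sxx * syy)

slope, intercept, r_squared = ols(share_1960, share_2000)

# Fitted values: the part of the 2000 share explained by the 1960 share.
predicted_2000 = [intercept + slope * s for s in share_1960]

# Every point lies above the 45 degree line when the share rose in every state.
all_increased = all(new > old for old, new in zip(share_1960, share_2000))

print(f"slope = {slope:.3f}, R^2 = {r_squared:.3f}, all increased: {all_increased}")
```

A high R-squared here plays the role of the dots hugging the red line: the stronger this relationship, the more useful the 1960 shares are as a predictor.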
So what? Well, suppose the percentage of immigrants in a state does not have an impact on the relative wage prospects in that state 40 years later. The prediction of the year 2000 immigration ratios using the red line, then, should be unrelated to the wage prospects for immigrants (and natives) in that year. This prediction is just the detergent we needed to get rid of the stain of selection. Basically, we’re removing the variation in immigrant ratios that is due to selection and looking only at the variation due to immigrant clustering. Here’s a plot of native wages versus predicted year 2000 immigrant ratios:
The slope on the regression line is 1.8. That this slope is close to the slope of the one where I didn’t correct for selection suggests that selection isn’t that big of a deal.
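This two-step procedure is what econometricians call two-stage least squares, with the 1960 share as the instrument. Below is a minimal Python sketch on invented state-level numbers (not the data behind the plots; all variable names are hypothetical). It runs the two stages by hand and checks that the second-stage slope equals the ratio cov(instrument, wage) / cov(instrument, share), which is the usual single-instrument shortcut.

```python
# Hypothetical state-level data; invented numbers for illustration only.
share_1960 = [1.0, 2.0, 4.0, 5.0, 8.0]         # instrument
share_2000 = [3.0, 5.0, 8.0, 11.0, 18.0]       # immigrant share, possibly tainted by selection
native_wage = [14.0, 15.0, 15.5, 17.0, 19.0]   # outcome

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

# Naive slope: wages on the observed 2000 share.
ols_slope = cov(share_2000, native_wage) / cov(share_2000, share_2000)

# Stage 1: predict the 2000 share from the 1960 share.
b1 = cov(share_1960, share_2000) / cov(share_1960, share_1960)
a1 = mean(share_2000) - b1 * mean(share_1960)
predicted_2000 = [a1 + b1 * z for z in share_1960]

# Stage 2: wages on the predicted share.
iv_slope = cov(predicted_2000, native_wage) / cov(predicted_2000, predicted_2000)

# Equivalent shortcut with a single instrument.
ratio_slope = cov(share_1960, native_wage) / cov(share_1960, share_2000)

print(f"naive slope = {ols_slope:.3f}, two-stage slope = {iv_slope:.3f}")
```

Comparing `ols_slope` with `iv_slope` is the numerical version of comparing the two plots. One caveat: the standard errors from a hand-rolled second stage are not valid as they stand, so a dedicated instrumental-variables routine is the safer choice for inference.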
While the slope I’ve estimated is large enough to make me think I did something wrong, its sign isn’t surprising. Longhi, Nijkamp and Poot found that almost as many estimates of the effect of immigration on native wages are positive as negative. Here’s their Figure 1, which shows the distribution of estimates across analyses:
The estimates seem to cluster around zero. My estimates are 1.5 standard deviations away from the mean; not too bad for a quick and dirty analysis!
So even the non-experimental evidence suggests immigrants have little impact on native wages.