# Beautiful evidence

One of my favorite authors on the visual display of information is Edward Tufte. Besides dispensing great advice on presenting data, he’s produced a series of beautiful books. They’re really picture books for the data nerd set.

I think he’d like this diagram:

The only thing better would be to figure out how to plot the whole density for each year. The black lines give preference to the selected percentiles, but there’s no a priori reason to think the 99th percentile is somehow more important than the 96th percentile. Because this bias is built in, the diagram is a little misleading.

But I like it, nonetheless.
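For what it’s worth, here is a rough sketch of the “whole density for each year” idea: a Gaussian kernel density estimate computed separately for each year on a shared income grid. The income figures, the bandwidth, and the function and variable names are all made up for illustration; none of it comes from the data behind the diagram.

```python
import math


def gaussian_kde(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `samples` at each grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


# Hypothetical incomes (thousands of dollars) for two years. NOT real data.
incomes_by_year = {
    1980: [12, 18, 25, 31, 40, 52, 75],
    2000: [13, 20, 29, 38, 51, 70, 120],
}

grid = [float(x) for x in range(0, 201, 5)]  # shared x-axis for every year
bandwidth = 10.0  # assumed smoothing width, chosen by eye

# One density curve per year, all evaluated on the same grid.
densities = {
    year: gaussian_kde(samples, grid, bandwidth)
    for year, samples in incomes_by_year.items()
}

for year, density in densities.items():
    peak = grid[density.index(max(density))]
    print(year, "density peaks near", peak)
```

Each entry in `densities` is one curve. They can be stacked by year or drawn on the same x-axis with whatever plotting tool is handy, so no percentile gets singled out.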

If you read through the comments at Prof. Kenworthy’s post, you’ll see many people committing the fallacy of assuming that a single percentile line on the chart represents the same individuals over time. As I’ve pointed out before, people do not stay within the same percentile (or decile or quintile, etc.) over their lives. The churn of poverty is very high. Before age 75, for example, 50% of Americans will have spent some time below the poverty line and 50% will have spent some time at a level of income ten times the poverty line. Only about 20% of Americans won’t experience one of these extremes.

UPDATE: Yes, Nancy, that means 30% of Americans will have been rich and poor in their lifetimes.

UPDATE 2: I swear to all that is holy that I titled this post before reading the post over at CT.

UPDATE 3: These Sala-i-Martin (2002) charts are an improvement, except for the log scale, on the Kenworthy ones:

## 5 thoughts on “Beautiful evidence”

1. swong says:

That data is begging for a TED-style treatment. Gimme something on a web site where you can show 3 or 4 dimensions at the same time.

Might be handy if you want to help convince some people that there aren’t billions* of poor Americans trapped in shanty towns**.

*Yes, billions.
**Like Detroit.

2. I agree that looking at the whole distribution would be an improvement and Swong gave you the answer.

It shouldn’t be too hard, given that one already has the data, to show the distribution for each year. Just put your regular normal smoothed density on two of the axes (or a histogram if you prefer) and put time on the third. Sounds like a really neat idea.

However, if you just want to compare two points in time, which you might, then you could just plot the two distributions on the same x-axis so you don’t have to deal with weird 3-d perspective issues.

3. Can people read those 3d graphs though?

Oh, your second idea is what was done by Sala-i-Martin in his world income distribution paper. I forgot all about that. It’s pretty readable. I’ll add it as an update to the post.

4. swong says:

One of the first TED presentations I saw had a presenter showing something like life expectancy and fertility rates among a big group of countries on a standard 2-axis graph. As he rolled time forward, the points slid around the graph, making it really easy to spot long-term trends.

What’s the deal with the weird spots around 9 and 11 on the US graph?

5. “What’s the deal with the weird spots around 9 and 11 on the US graph?”

I suppose your question is proof of the superiority of these pictures over the Kenworthy ones.