This is pretty funny. Or horrifying. Depends on how you want to look at it.

Several days ago, I noted on Twitter that there were a lot of “saved” jobs that weren’t saved at all but were actually cost-of-living increases. About 24 hours after I noted this, there was an Associated Press article about that very phenomenon.

Coincidence? Almost certainly. But I’ll flatter myself anyway.

But the laugh riot comes several paragraphs into the article as they look into why Southwest Georgia Community Action Council was able to save 935 jobs with a cost of living increase for only 508 people. The director of the action council said:

“she followed the guidelines the Obama administration provided. She said she multiplied the 508 employees by 1.84 — the percentage pay raise they received — and came up with 935 jobs saved.

“I would say it’s confusing at best,” she said. “But we followed the instructions we were given.”

“Confusing at best”? The multiplication of percentages is “confusing at best”? It seems obvious to me she should have multiplied 508 people by the amount of the increase (0.0184) and gotten 9.3. But she forgot that you have to divide the percentage by 100 before you multiply.

The fact that she had “saved” more jobs than there were people in the organization should have been a tip-off. But this is a pretty common problem with people who don’t have a very good grasp on mathematics… they don’t recognize obvious mathematical errors, they just plug in the numbers and go with whatever comes out.
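For anyone who wants to check the arithmetic, here is a minimal Python sketch. The function names are mine, invented for illustration; the only inputs are the 508 employees and the 1.84 percent raise from the article. It also includes the kind of sanity check that should have caught the error.

```python
def jobs_equivalent(headcount: int, raise_percent: float) -> float:
    """Convert a percentage raise across a staff into full-job equivalents."""
    # A percentage must be divided by 100 before it is used as a multiplier.
    return headcount * (raise_percent / 100)


def looks_plausible(jobs_saved: float, headcount: int) -> bool:
    """Sanity check: a raise cannot save more jobs than there are employees."""
    return 0 <= jobs_saved <= headcount


headcount = 508        # employees who received the raise (from the article)
raise_percent = 1.84   # size of the raise, in percent (from the article)

reported = headcount * raise_percent                   # the calculation as reported
corrected = jobs_equivalent(headcount, raise_percent)  # percentage divided by 100 first

print(f"reported:  {reported:.0f} jobs, plausible: {looks_plausible(reported, headcount)}")
print(f"corrected: {corrected:.1f} jobs, plausible: {looks_plausible(corrected, headcount)}")
```

The plausibility check is the important part. It doesn't require knowing the right answer, only knowing what a wrong one looks like.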

And this, children, is why you pay attention at school. So you don’t get in the national news for doing something really stupid and then blame it on the instruction manual.

I found this link from Instapundit, so credit where it is due.

You may have seen this visual of job loss across the country. It maps the job gains and losses in major metro areas across the country and, on the surface, it seems pretty cool. Here’s October 2008.


As someone who really loves information visualization, I applaud the effort. But it’s wrong.

Let’s take a quick look at the legend. See if you can spot the problem.


Keen readers will notice the problem… whoever created this visual scaled only the diameter of the circle. The problem with this is what we can see below.


Here I took the “10,000” circle and duplicated it over 50 times within the “100,000” circle. If this visual were an accurate one, we would multiply the 10,000 circle ten times to get 100,000. That’s just the way these things should work.

Math Time! (skip if you don’t care)

The area of a circle is calculated with the equation:

A = πr²

Which means that when they increase the diameter of the circle by a factor of 10, they increase its area by a factor of 100. This means that instead of the numbers increasing the way they should, the small numbers end up looking REALLY small and the big ones look absurdly huge.
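Here is a small Python sketch of the same point. The 10-pixel starting diameter is a made-up number for illustration (I don't have the real pixel sizes from the legend), and the function names are mine. It compares scaling the diameter by the value against scaling the area by the value.

```python
import math


def circle_area(diameter: float) -> float:
    """Area of a circle, from its diameter."""
    return math.pi * (diameter / 2) ** 2


def diameter_for_value(value: float, ref_value: float, ref_diameter: float) -> float:
    """Diameter that makes the circle's AREA proportional to the value."""
    return ref_diameter * math.sqrt(value / ref_value)


small_value, big_value = 10_000, 100_000
small_d = 10.0  # hypothetical pixel diameter of the "10,000" circle

# What the legend appears to do: scale the diameter directly by the value.
naive_d = small_d * (big_value / small_value)
# What keeps the area honest: scale the diameter by the square root of the ratio.
honest_d = diameter_for_value(big_value, small_value, small_d)

print("area ratio, diameter scaled:", circle_area(naive_d) / circle_area(small_d))
print("area ratio, area scaled:    ", circle_area(honest_d) / circle_area(small_d))
```

With area scaling, the “100,000” circle needs a diameter only about 3.2 times the small one, not 10 times.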

End of Math Time

I’m not trying to be an a**hole here. The idea behind the visual was a good one. But these things really do need to be accurate. Most people don’t know how to tell when a visual is in error and they end up with an incorrect impression from a poorly built infographic.

Space Junk And Visual Lies

October 14, 2009

A little while back, due to a collision between a dead Russian military satellite and a US commercial satellite, there was some noise about space junk because of the potential danger it posed to the International Space Station and the Shuttle. The image of space junk that became the icon of the problem is this one:


I hate this image. Passionately.

The reason I hate this image is because it is probably the biggest visual lie I’ve ever seen. In his book The Visual Display of Quantitative Information, Edward Tufte has a concept called the “Lie Factor”. The “Lie Factor” judges the extent to which the data and the visual are out of sync.

Nothing could be more out of sync with reality than this image. While it imagines the appropriate number of objects circling the earth, it completely misrepresents the scale of those objects.

Space is unimaginably huge. While there are thousands of objects circling the earth, they range in size from a volleyball to a small school bus. If you do the calculations, the objects in this image range in size from Delaware to Tennessee.

Math Time! (skip if you don’t care)

In this image the diameter of the earth is about 1950 pixels. The real diameter of the earth is about 8000 miles. That means that every pixel is a shade over 4 miles.

The smallest piece of space junk in this image is about 10 pixels wide and 18 pixels tall and the largest one is about 24 pixels wide and 104 pixels tall. That gives the small objects an area of about 3000 square miles (about 30% larger than Delaware) and the large ones an area of 41,000 square miles (a shade smaller than Tennessee).
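The calculation above can be sketched in a few lines of Python. The pixel measurements are the ones I listed; the function name is mine, and treating each object as a width-by-height rectangle is a simplifying assumption. Using the unrounded scale gives slightly larger areas than rounding to about 4 miles per pixel, so expect the results to land near, not exactly on, the figures above.

```python
EARTH_DIAMETER_MILES = 8000    # approximate real diameter of the earth
EARTH_DIAMETER_PIXELS = 1950   # diameter of the earth as drawn in the image

# Scale of the image: how many miles a single pixel represents.
miles_per_pixel = EARTH_DIAMETER_MILES / EARTH_DIAMETER_PIXELS


def footprint_sq_miles(width_px: float, height_px: float) -> float:
    """Implied area of an object, treating it as a width x height rectangle."""
    return (width_px * miles_per_pixel) * (height_px * miles_per_pixel)


smallest = footprint_sq_miles(10, 18)    # smallest piece of junk in the image
largest = footprint_sq_miles(24, 104)    # largest piece of junk in the image

print(f"scale:    {miles_per_pixel:.2f} miles per pixel")
print(f"smallest: {smallest:,.0f} square miles")
print(f"largest:  {largest:,.0f} square miles")
```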

End of Math Time

To give an example of this exaggeration, let’s look at Angelina Jolie. (How’s that for a non sequitur?) Jolie has a freckle (beauty mark, mole, whatever) above her right eye.


Let’s say we’re concerned about people getting skin cancer, so we want to make a shocking graphic that we hope will help people remember to monitor skin markings for signs of melanoma. If we lied visually as much as the space junk photo, we would change a picture of Angelina Jolie from:




Imagine the Photoshop is done a shade better than I can do. The intention to do good and get people to realize the severity of melanoma is all well and good, but it doesn’t justify lying to people.

Granted, the space junk image holds the disclaimer that it is “an artists impression”. But that isn’t how people read these kinds of things and anyone who believes otherwise is, quite frankly, lying to themselves about the realities of human perception and belief. People see these images and they expect that they match reality in some way. Do a search for “space junk” to find out how many otherwise intelligent people have accepted this image as reality without a breath to admit how inaccurate it is.

This is not to say space junk isn’t a problem. I would have “solved” the problem of visual representation by portraying the space junk as a dot. A single pixel that can clearly indicate position instead of pretending to be a representation of size. Then, I would explain that, even though these objects are very tiny compared to the size of the space they’re in, this junk moves at thousands of miles an hour… making very small objects insanely dangerous.

You could effectively compare it to shooting a bullet into the air. A tiny piece of metal in a huge space can be really dangerous. People get that. There is no reason to portray the bullet as a 747.

I’m worried that even scientific people either didn’t recognize this problem or didn’t feel the need to speak up about it. Even people experienced in infographics didn’t say anything (see here, and here). (Side Note: I take particular pleasure in smacking down Wired magazine for putting up this graphic without even mentioning that it is an “artist rendering”. As a whole, they tend to be smug and irritating in the extent to which they dismiss anyone without technical or scientific expertise. Here they reveal that they are just as susceptible to junk science as the average Joe.)

There is an extent to which many people in scientific and technical journalism are content to give people the appropriate impression (“Space junk is a dangerous problem”) without providing them with the appropriate information. Or, to put the problem simply, they think the end justifies the means.

I take the view that truth in data is of the highest importance. I’m frustrated by how lonely it is out here on my high ground.

You may have seen the recent headline “Real US unemployment rate at 16 pct: Fed official”. A snippet:

“If one considers the people who would like a job but have stopped looking — so-called discouraged workers — and those who are working fewer hours than they want, the unemployment rate would move from the official 9.4 percent to 16 percent, said Atlanta Fed chief Dennis Lockhart.

UPDATE: Commenter Tom M. takes note that Mr. Lockhart is probably referring to the U6 numbers and this fact was simply not reported appropriately. He says:

When economists, such as myself, talk about the “real” unemployment rate, we are usually referring to the U6 unemployment figure, which is the U3 rate (the published/official rate) plus people that are “part time for economic reasons” among other groups.

If that is the case, it makes most of the rest of what I have to say pretty much void, but I’ll leave it up anyway. Thanks Tom!

A little while back, I called “discouraged workers” the “despair numbers” (basically, they say they want a job, but they aren’t looking for one).

My conclusion was that we’ve always had despair or discouraged workers, so suddenly adding them in now seems like a dishonest tactic to artificially inflate unemployment to some scary level. In good times, we saw unemployment at about 4-5%, so we’re used to thinking about that range as being good. But if you add the “discouraged workers” in those good times, you’re looking at a “good” unemployment rate of about 7-8%.

As for the “wants to work more hours” crowd, I’m open to considering that group in some way, shape or form, but I don’t know how to add them in a way that is honest. Frankly, as a small business owner and contractor, I don’t work as many hours as I would like. But I don’t go around calling myself “unemployed” or even “underemployed”.

If you look at the Bureau of Labor Statistics’ numbers on part time workers, you can see that the number has jumped about 3 million in the past year. If we add those workers plus the increase in the “discouraged workers” (about 1 million), we get a rate a little over 12%.

But the problem in my mind is that you can’t simply add part time workers to the “unemployed” list to get any kind of meaningful data. Maybe, for the sake of argument, you could cast an involuntary part time worker as half a worker. Then the unemployment rate is a shade over 11%. This is, I think, a not-unreasonable number to use, given that it shaves off the standard number of “discouraged workers” and uses a dampening variable to account for the fact that part-time workers aren’t really “unemployed”, but “underemployed”.
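Here is a rough Python sketch of those two adjustments. Big caveat: the labor force size (154 million) is my assumption and is not stated anywhere above, and for simplicity the sketch leaves the denominator unchanged when it adds discouraged workers. The function and variable names are made up for illustration. It should land in the neighborhood of the 12% and 11% figures, not reproduce them exactly.

```python
LABOR_FORCE = 154_000_000   # ASSUMPTION: approximate labor force, not from the post
OFFICIAL_RATE = 0.094       # the official 9.4 percent rate quoted above

officially_unemployed = OFFICIAL_RATE * LABOR_FORCE


def adjusted_rate(extra_part_time: float, extra_discouraged: float,
                  part_time_weight: float) -> float:
    """Unemployment rate after adding weighted part-timers and discouraged workers."""
    added = extra_part_time * part_time_weight + extra_discouraged
    return (officially_unemployed + added) / LABOR_FORCE


# Count every additional involuntary part-timer as fully unemployed.
full_count = adjusted_rate(3_000_000, 1_000_000, part_time_weight=1.0)
# Count every additional involuntary part-timer as half a worker.
half_count = adjusted_rate(3_000_000, 1_000_000, part_time_weight=0.5)

print(f"part-timers at full weight: {full_count:.1%}")
print(f"part-timers at half weight: {half_count:.1%}")
```

The half-worker weight is the “dampening variable”. Change `part_time_weight` and you can see how much the headline number depends on a choice that nobody can really defend as the one correct value.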

But I could be easily convinced that crunching the numbers in a new and interesting way is basically statistical cheating and we should just use the standard definitions.

Overall, I’m really uncomfortable with the whole “let’s crunch the numbers so the situation looks really terrible” methodology because all it does is try to cast the current situation in a bad light by changing the metric. But you can’t use one metric in the good times and another metric in the bad times.

As such, I think the 16% number is really more of a scare tactic than anything else.

You may have seen the Paul Krugman post “How Big is $9 Trillion” in which he attempts to defend the Obama administration’s recent announcement that they expect that their policies will increase the national debt by $9 trillion. His tack is to “explain” that $9 trillion isn’t really all that much when you understand it in context.

it’s being treated as an inconceivable sum, far beyond anything that could possibly be handled. And it isn’t.

What you have to bear in mind is that the economy — and hence the federal tax base — is enormous, too. Right now GDP is around $14 trillion. If economic growth averages 2.5% a year, which has been the norm, and inflation is 2% a year, which is the target (and which the bond market seems to believe), GDP will be around $22 trillion a decade from now. So we’re talking about adding debt that’s equal to around 40% of GDP.

Right now, federal debt is about 50% of GDP. So even if we do run these deficits, federal debt as a share of GDP will be substantially less than it was at the end of World War II.
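The arithmetic in the passage quoted above is easy enough to reproduce. This is a minimal Python sketch, assuming nominal growth is simply real growth plus inflation (4.5% a year), which is my reading of the quote and not something Krugman spells out. The function name is mine.

```python
def project_gdp(gdp_now: float, real_growth: float, inflation: float, years: int) -> float:
    """Project nominal GDP, assuming nominal growth = real growth + inflation."""
    nominal_growth = real_growth + inflation
    return gdp_now * (1 + nominal_growth) ** years


gdp_now = 14e12    # "around $14 trillion" (from the quote)
new_debt = 9e12    # the projected $9 trillion in added debt

gdp_later = project_gdp(gdp_now, real_growth=0.025, inflation=0.02, years=10)
debt_share = new_debt / gdp_later

print(f"projected GDP in a decade: ${gdp_later / 1e12:.1f} trillion")
print(f"new debt as a share of that GDP: {debt_share:.0%}")
```

So the numbers themselves hold up, give or take rounding.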

I defer to Paul Krugman on a lot of things because he is transparently smarter than I am. But it is precisely because of this fact that I know he is conscious of the obvious reasons his analysis is hogwash.

First of all, the national debt in WWII was initiated by an existential threat to the very continuation of our country. Mr. Krugman does not even attempt to make the case that we have a similar crisis that justifies this kind of debt.

Second, implicit in his observation is the concept that since we did fine after WWII, we’ll do fine now. But the years after WWII saw drastic reductions in the inflation-adjusted debt driven by drastic reductions in spending. Mr. Krugman points to no similar possibility in the post-Obama world.

Third, we have something now that we didn’t have in the 1940’s. Back in 1945, at the height of the spending that saw our national debt rise so dramatically, entitlement spending and interest on the national debt made up a meager 5% of our total budget.

By the end of President Obama’s term (if he serves two terms) we’ll be looking at a federal budget that is 70% mandatory spending. (I assume for the purposes of consistency that mandatory spending includes interest on the national debt because we don’t really have a choice in not paying it.)

Here’s a quick visual of the difference in the budgets in 1945 and 2016. (Ugly, because I did it fast… I’m on vacation.)

1945 vs 2016

If you look at the 1945 budget with the single question “How are we going to reduce our debt?” you can identify the major problem. It’s the defense budget, which is almost 90% of the budget. Interestingly, reducing the defense budget is exactly what we did in order to reduce the debt, cutting it over 80% in 3 years (it helped that we won the war).

As a contrast, President Obama’s solution to reducing overall spending is… well, I don’t think he really has a plan. His projected budget in 2016 has reduced the defense budget as a percentage of the overall budget from 20% to 14%, but military spending isn’t what is killing us. The president has no plans to reduce mandatory spending whatsoever. In fact, his only change to entitlement spending is to increase it.

My problem with Mr. Krugman’s “How big is $9 trillion?” is that he is aware of all the problems I pointed out. He didn’t explain how much $9 trillion is; he obfuscated it. By comparing the debt load in the heart of a world-shaking war to a debt load that was accumulated in (relative) peacetime, he has misled his readers about the real significance of the data.

(By the way… if you would like to blame the debt load on the Iraq war, you should know that those costs have raised our debt by 5% of the GDP. Comparing this to WWII, which raised our debt by 70% of the GDP, is a pretty weak argument.)

I’ve been pretty quiet recently because 1) I’m on vacation and 2) I’m trying to wrap my head around the health care issue before I talk about it at length.

But today I saw something on that bothered me:
Combined PPO

Here’s the thing, Mr. President. There is such a thing as visual lying. That is when you show a graph and you show the numbers but the two things are not in any way related to one another.

That is the problem here. If someone looks at this graph, they see that the sky is falling because the bars have increased so dramatically. On the left, your team has represented a 30% increase with a graphic that shows a 966% increase. On the right, your team has represented a 63% increase with a graphic that shows a 308% increase.
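Tufte’s “Lie Factor” puts a number on this: the size of the effect shown in the graphic divided by the size of the effect in the data. A value of 1 means the picture matches the numbers. Here is a quick Python sketch using the four percentages above (the function name is mine):

```python
def lie_factor(shown_increase_pct: float, actual_increase_pct: float) -> float:
    """Tufte's Lie Factor: effect shown in the graphic / effect in the data."""
    return shown_increase_pct / actual_increase_pct


left_bars = lie_factor(shown_increase_pct=966, actual_increase_pct=30)
right_bars = lie_factor(shown_increase_pct=308, actual_increase_pct=63)

print(f"left bars:  lie factor of {left_bars:.1f}")
print(f"right bars: lie factor of {right_bars:.1f}")
```

The left pair exaggerates the change by a factor of about 32 and the right pair by about 5, which is also why the two sets of bars can’t be compared to each other.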

And are the two sets of bars related in any way? You might think so, given that they show up next to each other and are supposed to measure the same thing. But from a data perspective, they are not even remotely close to being right.

It is possible to use graphs and numbers in such a way that is honest. That’s an important part of transparency. So, I fixed your graph for you.

You’re welcome.

UPDATE: In the comments section, James quickly identified the problem… the graph starts the y-axis at 1000 instead of 0. I double checked and it looks like he is spot on. Thanks!

With that in mind, the graph is more of a rookie mistake than a conscious attempt to deceive. I’ve edited my post to reflect that (I left my original comments in so everyone can see what a smart-ass I tend to be).
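To see how much damage a y-axis that starts at 1000 can do, here is a small Python sketch. The two values are hypothetical, picked by me so that they differ by exactly 30%; they are not the actual numbers from the chart. The function name is also mine.

```python
def apparent_increase_pct(old: float, new: float, baseline: float) -> float:
    """Percent increase in drawn bar height when the y-axis starts at `baseline`."""
    old_height = old - baseline
    new_height = new - baseline
    return (new_height / old_height - 1) * 100


# HYPOTHETICAL values, 30% apart, chosen only to illustrate the effect.
old_value, new_value = 1032.0, 1341.6

honest = apparent_increase_pct(old_value, new_value, baseline=0)
truncated = apparent_increase_pct(old_value, new_value, baseline=1000)

print(f"axis starting at 0:    bars grow {honest:.0f}%")
print(f"axis starting at 1000: bars grow {truncated:.0f}%")
```

The closer the smaller value sits to the baseline, the worse the exaggeration gets, which is how a modest increase can turn into a bar roughly ten times taller.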

I love the way Freddie introduced a post on abortion late last year. He titled it “you know what we don’t talk about enough? Abortion”.

I kind of feel the same way about the level of discussion going on with it. I probably would never have mentioned it at all on this blog if it hadn’t been for the incident on Sunday in which a man shot and killed Dr. George Tiller, “one of the nation’s few providers of late-term abortions”.

In the fourth paragraph of the AP article, I came across this line:

“But the doctor’s violent death was the latest in a string of shootings and bombings over two decades directed against abortion clinics doctors and staff.” 

After reading that, I decided to look into the statistics of abortion violence with a view toward perhaps creating a visualization about it.

Sadly, there are few things more skewed than abortion violence statistics. I found this pdf on “Abortion Violence and Disruption Statistics” done by the National Abortion Federation and it is mainly propaganda dressed in numbers. But it looks like their numbers on shootings and bombings are verified by legal authorities, so I assume they are pretty accurate. 

Let’s use those statistics to deal with the “string of shootings and bombings over two decades” that the AP talks about. (In order to give the AP the benefit of the doubt, let’s assume that all the “Attempted Murders” of abortion clinic staff involved shooting of some kind.)

According to the NAF document above, this is what the “string of shootings and bombings” looked like over the last 15 years:


Did you know that this is the first abortion related murder since 1998?

I didn’t.

I was under the impression from the AP that abortion killings were like school shootings… the kind of thing that we tragically see on an ongoing basis. (I thought about a graph comparing school violence to abortion violence, but it seemed kind of apples-to-oranges to compare sociopathic, psychotic and suicidal teenagers to politically motivated terrorists.)

Given the actual data, the characterization of this incident as “the latest in a string of shootings and bombings” is deeply dishonest. It embeds into people’s minds the idea that this is a very common tragedy, like school shootings, hurricanes or gang-related violence. In fact, until I looked at the data very recently, I was under exactly that impression. 

It would be much more accurate to say something along the lines of:

This incident has shattered an eight-year lull in anti-abortion related shootings, an activity that spiked to record levels in the 90’s.

UPDATE: Upon re-reading my post I realized that it sounded very dry and unfeeling… very matter-of-fact… when I talked about the recent murder. I hope no one got the impression that I’m wholly unfazed by this crime. Nothing could be further from the truth. I hope that the fact that I referred to crimes of violence against abortion clinics and the staff as acts of terrorism would indicate how I feel about the topic.

I noticed yesterday that a good number of people are getting worked up because it looks like a large number of the Chrysler dealerships that are being closed are heavy Republican donors. (Michelle Malkin does her usual roundup here)

I’m taking the time to try to do something that still seems somewhat lacking… run an actual statistical analysis of the data. I’ll post more when I get some real data, but I did want to put up a couple thoughts early on.

Thought 1: Megan McArdle says that this is likely a red herring. She points out that “Democratic and Republican dealers are unlikely to be found in the same place, and the rural counties that tend to be red are probably less profitable.  I would be less surprised to find out that the administration rescued specific donors from the hit list than to find that they deliberately closed Republican dealerships.”

If there was any behind the scenes work by the Obama administration, saving Obama dealerships seems more likely than spitefully killing Republican ones. And I think that we’ve got a pretty big “if” there to begin with.

Thought 2: All the skeptics to this story are pointing to Nate Silver’s “Car Dealerships are Republican (It’s Called a Control Group, People)”. Unfortunately for them, that post is a load of statistical garbage.

Nate is trying to establish a baseline of Republican-to-Democratic donations against which he can judge the validity of the data coming from the closed dealerships. This is a laudable goal, but I get really frustrated when people use statistical or mathematical terms and they don’t know what those terms mean. I’m starting to understand that people on both sides of the aisle use “science-y” or “math-y” words because it makes it look like they’re using science and can therefore be trusted. That’s exactly what is going on here.

Nate’s investigation does not a control group make for the following reasons:

  • There are really three categories here: Republican donor, Democratic donor, and not a donor. He doesn’t even recognize that the last category might exist.
  • He doesn’t make any distinction between Chrysler dealerships and other dealerships. Maybe Honda dealerships skew Republican and thereby mess up his “control group”. This is like testing a drug aimed at teenage girls and building a “control group” that includes toddlers, WWII veterans and 40-year-old soccer moms. His data is hopelessly polluted.
  • He assumes that everyone who owns a car dealership will list their occupation as car dealer (or some variant). Where I grew up, Hank Aaron owned a couple car dealerships, but I think it was unlikely he listed his occupation as “car dealer”. (If I got a business card from Hank Aaron, I would want it to say “Hank Aaron – Awesomest Person in the World… and Barry Bonds Can Die in a Ditch”)

Take your pick. I got more.

Thought 3: The fact that Nate Silver’s “analysis” is a load of crap doesn’t make the other analysis better… it just makes him something of an ass for pretending that he’s better than everyone else.

Example 1: Dan Collins says:

Statistics that are available suggest that Chrysler auto dealers donated 76% Republican and 24% Democratic.

Looks like someone else didn’t control for non-donating dealerships. (UPDATE: Dan Collins comments below that this statement was revised, although I still don’t see anyone taking into account non-donors.) 

Example 2: Doug Ross has a post called “Dealergate: Stats demonstrate that Chrysler Dealers likely shuttered on a partisan basis”. Towards the bottom, he has a “What Are The Odds” section in which he notices that one company, RLJ-McLarty-Landers, has six Chrysler dealerships that were not closed and claims that:

The approximate odds of such an occurrence can be calculated

He then proceeds to “calculate” those odds based on the assumption that the dealerships were closed at random.

His odds are meaningless. What if RLJ-McLarty-Landers happens to have remarkable market share? Or excellent customer service?

To posit an imperfect analogy, it’s like me being surprised when all the K-Marts in my area go out of business. So I do a statistical sampling of all local supermarkets and say “Ah-ha! All the Wal-Marts in the area didn’t go out of business… what are the odds of that?” And then I calculate the odds out and claim that there are nefarious plans afoot. (I love that word… afoot. Afoot, afoot, afoot.)
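To show why the random-closure assumption does all the work, here is a minimal Python sketch. Every number in it is hypothetical: the 25% closure rate and the 5% rate for strong performers are values I picked for illustration, not figures from Doug Ross’s post, and the function name is mine.

```python
def prob_all_survive(n_dealerships: int, closure_prob: float) -> float:
    """Chance that every one of n dealerships stays open, assuming each
    closes independently with the same probability."""
    return (1 - closure_prob) ** n_dealerships


# HYPOTHETICAL: closures are a coin-flip-style lottery hitting 25% of dealers.
if_random = prob_all_survive(6, closure_prob=0.25)
# HYPOTHETICAL: strong performers face only a 5% chance of being closed.
if_strong = prob_all_survive(6, closure_prob=0.05)

print(f"all six survive, closures random:    {if_random:.0%}")
print(f"all six survive, dealers are strong: {if_strong:.0%}")
```

Same six dealerships, wildly different odds. The calculation tells you about the assumption you fed it, not about whether anything nefarious happened.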

Thought 4: This smells like a conspiracy theory. I hate conspiracy theories. I lean toward believing that people, Republicans and Democrats, conservatives and liberals, are good people who are trying to do what they think is right.

On the other hand, if I had been editor at the Washington Post in the 70’s, I probably would have told Bob Woodward and Carl Bernstein that they were acting like crazy people.

I confess to a heavy skepticism. So I’m running the data as carefully as I can and I’ll post what I find. It might take a couple days, though.  I’m not quite ready to quit my job to chase this story full time.

If you’re looking for what seems to be the best work on this so far, it’s probably at the entertainingly named Chrysler Dealership Campaign Donation Information blog. Based off an extremely quick scan of the information, it looks like Joey Smith (the author) is trying to gather data in a meaningful way.

I’m currently watching two-week-old episodes of Red Eye with Greg Gutfeld on Hulu. If you like outrageous, off the wall humor in your news, you really can’t do better than this show. While “The Daily Show” and “The Colbert Report” take familiar cable news concepts and parody them, Gutfeld completely deconstructs those concepts. If he weren’t so libertarian, media professors would call his show a work of surreal genius. The show may not be as consistently funny as some others, but it is far less safe… you never know where they’re going to go and what they’re going to say when they get there.

Anyway… back to the numbers thing. They were talking about Dick Cheney’s interview with Bob Schieffer in which Cheney (in Greg’s words):

…insisted that enhanced interrogation saved a crapload of lives. That’s right, he said ‘crapload’.

OK, he didn’t, but he should have.

They then show the part where Cheney stated that:

“I am convinced, absolutely convinced, that we saved thousands, perhaps hundreds of thousands of lives.”

Now I don’t want to talk about the morality and ethics of enhanced interrogation, a topic about which I can’t even begin to talk intelligently.

But I do know a little something about numbers and I remember that, on 9/11, we were all terrified (or at least I was) when we heard how many people worked in the World Trade Center buildings. The number “50,000” was tossed around a good bit that morning. I was happily surprised when the final toll was drastically revised downward over the following several weeks.

Near as I can make it, the only way the Bush administration could have saved “possibly hundreds of thousands” of lives is if they stopped a nuclear attack in a major city. And I’m going to go ahead and say that the burden of proof on them is pretty heavy for something like that.

If you bust six guys drinking beer and talking about nuking LA, you probably didn’t save that many people. If, however, you bust six guys drinking beer and talking about nuking LA… and they have a dozen gas centrifuges in the basement enriching uranium, they’re still miles away from nuking LA, but at least you can make the case that you saved a crapload of lives by busting them.

Take note, I’m not at all against going after potential terrorists. I’m just against using numbers so carelessly that they lose their meaning. The “hundred thousand lives saved” is, as Kevin Godlington stated on the show, lunacy.

As a side note, Kevin Godlington is one of Red Eye’s best contributors. He is a British veteran who provides remarkable insight on the show and also works with military charities to help British and American soldiers deal with combat stress. I’ve had a couple people ask if they could donate to help my pro bono work here. If you’ve ever thought of doing so, donate to Kevin’s charity instead.