Lies, Damned Lies, and Politicians

I like politics; I don’t like all of the lying involved. If you ask me, I think that there should be “Ethics Committee” investigations into all of the lying. Sure, tweeting a picture of your junk is probably not the best idea, but neither is lying. And nearly all politicians are guilty [1]. Fortunately, the St. Petersburg Times started a website called Politifact [2] in the hope of keeping some of these people honest. I’m not sure it’s helping.

In any case, I wrote an R script to scrape the data from Politifact so that I could do some analysis.  I only got about as far as the following figure related to some of the Republican candidates and their propensity to lie.  The figure displays the number of statements made by each candidate that can be categorized as “True”, “Mostly True”, …, “False”, or “Pants on Fire” according to Politifact.

What can we take from this? Well, Michele Bachmann is a big-time liar; ain’t no denying that. She’s also a freaking nutjob. Ron Paul probably lies the least, but nobody in the media seems to care about him. Tim Pawlenty doesn’t lie too much. Then again, he’s a wuss and dropped out anyway. Mitt Romney seems pretty good when it comes to speaking the truth; it’s gotta be the Mormon background. I suspect that he’ll lie a bit more in the upcoming months. And Rick Perry…well, he’s just bat-shit crazy, so I’ll ignore him.

If I had the time, I would randomly select some Republicans and Democrats from both the House and Senate and analyse whether statement category (“True” through “Pants on Fire”) is independent of political party and/or chamber of Congress. If you’re interested in doing this, I would be happy to help you get started. Have a look at my github repo for this project and give it a go!
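
A minimal sketch of what that independence test could look like, written as a chi-squared test on a party-by-category table. Everything here is an assumption for illustration: it is in Python rather than the R used for the scraping, the counts are invented (they are NOT Politifact data), the function name chi_squared_independence is hypothetical, and only the four ratings named above are included; any other ratings would simply be extra columns.

```python
# Chi-squared test of independence on a small contingency table.
# The counts are made up for illustration; they are NOT Politifact data.

CATEGORIES = ["True", "Mostly True", "False", "Pants on Fire"]

# Rows: party. Columns: number of statements in each category above.
observed = {
    "Republican": [20, 25, 30, 15],
    "Democrat": [25, 30, 25, 10],
}


def chi_squared_independence(table):
    """Return (statistic, degrees_of_freedom) for a dict of equal-length rows."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for row, row_total in zip(rows, row_totals):
        for count, col_total in zip(row, col_totals):
            # Expected count if party and category were independent.
            expected = row_total * col_total / grand_total
            statistic += (count - expected) ** 2 / expected

    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof


statistic, dof = chi_squared_independence(observed)

# Standard chi-squared critical value for alpha = 0.05 with 3 degrees of freedom.
CRITICAL_VALUE_05_DF3 = 7.815

print(f"chi-squared = {statistic:.3f} on {dof} degrees of freedom")
if statistic > CRITICAL_VALUE_05_DF3:
    print("Reject independence at the 5% level")
else:
    print("No evidence against independence at the 5% level")
```

With real data you would probably reach for a library routine instead, such as chisq.test in R or scipy.stats.chi2_contingency in Python, which also report a p-value. Keep in mind that the test assumes the checked statements are something like a random sample, which is exactly the selection-bias question raised in the comments.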


[1] – Note that I said ‘nearly all’ because Dennis Kucinich doesn’t have a single statement classified as “Pants on Fire” on Politifact.

[2] – Winner of a Pulitzer Prize in 2009.



Filed under Data Mining, Politics, R, Scraping Data

11 responses to “Lies, Damned Lies, and Politicians”

  1. Do you think the issue of “selection bias” might impact the conclusions you’re attempting to draw from the PolitiFact data?

    • Ryan

      Sure, but it’s impossible for Politifact to check every statement made by every politician. Here is a link to Politifact’s “Principles”; the section “Choosing claims to check” is particularly relevant:

      I’m assuming that they are simply trying to be an unbiased resource for fact checking. I can’t possibly check whether or not their analysis of a statement was correctly classified either…nor would I want to dig that deep w/out being paid!

      • It’s not impossible for PolitiFact to randomly check Politicians’ claims. And that would validate the conclusions you draw. But, as you say, they probably aim to choose statements according to the criteria they set forth. On the other hand, PolitiFact actively solicits readers’ suggestions over which facts to check.
        The problem with your assumption about PolitiFact’s aims is that even if they aim to serve as an unbiased source, bias may remain. The aim, in the end, isn’t relevant without a set of standards that actually produces that result. And it’s a bit misleading to suggest the conclusion you’re suggesting without covering those bases, at least with some sort of caveat. But I’m pleased to find that you at least take the trouble to address the problems in your subsequent comments.

      • Ryan

        Agreed; it’s not impossible. However, they probably didn’t start out with the expectation that they were going to do statistical inference either. They are simply a group of journalists who are trying to do some fact checking. They defined their criteria and try to stick within those rules. I’m not too concerned with a liberal vs conservative bias here; I might be concerned if I was comparing any of these candidates to someone on the left. One could argue that maybe they have a bias against Bachmann and just select her crazy statements in order to make the other candidates look better. However, I doubt a rational person could make that claim with a straight face.

  2. Chris Dzombak

    It looks like your Github repo only contains a project skeleton and the finished figure, not the actual code used to generate it. I’d be interested in playing with it if the code were on there!

  3. love the Rage ElmStreet

  4. Raphael

    I think you either need to set the Y-axis scales to “free” or rescale to percent. Because when you’re trying to compare politicians, you shouldn’t be biased by the number of reviews each one receives.

    • Ryan

      Fair points. Given that the total number of reviews for each candidate is not too dissimilar, I went with the raw counts. If I changed to %age, then you lose that bit of information and it induces another bias in that you might think that, e.g., Newt Gingrich has as many total reviews as Mitt Romney. Similarly for freeing up the y axis. Thanks for the comment.
