My graduate professor, good friend, and author of The Myth of the Rational Voter Bryan Caplan (corrected regressions here) has been jousting with The Hidden Agenda of the Political Mind author Jason Weeden about the role of self-interest in political preferences. Weeden’s claim is that self-interest reigns supreme, while Caplan finds that group or social interest tends to dominate (at least for most questions; there are exceptions). Yesterday on Twitter, I promised Bryan I’d redo the regressions he posted at Econlog (see the second and third links above), except using LDV (limited dependent variable) models.
For those of you not super duper into fancy-pants statistics, what I’m doing is fiddling with the implied underlying relationship. An OLS regression fits the dependent variable to the independent variables in a linear relationship, sort of like how the angle of the sun and your height jointly determine the length of your shadow. A probit model instead models the probability of a binary outcome using the cumulative normal (Gaussian, if that word rings a bell) distribution, so its predictions stay between 0 and 1. Typically, when we’ve got a binary (yes|no) or ordered categorical (low|medium|high) outcome, I favor using probit (for binary) or ordered probit (for ordered categories) regressions. That way, when I look at the margins, the software package returns conditional probability estimates without any additional fiddling. This feature appeals to someone as naturally lazy as me.
I will admit, though, that lazy as I am, I do try to make it a point to clean up my data before I jump in. The GSS is notoriously dirty, so here are some of the manipulations I performed before attempting to reproduce the Caplan results:
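If the OLS-versus-probit distinction is still fuzzy, here is a minimal sketch in Python (not Stata, and not anything I actually ran on the GSS). The intercept and slope are made-up numbers for illustration only. It shows the one thing that matters here: a straight line can predict "probabilities" below 0 or above 1, while the probit link squeezes everything through the normal CDF.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def linear_prediction(intercept: float, slope: float, x: float) -> float:
    """Linear probability model: the prediction is the index itself."""
    return intercept + slope * x


def probit_probability(intercept: float, slope: float, x: float) -> float:
    """Probit: the same linear index, passed through the normal CDF."""
    return normal_cdf(intercept + slope * x)


# Hypothetical coefficients, chosen only to make the contrast visible.
INTERCEPT, SLOPE = 0.2, 0.15
xs = [-10, 0, 10]

linear = [linear_prediction(INTERCEPT, SLOPE, x) for x in xs]
probit = [probit_probability(INTERCEPT, SLOPE, x) for x in xs]

for x, lin, pro in zip(xs, linear, probit):
    print(f"x={x:>3}  linear={lin:+.3f}  probit={pro:.3f}")
```

The linear predictions at the extremes fall outside the unit interval; the probit ones cannot.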
- When reconstructing NUMMENBIN and NUMWOMENBIN, I omitted the categories for “dash or slash”; “some, 1+”; “x”; “garbled text”; “several”; “many, lots”; “n.a.”; and “refused.” n=363
- I omitted similar categories for the PARTNERS and PARTNRS5 variables. “1 or more, dk #” is coded as “9”, which would skew results (“more than 100 partners” is coded as 8). n=135
- In the BIBLE variable, one response is “other.” I ran regressions both with and without this category; the tables below have it included. For the probit regressions, I converted it to an indicator variable anyway. n=365
- I had an old variable for CATHOLIC sitting around. My hunch is that even if Catholics aren’t particularly pious, they’ll still respect the prohibition on abortion.
- I used my immigrant category variable too. But I’m still honing my assimilation axe, so it’s pulling double duty here.
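The gist of that cleanup, as a rough Python sketch rather than the Stata I actually used: drop any row where a partner-count variable carries a code that isn't a real bin. The only codes confirmed above are 9 ("1 or more, dk #") and 8 ("more than 100 partners"); the sample rows and the function name are invented for illustration.

```python
# Codes that do not represent a usable bin. Only 9 is confirmed by the
# text above; extend this set for other junk categories as needed.
UNUSABLE_CODES = {9}


def drop_unusable(rows, columns, unusable=UNUSABLE_CODES):
    """Keep rows whose listed columns are present and not sentinel codes."""
    kept = []
    for row in rows:
        values = [row.get(col) for col in columns]
        if all(v is not None and v not in unusable for v in values):
            kept.append(row)
    return kept


# Made-up respondents, not GSS data.
sample = [
    {"partners": 2, "partnrs5": 3},
    {"partners": 9, "partnrs5": 1},     # "1 or more, dk #": dropped
    {"partners": 8, "partnrs5": 4},     # ">100 partners" is a real bin: kept
    {"partners": 1, "partnrs5": None},  # missing: dropped
]

cleaned = drop_unusable(sample, ["partners", "partnrs5"])
print(len(sample) - len(cleaned), "rows dropped")
```

Leaving the 9s in would put "don't know how many" above "more than 100" on the same scale, which is exactly the skew to avoid.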
Anyway, Bryan used the built-in regression function in the GSS (it’s just a blog debate, after all), so some of these data cleanup decisions will influence the differences you’ll see between our coefficients. Here’s the first one:
Linear regression
Number of obs = 12170
F(8, 12161) = 85.15
Prob > F = 0.0000
R-squared = 0.0521
Root MSE = .48414
abany | Coef. | Robust Std. Err. | t | P>t | Beta |
partners | -.0059775 | .0068631 | -0.87 | 0.384 | -.0115607 |
partnrs5 | -.004042 | .0047348 | -0.85 | 0.393 | -.012975 |
nummenbin | -.0518576 | .0029526 | -17.56 | 0.000 | -.219523 |
numwomenbin | -.0392227 | .0030954 | -12.67 | 0.000 | -.1982399 |
age | -.0014184 | .0015366 | -0.92 | 0.356 | -.0474394 |
age2 | .0000202 | .0000151 | 1.34 | 0.181 | .067471 |
year | .0012656 | .0006964 | 1.82 | 0.069 | .0161483 |
sex | -.0019417 | .0152212 | -0.13 | 0.898 | -.0019403 |
_cons | -.7817866 | 1.394001 | -0.56 | 0.575 | . |
As you can see, the signs are pretty much all the same. The biggest betas (standardized coefficients, useful for comparing variables measured on different scales) belong to the lifetime-partner counts, NUMMENBIN and NUMWOMENBIN. More promiscuous people are more likely to favor fewer restrictions on abortion. Let’s add POLVIEWS.
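For anyone wondering where a beta comes from: it's the raw coefficient rescaled by the standard deviations of the regressor and the outcome, beta = b × sd(x) / sd(y). Here's a small Python sketch with one regressor and toy numbers (nothing from the GSS), where the beta conveniently equals the plain correlation. With several regressors that shortcut no longer holds, but the rescaling formula is the same.

```python
import math
import statistics


def ols_slope(x, y):
    """Slope of a one-regressor OLS fit: cov(x, y) / var(x)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def standardized_beta(slope, x, y):
    """Rescale a raw slope into standard-deviation units."""
    return slope * statistics.stdev(x) / statistics.stdev(y)


def pearson_r(x, y):
    """Pearson correlation, computed by hand to avoid version-specific helpers."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy data: x is a binned count, y is a 1/2-coded answer.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [2, 2, 1, 2, 1, 1, 2, 1]

slope = ols_slope(x, y)
beta = standardized_beta(slope, x, y)
r = pearson_r(x, y)
print(f"slope={slope:.4f}  beta={beta:.4f}  r={r:.4f}")
```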
Linear regression
Number of obs = 11796
F(9, 11786) = 201.03
Prob > F = 0.0000
R-squared = 0.1109
Root MSE = .46938
abany | Coef. | Robust Std. Err. | t | P>t | Beta |
partners | -.0034251 | .0067884 | -0.50 | 0.614 | -.0066103 |
partnrs5 | -.0009183 | .0046737 | -0.20 | 0.844 | -.0029471 |
nummenbin | -.0414392 | .0029118 | -14.23 | 0.000 | -.1758403 |
numwomenbin | -.0345913 | .0030325 | -11.41 | 0.000 | -.1751062 |
age | -.0034264 | .0015312 | -2.24 | 0.025 | -.1140498 |
age2 | .0000352 | .0000151 | 2.33 | 0.020 | .1167815 |
year | .0015212 | .0006864 | 2.22 | 0.027 | .0193799 |
sex | .001876 | .0148176 | 0.13 | 0.899 | .0018739 |
polviews | .0875726 | .0030081 | 29.11 | 0.000 | .2486036 |
_cons | -1.641724 | 1.37364 | -1.20 | 0.232 | . |
It looks like the lifetime partners variable took a bit of a hit, and the political views picked up a lot of the slack. The R-squared went from 0.05 to 0.11. That’s a heck of an improvement for adding just one variable. Cool. The results still broadly track with Caplan’s. Now to add ATTEND and BIBLE.
Linear regression
Number of obs = 9094
F(11, 9082) = 298.88
Prob > F = 0.0000
R-squared = 0.1993
Root MSE = .44503
abany | Coef. | Robust Std. Err. | t | P>t | Beta |
partners | -.006108 | .0073149 | -0.84 | 0.404 | -.0119162 |
partnrs5 | -.0009723 | .0050626 | -0.19 | 0.848 | -.0031223 |
nummenbin | -.025976 | .0031276 | -8.31 | 0.000 | -.1108654 |
numwomenbin | -.0200344 | .0033034 | -6.06 | 0.000 | -.1015676 |
age | -.0060242 | .001662 | -3.62 | 0.000 | -.2017812 |
age2 | .0000525 | .0000164 | 3.20 | 0.001 | .1752903 |
year | .0011088 | .0007756 | 1.43 | 0.153 | .0135621 |
sex | -.0337992 | .0160843 | -2.10 | 0.036 | -.0337808 |
polviews | .0649188 | .0034196 | 18.98 | 0.000 | .1858515 |
attend | .0315904 | .0019421 | 16.27 | 0.000 | .1742591 |
bible | -.1370039 | .0072629 | -18.86 | 0.000 | -.2012948 |
_cons | -.4886072 | 1.553844 | -0.31 | 0.753 | . |
So the remaining winner-winner-chicken-dinners in the Big Beta Horse Race are belief in a literal interpretation of the Bible, political views, age, church attendance, and then the total number of people bedded lifetime. Adding in CATHOLIC and PARTYID (political party affiliation) further reduced the influence of NUM[WO]MENBIN down to about -0.1 and increased R-squared to 0.20. Adding highest degree completed and log income, per Weeden, had similar effects, dragging the beta on sex partners down to -0.95 and boosting the R-squared to 0.23. After all this, I’m using 8385 observations.
At any rate, I am able to more-or-less reproduce Caplan’s results, which is encouraging. On to the fun task of digging a little closer to see what the margins are doing. Let me give you the Stata command for the probit regression, since the output table will be large, unwieldy, and perhaps misleading if what you’re used to seeing is OLS output tables.
With probit, if you want to run margin details on incremental or categorical variables, you have to convert them to indicators. So instead of using AGE and AGE2, I will use i.AGE, which gives each distinct age its own indicator variable (leaving one out as the base category). This is handy because it doesn’t assume a quadratic relationship, but the tradeoff is that it reduces the power of the test; it eats degrees of freedom. Still, we do what we must because we can. Here’s the command (DV converted from {1,2} to {0,1}):
probit abanya i.partners i.partnrs5 i.nummenbin i.numwomenbin i.age year i.sex i.polviews i.attend i.bible i.partyid catholic i.degree loginc if partners!=9 & partnrs5!=9 & nummenbin!=9 & numwomenbin!=9, vce(robust)
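Two bits of that command are worth spelling out: the {1,2} to {0,1} recode of the DV, and what the i. prefix does to a variable like AGE. Below is a hedged Python sketch of both, not the Stata internals. The direction of the recode (1 "yes" becomes 0, 2 "no" becomes 1) is my assumption for illustration, and the function names are made up.

```python
def recode_dv(value):
    """Map a {1, 2} answer to {0, 1}; anything else is treated as missing.

    Assumed direction: 1 ("yes") -> 0, 2 ("no") -> 1.
    """
    return {1: 0, 2: 1}.get(value)


def indicator_columns(values, base=None):
    """Build one 0/1 column per distinct level, omitting a base level.

    Mirrors the idea behind factor-variable notation: the lowest level
    is the omitted reference category unless another base is given.
    """
    levels = sorted(set(values))
    if base is None:
        base = levels[0]
    return {
        level: [1 if v == level else 0 for v in values]
        for level in levels
        if level != base
    }


ages = [18, 19, 19, 40]            # toy ages, not GSS data
dummies = indicator_columns(ages)  # columns for 19 and 40; 18 is the base

print(sorted(dummies))
print([recode_dv(v) for v in [1, 2, 0]])
```

Every distinct age costs one degree of freedom, which is the power tradeoff mentioned above.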
You might still want the big numbers, so here you go:
Number of observations = 8384
Wald chi-square(134) = 1918.92
Prob > chi-square = 0.0000
Pseudo R-square = 0.2024
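So that’s good. The pseudo R-squared is about 0.20, right in the neighborhood of the OLS R-squared, though keep in mind that a pseudo R-squared measures improvement in log-likelihood over an intercept-only model, not literally the share of variation explained. Not bad. Let’s look at some of the margins.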
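For the curious, the pseudo R-squared Stata reports after probit is McFadden's: one minus the ratio of the fitted model's log-likelihood to that of an intercept-only model. Here's a minimal Python sketch with toy outcomes and toy predicted probabilities (none of it from the GSS; it assumes the sample contains both 0s and 1s and that no predicted probability is exactly 0 or 1).

```python
import math


def log_likelihood(outcomes, probs):
    """Bernoulli log-likelihood of 0/1 outcomes given predicted P(y=1)."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(outcomes, probs)
    )


def mcfadden_pseudo_r2(outcomes, probs):
    """1 - LL(model) / LL(null), where the null predicts the base rate."""
    base_rate = sum(outcomes) / len(outcomes)
    ll_null = log_likelihood(outcomes, [base_rate] * len(outcomes))
    ll_model = log_likelihood(outcomes, probs)
    return 1.0 - ll_model / ll_null


outcomes = [1, 0, 1, 0]
no_information = [0.5, 0.5, 0.5, 0.5]  # same as the null model
some_information = [0.9, 0.1, 0.8, 0.2]

r2_null = mcfadden_pseudo_r2(outcomes, no_information)
r2_model = mcfadden_pseudo_r2(outcomes, some_information)
print(f"null-like model: {r2_null:.3f}, informative model: {r2_model:.3f}")
```

A model that does no better than the base rate scores zero; better-fitting probabilities push the number toward one.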
POLVIEWS first:
I apologize for the image quality. I’m having a few minor technical difficulties. Anyway, we can see pretty much the Caplan story here: the more strongly conservative a respondent is, the more likely they are to say that a woman should not be able to get a legal abortion for any reason she wants.
Not pictured: margins by age. The rough 50/50 split is preserved regardless of the age of the respondent.
Also not pictured: Catholic. it’s what you’d expect.
BIBLE:
As expected, folks who believe the Bible to be the Word of God are more likely to favor abortion restrictions.
ATTEND:
Ditto church attendance.
DEGREE:
More college, more willingness to allow abortion (or greater nuance in interpreting the survey question). This effect isn’t quite as strong as for the church variables. Note the Y axis and the relative width of the error bars.
Now for the self-interested hypotheses. The claim is that men and women who sleep around should prefer easier access to abortion. NUMMENBIN and NUMWOMENBIN should slope down if this hypothesis is correct.
NUMMENBIN:
To be fair, if you ignore the “8” bin (sex with over 100 men, lifetime), there does seem to be a bit of a downward slope. Let’s omit that category.
>100 partners omitted:
There we go. That’s better. That’s consistent with the self-interest hypothesis. The marginsplot for NUMWOMENBIN is a lot less pronounced, but if you squint, there’s sort of a general downward slope, even if significance tests would reject category-to-category differences. At any rate, NUMMENBIN appears to support the Weeden claim.
Let’s break it down a little further. The church stuff seemed important, so let’s see how promiscuity and piety interact.
NUMMENBIN and BIBLE
Either way, it looks like, at least for the pious, religion dominates. How about education?
Same basic result. There’s more of a gap between each series than there is along the series. There’s a .18 drop from 1 partner to 21-100 partners and .23 between “less than high school” and “graduate degree.” Yes, if I were writing a proper paper here, I’d do more sophisticated analysis, but the basic horse race ends up supporting the conclusion that education is a stronger determinant of beliefs about abortion than an individual’s promiscuity.
Okay, last one. How do political views interact with promiscuity? This would clutter the heck out of a graph, so I’ll re-code one more variable, POLVIEWSA, to lump all liberals and all conservatives together.
Same basic idea. The gap between liberal and conservative is plainly obvious, but you have to get over to the >5 partners bins to see much change in opinion.
Okay, I admit it. My curiosity is piqued. I’m going to redo this, but I want to drop the bins and look at the raw numbers. If you’re an econometrician, please refrain from sending me any nasty messages; I know what I’m doing here is wrong. Technically wrong, anyway. Thanks.
Expanded:
Ha ha ha. Wow. Okay, I guess you can go ahead and tell me off if you really want to. That’s just silly. It’s a bit weird to think of folks who lean conservative who have >100 sex partners anyway. Unless you identify libertarian and happen to be a sex worker. I can ~maybe~ see that. Still, not a lot of useful information out in that right tail.
So what have I learned with this little exercise? Well, I can’t conclusively reject the hypothesis that self-interest has nothing to do with respondents’ position on abortion, though I also have to admit that the number of sexual partners is perhaps a flaccid proxy for the variable of interest here. What I can say is that compared to other standard measures of group or sociotropic interest, promiscuity carries relatively little explanatory power. If I wanted to give someone an elevator pitch about what motivates voters, I’d probably stick with something like “they believe in what’s good for society for the most part.”
This was fun. I didn’t livetweet it for a change. Anyway, there’s probably still some good fruit in there. But I’m up to like 1500 words, so that’s probably enough for now.
And for those of you wondering why I’m posting this at Sweet Talk instead of Spivonomy or Euvoluntary Exchange? Well, that’s my business, but suffice it to say I have good reasons that make sense to me.
Dictated but not read. 15 Jan 2015.
“Catholic. it’s what you’d expect.”
I’m going to have to use that somewhere.
Just now realized I squandered a perfectly good Spanish Inquisition joke. Dang it.
“Nobody expects the … wait, no, everyone expected that.”
Uh, as i understand the statistics on who is getting abortions, the definition of ‘self-interest’ is pretty off. It’s not the promiscuous single ladies who are getting most abortions, it’s mothers. (http://www.guttmacher.org/in-the-know/characteristics.html) and women in committed relationships (http://www.guttmacher.org/pubs/US-Abortion-Patients.pdf).
(which perhaps you acknowledged already with the ‘flaccid proxy’ comment)
Correct. The variable choice was not mine; I was merely improving upon the econometrics of the discussion.