Shock Polls aren't really so Shocking

Every now and again (roughly every 5 minutes), Twitter will go wild with a headline about some "Shock Poll" that is intended to overturn your very conception of reality.  Most of the time, they're shocking because they're badly done, or because the error bars are huge, or sometimes because nothing is wrong at all and they're just a good old-fashioned outlier.  As recent examples, we have claims that it's a dead heat in the Oregon Governor's race, that the New Jersey Senate race is within a few points, or, on the other side, that the Democrat is on track to win head-to-head in the Mississippi Senate race.  I'm not going to hold my breath.

Often, poll reporting is governed by bad faith.  Take the recent spate of stories about Rasmussen's claim that Trump's support among African Americans had doubled, to 36% (here's USA Today reporting this awful polling credulously).  Given how awful Trump has been on racial issues, that would be surprising, and indeed, it's not true (though in the interest of full disclosure, Harry Enten, one of the few good writers over at CNN, found that Trump's approval among African Americans is closer to 12 or 13%, up from about 9). In this case, the reporting is most definitely done in bad faith, as Rasmussen has a definite right-wing agenda, and defying conventional wisdom will get them more views.

Don't look at Numbers Independently

Even when they're conducted in statistically robust ways, shock polls can be constructed in bad faith, with the pollsters knowing full well how the poll is going to come out. Last year, the Post reported on a poll where over 50% of Republicans would favor postponing the 2020 election if Trump asked.  Or, more recently, Ipsos reported that 43% of Republicans would support Trump's ability to shut down newspapers that are critical of him.

That's really bad.  Until you think about it for a moment.  How many of those Republicans really believe that, and how many are just trying to be loyal?  And how different are those results from other odious positions that people might be backed into a corner to believe?

To delve in, we need to go beyond the toplines into the "crosstabs" of a poll, which also allow you to see how the questions are actually stated (protip: Don't believe any poll results that don't allow you to delve into the crosstabs). 

Here is a simple, non-controversial, but totally fabricated poll. I imagine asking two related questions:
  1. Do you approve of Trump? (Y/N)
  2. Do you think the Republicans or Democrats should be in charge of Congress? (This is the so-called generic congressional ballot question.)
You would expect that most people who say yes to the first would say they'll vote for a Republican, and most No responders will vote Dem.  Here are my fake numbers:

                       Vote R   Vote D
Approve of Trump        40%       0%
Disapprove of Trump      5%      55%

You'll note that all of the possible combinations add up to 100%.  In real surveys, there are some undecideds or not sures, or some other nuanced responses.  As a practical matter, those numbers are either small, or the fact that there are so many undecideds is the story.  For my analysis, I simply apportion them proportionally.
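Here's a minimal Python sketch of what I mean by apportioning proportionally. The function name `apportion` is my own invention for illustration, and the inputs are the Reuters/Ipsos agree/disagree numbers quoted later in this post; treat it as one reasonable way to do the rescaling, not the only one.

```python
def apportion(yes, no):
    """Rescale yes/no shares to sum to 1, spreading the undecideds proportionally."""
    decided = yes + no
    if decided <= 0:
        raise ValueError("need at least some decided respondents")
    return yes / decided, no / decided

# Reuters/Ipsos newspaper question: 43% agree / 36% disagree among Republicans,
# 12% agree / 74% disagree among Democrats (the rest are undecided or other).
rep_yes, rep_no = apportion(0.43, 0.36)
dem_yes, dem_no = apportion(0.12, 0.74)

print(f"Republicans: {rep_yes:.1%} yes, {rep_no:.1%} no")
print(f"Democrats:   {dem_yes:.1%} yes, {dem_no:.1%} no")
```

The undecideds simply drop out of the denominator, so each group's yes and no shares keep their original ratio.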

The total Trump approval is about 40%, a pretty realistic number. The Dems also lead in the generic ballot by about 10 points – also realistic.    But there's more to the story.  In this simulation, while all Trump approvers will vote Republican, some small fraction of disapprovers will as well.  On the other hand, no Trump approvers will vote Dem. There's an asymmetry to preferences.
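If you want to check those toplines yourself, here is a small Python sketch that sums the made-up table along each question. The dictionary layout and variable names are just my choices for illustration.

```python
# Joint shares from the made-up poll: (Trump answer, vote) -> share of all respondents
table = {
    ("approve", "R"): 0.40,
    ("approve", "D"): 0.00,
    ("disapprove", "R"): 0.05,
    ("disapprove", "D"): 0.55,
}

# Toplines are the sums over the other question
approval = sum(share for (q1, _), share in table.items() if q1 == "approve")
vote_r = sum(share for (_, q2), share in table.items() if q2 == "R")
vote_d = sum(share for (_, q2), share in table.items() if q2 == "D")
dem_lead = vote_d - vote_r

print(f"Trump approval: {approval:.0%}")
print(f"Generic ballot: D {vote_d:.0%}, R {vote_r:.0%}, lead {dem_lead:.0%}")
```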

The Bad Faith Matrix

There's a quantitative way of describing all the ways that two groups of responders will respond to a second question, when asked.

Behold! The bad faith matrix:

             $Y(Q2)$                     $N(Q2)$
$Y(Q1)$      $p\left(1+s-a\right)$       $p\left(a-s\right)$
$N(Q1)$      $(1-p)\left(s+a\right)$     $(1-p)\left(1-s-a\right)$

Before you navigate away, let me explain.  Suppose you have a survey with two questions.  Question #1 is about something fundamental to identity.  It might be something like, "Do you support Trump or not?" or "Are you a Republican or a Democrat?"  All of the "yeses" to the first question can be put in one pile (group A), representing a fraction, $p$, of the population.  The other pile, the "noes," is group B, the remaining $1-p$.

The groups then answer a second question, one which we maybe expect to correlate to the first.  So, for instance, if you ask Trump supporters if they want Republicans to win the House, you largely expect them to say yes, and no for Trump detractors.

But the correlation isn't perfect.  There are two numbers floating around,
  1. $s$, which tells you how universally more or less popular the second question is than the first, and 
  2. $a$, which tells you how asymmetric the two groups are when it comes to changing their answer.
In the example above, with the made-up poll, $s=4\%$ (Republicans are a little more popular than Trump on average) and $a=4\%$, a relatively small number, so the first question really is a proxy for the second.
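For the curious, here's a rough Python sketch of how those two numbers can be backed out of a 2x2 table. It takes the No row of the matrix as given above, $(1-p)(s+a)$ and $(1-p)(1-s-a)$, and assumes by symmetry that the Yes row is $p(1+s-a)$ and $p(a-s)$. That second part is my assumption, as are the function names `bad_faith_matrix` and `fit_s_a`.

```python
def bad_faith_matrix(p, s, a):
    """Joint shares [[Y1&Y2, Y1&N2], [N1&Y2, N1&N2]] under the assumed parameterization."""
    return [
        [p * (1 + s - a), p * (a - s)],              # group A (Yes to Q1), assumed form
        [(1 - p) * (s + a), (1 - p) * (1 - s - a)],  # group B (No to Q1)
    ]

def fit_s_a(yes_yes, yes_no, no_yes, no_no):
    """Invert the matrix: recover p, s, a from the four joint shares."""
    p = yes_yes + yes_no
    defect_a = yes_no / p                  # share of group A answering No to Q2 (= a - s)
    defect_b = no_yes / (no_yes + no_no)   # share of group B answering Yes to Q2 (= a + s)
    s = (defect_b - defect_a) / 2
    a = (defect_b + defect_a) / 2
    return p, s, a

# The made-up poll: approve&R, approve&D, disapprove&R, disapprove&D
p, s, a = fit_s_a(0.40, 0.00, 0.05, 0.55)
print(f"p = {p:.0%}, s = {s:.1%}, a = {a:.1%}")
```

Feeding the recovered `p`, `s`, and `a` back into `bad_faith_matrix` reproduces the original table, which is a handy check that the inversion is consistent.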

It makes more sense graphically:

The dark grey areas represent parameters that don't work because they produce a negative probability.  The light grey region is possible, but weird, because it corresponds to more people in group B than in group A answering Yes to the second question.  It's like a bunch of Trump approvers saying they're going to vote Dem while the disapprovers vote Republican.
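Those regions can be written down as a few inequalities. The Python sketch below is my reading of them, assuming group A answers No to the second question at a rate of $a-s$ and group B answers Yes at a rate of $s+a$; the function name `classify` and the region labels are hypothetical.

```python
def classify(s, a):
    """Label a point in the (s, a) plane under the assumed parameterization."""
    defect_a = a - s  # group A answering No to Q2
    defect_b = a + s  # group B answering Yes to Q2
    # Dark grey: one of the rates is not a valid probability
    if not (0 <= defect_a <= 1 and 0 <= defect_b <= 1):
        return "impossible"
    # Light grey: group B says Yes more often than group A does
    if defect_b > 1 - defect_a:
        return "weird"
    return "ordinary"

print(classify(0.04, 0.04))  # the made-up poll
print(classify(0.20, 0.10))  # s bigger than a: group A's No rate goes negative
print(classify(0.00, 0.60))  # both groups mostly flip
```

Under these assumptions the weird region is simply $a > 1/2$, and the two lower edges are the lines $a = s$ and $a = -s$, where one group never defects at all.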

The lower left and right edges get to the heart of the matter.  If a survey question produces a result near there, it's entirely possible that respondents are simply answering in that way to demonstrate their loyalty.  When Trump supporters say that they would favor Trump shutting down newspapers or postponing elections, they're being awful, sure, but more relevantly, they're just trying to prove to the pollster that they see through the questions and want to demonstrate, conclusively, that they support the Republicans.  And in that sense, the answer itself doesn't really tell you much.  And, cynically, I think the pollsters know this, and they ask it anyway so that they can get a widely shared result.

So our analysis is going to cut to the heart of the matter: Are people being sincere, or are they just being partisan?

A few legit questions, and a few in bad faith

Consider a few real recent polling results:
  • Quinnipiac (Aug 13, Question 9): Found that 4% of Republicans thought Dems would do a better job with the economy, while 9% of Democrats thought the opposite.  An insane proposition, especially given their tax bill, but that means that $a=13\%$ (modest party crossing), and $s=5\%$ (Republicans' financial stewardship is viewed more positively than Republicans as a whole).  
  • Quinnipiac (Question 11): The reverse seems to be true for healthcare, with 13% of Republicans trusting Dems, and only 5% of Dems trusting Republicans ($s=-8\%$, $a=18\%$).  In other words, you can't put too much stock in the absolute numbers, but you can use this to show that, on the whole, healthcare is a better issue for the Dems, and the economy (as unjust as it is; I mean, did you live through the Obama presidency, people?) is a better issue for the Republicans.
Here are some that aren't asked in such good faith:
  • Reuters/Ipsos (Question 3.6): "The President should have the authority to close news outlets engaged in bad behavior." 43% of Republicans agreed, and 36% disagreed, a net of 7%.  12% of Democrats agreed, and 74% disagreed.
  • Reuters/Ipsos (Question 1.1): "Do you have a favorable or unfavorable opinion of the following? - New York Times" Here, a "Yes" (the Republican-like response) is any of the unfavorable responses, totaling 35%, while a "No" is any of the favorable responses, totaling 56%.  You can see how this (and the others) play out in the chart below.
  • Reuters/Ipsos (Question 5.5): "The News Media is the Enemy of the People."
  • Fox News (Question 30): "Do you approve or disapprove of [the Mueller investigation]?" (They give a much wordier description that almost borders on push-polling).
Remember, points toward the left mean that the opinion is unpopular.

We could keep adding more, but you get the point. When the headline reads that 40% of Republicans believe that Democrats should be denied the right to vote, take a breath and think about what it means, besides their awfulness.  For one, 40% of Republicans represent only about 20% of the population.  But more significantly, it seems that for certain outrageous claims, even they don't believe it.

It's unsurprising that there's an asymmetry in what people think, and that it is largely governed by ideology.  However, note where those points lie: along the lower left part of the box.  That's the part where even significant fractions of Republicans know that the Republican position is wrong, but some fraction assert it anyway.

But for now, I simply want you to keep in mind that shock polls that are intended to be shocking really only tell us one thing: there are large swathes of the population who view secondary questions on polls as nothing more than a loyalty test.