State and National Polls are Telling Different Stories... but not in the way you'd think.

Biden's head-to-head state polling, relative to national polling. 

There has been a lot of hand-wringing about the possibility that whoever ends up winning the Democratic nomination could win the popular vote by even more than Hillary Clinton did in 2016 (by as much as 4 or 5 million votes), and yet still fail to win the Electoral College. But, as with shark attacks or lightning strikes, is this really something that we should be actively fretting about?

The argument basically goes like this:

Dems are making big gains in the suburbs, and so they might run up their numbers in the Northeast, on the West Coast, and in Colorado, all while still losing in places like WI. Most of those high educational attainment states are blue already (so the argument goes), so additional gains there don't change the final outcome at all.

From Wikipedia: Map of states percentage of population 25 years old with bachelor's degree or higher in 2009.

But it's also true that FL, PA, and TX (all of which went to Trump in 2016) have relatively high numbers of college graduates, and thus are likely to see average or above-average swings. It's reasonable to suppose that improved Dem performance in the suburbs and with college-educated white voters won't just improve the map, but will scramble it as well.

But a plausibility argument actually isn't all that informative. In most of the states that matter, we already have polling, which means we don't have to make a handwaving argument about how much shifts among college-educated or non-college-educated white voters may swing the election. We can ask directly.

Don't read too much into this, but for convenience, let's look at Biden's numbers, since he currently leads in the national primary polls (Note: Not an endorsement).

As a quick reminder, my model is pretty simple: At the state level, I generally use polls from the 538 database that they rate as a B or better. I put in a time filter, so that polls older than a month are phased out. For states that haven't been polled (or have only been polled a long time ago), I use the national polling to "correct" the 2016 result by adding a fixed swing across the board. I then add realistic noise to each state (based on sampling) and then add a possibility that all polls are off systematically.
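For readers who want to see the moving parts, here is a minimal sketch of that kind of model. It is not my actual code: the function names, the linear phase-out of old polls, and the noise sizes (`state_sd`, `systematic_sd`) are all assumptions chosen for illustration, and the rating filter on polls is left out entirely.

```python
import random
import statistics


def recency_weighted_margin(polls, max_age_days=30):
    """Average poll margins, fading each poll out linearly with age.

    polls: list of (margin_in_points, age_in_days) tuples.
    The linear fade is an assumption; any decreasing weight would do.
    Returns None when no poll is recent enough to carry weight.
    """
    weights = [max(0.0, 1.0 - age / max_age_days) for _, age in polls]
    total = sum(weights)
    if total == 0:
        return None
    return sum(m * w for (m, _), w in zip(polls, weights)) / total


def uniform_swing(margin_2016, national_now, national_2016):
    """Fallback for unpolled states: shift the 2016 margin by the national swing."""
    return margin_2016 + (national_now - national_2016)


def simulate(margins, electoral_votes, n_sims=10_000,
             state_sd=3.0, systematic_sd=2.0, seed=0):
    """Monte Carlo over state margins (positive = Dem lead, in points).

    Each simulation draws one shared error (all polls off together) plus
    independent per-state noise. Both standard deviations are assumed values.
    Returns (win probability, median electoral votes).
    """
    rng = random.Random(seed)
    needed = sum(electoral_votes.values()) // 2 + 1  # 270 when the total is 538
    totals = []
    for _ in range(n_sims):
        shared_error = rng.gauss(0.0, systematic_sd)
        won = 0
        for state, margin in margins.items():
            if margin + shared_error + rng.gauss(0.0, state_sd) > 0:
                won += electoral_votes[state]
        totals.append(won)
    win_probability = sum(t >= needed for t in totals) / n_sims
    return win_probability, statistics.median(totals)
```

In use, each state's margin would come from `recency_weighted_margin` when recent polls exist and from `uniform_swing` otherwise, and the resulting dictionary would be passed to `simulate`. The shared error term is what keeps the win probability from being overconfident: without it, state misses would be treated as independent and would mostly cancel out.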

Here's Biden's current map:

You can choose to believe the model or not, but I say if the election were Trump v. Biden, and it were held tomorrow, based on what we know now, Biden would have a 98% chance of victory, with a median of 394 electoral votes. 

In the 10 most likely swing states (all carried by Trump in 2016):
  • PA (20) +8.9% 
  • MI (16) +7.2% 
  • AZ (11) +6.4% 
  • FL (29) +6.3% 
  • GA (16) +4.3% 
  • NC (15) +2.5% 
  • OH (18) +2.3% 
  • TX (38) +1.5% 
  • WI (10) +1.5% 
  • IA (6) -1.2% 
To put this in perspective, Hillary was 38 electoral votes shy of winning the presidency in 2016. PA, MI, and AZ alone would put Biden over the top (even if Trump somehow manages to flip NH, with only 4 electoral votes).
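The arithmetic behind that claim is easy to check. This is just a sketch using the figures quoted above (the 38-vote shortfall and the electoral vote counts from the list); the variable names are mine.

```python
NEEDED = 270                      # majority of 538 electoral votes
clinton_2016 = NEEDED - 38        # "38 shy" of winning, per the text

flips = {"PA": 20, "MI": 16, "AZ": 11}   # electoral votes from the list above
nh_loss = 4                               # worst case: Trump flips NH

total = clinton_2016 + sum(flips.values()) - nh_loss
print(total, total >= NEEDED)
```

Those three states add 47 electoral votes, so even after giving up New Hampshire the total stays above the threshold.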

We can even be a little more rigorous about this. Let's take a look at a plot of Biden and Warren's improvement over Clinton state-by-state (where there's polling) compared to educational attainment (percent holding bachelor's degrees):

The size of the points scales with the number of voters per state. If you think there's no clear relationship between educational attainment and improvement over 2016, then you're right. The correlation is almost exactly zero. The only thing you might be able to detect is that, at present, Biden does a bit better, state by state, than Warren does.
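If you want to run this kind of check yourself, one reasonable way is a correlation weighted by the number of voters in each state, so that big states count for more, just as they do in the plot. The sketch below is an assumption about method, not a description of how the number above was computed; passing equal weights gives the ordinary Pearson correlation.

```python
import math


def weighted_correlation(x, y, weights):
    """Pearson correlation of x and y, with each point weighted (e.g. by voters)."""
    total = sum(weights)
    mean_x = sum(w * xi for xi, w in zip(x, weights)) / total
    mean_y = sum(w * yi for yi, w in zip(y, weights)) / total
    cov = sum(w * (xi - mean_x) * (yi - mean_y)
              for xi, yi, w in zip(x, y, weights)) / total
    var_x = sum(w * (xi - mean_x) ** 2 for xi, w in zip(x, weights)) / total
    var_y = sum(w * (yi - mean_y) ** 2 for yi, w in zip(y, weights)) / total
    return cov / math.sqrt(var_x * var_y)
```

Here `x` would be the percent of each state holding bachelor's degrees, `y` the candidate's improvement over Clinton's 2016 margin, and `weights` the 2016 turnout. A value near zero means that knowing a state's educational attainment tells you essentially nothing about how much it has swung.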

Looking at the same thing on a map, we can plot Biden's relative improvement (red means less than average, blue means more than average) compared to Clinton in 2016:

Wisconsin does, indeed, seem to be giving us some trouble.

The point is that there's no particular reason to suppose that the college/noncollege divide is going to create additional problems above and beyond those highlighted by the 2016 election.

But the bigger issue is that there seems to be a disconnect between state and national polling. Here's a time-series of the big picture numbers of this model, taken over the last 4 months:

You'll note that I made two estimates of the "national trend." One is based on simple national polls. The other (dotted) is based on estimates from only those states where there is state polling and assuming that the weighted swing is correct. It may be early, but there are high-quality polls in states representing 35% of the population. Here's what I find. Biden's national lead over Trump is:
  • 12.6% - National Polling 
  • 6.4% - National Estimate from State Polling 
  • 9.4% - National Estimate from State Polling excluding California 
You see, while there's very little polling in CA, it is very high quality, and it predicts no gains at all from 2016. In some sense (at least based on the data so far), that makes the hand-wringing interpretation, that huge gains in CA are going to pad the numbers while Dems still lose the election, seem a little off base. Two things stand out:
  1. State polling doesn't really seem consistent with national polling. This may be because of a flaw in one or the other, or because of which states have been polled so far, but as I noted earlier, we're polling a large fraction of the country, so it seems odd for the two to be so far apart. 
  2. Big, liberal states like California don't seem to be swinging the popular vote. 
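To make the "national estimate from state polling" concrete, here is a rough sketch of the calculation as described: take the swing since 2016 in each polled state, average those swings weighted by votes cast, and add the result to the 2016 national margin. The function name, the data layout, and the toy numbers in the usage note are all hypothetical.

```python
def national_from_states(states, national_2016, exclude=()):
    """Estimate the national margin from state polls via a vote-weighted swing.

    states: {name: (poll_margin, margin_2016, votes_2016)}, margins in points.
    national_2016: the 2016 national margin in points.
    exclude: state names to leave out of the average (e.g. one large outlier).
    """
    used = {name: v for name, v in states.items() if name not in exclude}
    total_votes = sum(votes for _, _, votes in used.values())
    if total_votes == 0:
        raise ValueError("no polled states left to average over")
    swing = sum((poll - past) * votes
                for poll, past, votes in used.values()) / total_votes
    return national_2016 + swing
```

With two made-up states, `{"A": (10.0, 5.0, 100), "B": (30.0, 30.0, 300)}`, and a 2016 national margin of 2.0, the estimate is 3.25. Excluding the large zero-swing state B raises it to 7.0, which is the same effect California has on the numbers above: one big state with no movement pulls the weighted swing down sharply.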
All of this is a fairly convoluted way of saying that while, yes, we need to be looking at the state maps, it's foolish to fret over how individual demographics might turn out. We care about the total.