Less Stressful than the Needle

Introducing the awesome new "House Results" gadget

I've made a new "House Results Widget", and, full disclosure, it is making me insanely optimistic for the election this November.  The Results Widget is my version of the infamous "needle."  In essence, it takes in all of the information we have at any given point (right now, just history and polls) and turns it into an estimate for what is going to happen to the US House when the dust settles.  And at this point, I'm predicting a better than 90% chance that the Dems take control.  I'll write a technical post shortly about why that should be, but for now, just enjoy randomizing the universe by pressing the reload button and looking at predictions for your favorite district.

On election night, the model will keep refining itself as new results come in, ultimately converging on the final outcome.  The model itself is very simple, assuming only:
  • As a baseline, most districts are going to be something like their historical trends.  I start with the average of the 2012 and 2016 presidential results in each district, with the national average subtracted to give a "neutral" prediction.
  • Polls of individual Congressional Districts are super useful. I know, I know, "The polls were all wrong in 2016" -- except they weren't*.  My model uses all public polling and measures how much better or worse each district is performing relative to the baseline. Then I allow for the possibility that it's all gone to hell and that something like 2016 happens again: all of the polls may be off together by about 3 points or so in either direction.
  • My impression is that a lot of modelers try to treat every district as a unique flower.  The track record for that is mixed, at best.  Instead, my model is based on only 3 numbers:
    • An incumbency advantage
    • The swing from 2012->2016 (which should capture lots of other information like the rural/suburban divide)
    • A generic bump across the board.
That's it.  I take the polls, find the best fit to the 3 numbers above, and that's the model.  I then simulate the election a few thousand times (historical data suggest that candidate quality and the details of a district can make results bounce around an ideal model by about 8%).  Each realization gives me a House map.
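For the curious, here is a minimal sketch of what a three-number fit like this could look like. It is not the widget's actual code. The linear form, the +1/-1/0 coding for incumbency, the use of margins in points, and every function and variable name are my assumptions for illustration, and the inputs are synthetic.

```python
import numpy as np

def neutral_baseline(margin_2012, margin_2016, national_2012, national_2016):
    """Average the two presidential margins, then subtract the national average."""
    district_avg = (np.asarray(margin_2012) + np.asarray(margin_2016)) / 2.0
    national_avg = (national_2012 + national_2016) / 2.0
    return district_avg - national_avg

def fit_three_numbers(baseline, incumbency, swing, poll_margin):
    """Least-squares fit of (incumbency advantage, swing weight, generic bump).

    Assumed form, using polled districts only:
        poll_margin - baseline = p0 * incumbency + p1 * swing + p2
    incumbency: +1 Dem incumbent, -1 Rep incumbent, 0 open seat (assumed coding).
    """
    X = np.column_stack([incumbency, swing, np.ones(len(swing))])
    y = np.asarray(poll_margin) - np.asarray(baseline)
    params, *_ = np.linalg.lstsq(X, y, rcond=None)
    return params

# Synthetic, made-up inputs purely to exercise the functions.
rng = np.random.default_rng(1)
n_polled = 40
m12 = rng.normal(0.0, 15.0, n_polled)        # fake 2012 margins (points)
m16 = m12 + rng.normal(0.0, 6.0, n_polled)   # fake 2016 margins (points)
base = neutral_baseline(m12, m16, national_2012=4.0, national_2016=2.0)
inc = rng.choice([-1.0, 0.0, 1.0], n_polled)
swing = m16 - m12

# Noise-free fake polls built from known parameters, so the fit can be checked.
true_params = np.array([5.0, 0.3, 12.0])
polls = base + true_params[0] * inc + true_params[1] * swing + true_params[2]
fitted = fit_three_numbers(base, inc, swing, polls)
print(fitted)
```

Because the fake polls contain no noise, the fit should hand back the parameters that generated them. Real polls are noisy, and a more careful version would weight each poll by its sample size.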

I can then make predictions in two ways: 1) a simple "winner take all" prediction (every district goes to whichever candidate is predicted to get more than 50%), and 2) a distribution of possibilities (the bell curve in the image at the top), where the outcomes are correlated.
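The difference between the two kinds of prediction is easier to see in code. The sketch below is an illustration under my own assumptions, not the widget's implementation: it works with Dem margins in points (so "more than 50%" becomes "margin above zero"), it correlates districts through a single shared polling shift per simulation, and it uses the 8-point district scatter and 3-point systematic error mentioned in this post as defaults. All names are hypothetical.

```python
import numpy as np

MAJORITY = 218  # seats needed for control, as stated in the post

def winner_take_all(expected_margin):
    """Count districts where the expected Dem margin is positive."""
    return int(np.sum(np.asarray(expected_margin) > 0))

def simulate_seats(expected_margin, n_sims=4000,
                   sigma_district=8.0, sigma_global=3.0, seed=0):
    """Return the Dem seat count for each simulated House map.

    Each simulation draws ONE shared shift (the "all polls are off"
    scenario) plus independent noise per district, which is what makes
    the district outcomes correlated.
    """
    rng = np.random.default_rng(seed)
    expected_margin = np.asarray(expected_margin, dtype=float)
    shared = rng.normal(0.0, sigma_global, size=(n_sims, 1))
    local = rng.normal(0.0, sigma_district,
                       size=(n_sims, expected_margin.size))
    return np.sum(expected_margin + shared + local > 0, axis=1)

def summarize(seats):
    """Win probability and median seat count across simulations."""
    seats = np.asarray(seats)
    return {
        "win_probability": float(np.mean(seats >= MAJORITY)),
        "median_seats": float(np.median(seats)),
    }

# Made-up expected margins for 435 districts, only to run the functions.
expected = np.random.default_rng(2).normal(2.0, 20.0, 435)
seats = simulate_seats(expected)
summary = summarize(seats)
print(winner_take_all(expected), summary)
```

The winner-take-all count and the simulated median need not agree, since the first ignores how close each race is and the second does not.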

For my polls, I use every B- or better pollster (according to 538) from the last 60 days, and by default exclude internal polls. And what does it predict?
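The poll selection rule can be sketched in a few lines. The Poll record, its field names, the district labels, and the dates below are all hypothetical; only the rule itself (B- or better, last 60 days, internals excluded by default) comes from this post.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Pollster letter grades ordered from best to worst.
GRADES = ["A+", "A", "A-", "B+", "B", "B-",
          "C+", "C", "C-", "D+", "D", "D-", "F"]

@dataclass
class Poll:
    district: str          # placeholder district label
    pollster_grade: str    # letter grade of the pollster
    end_date: date         # last day the poll was in the field
    internal: bool         # True for campaign/party internal polls
    dem_margin: float      # Dem minus Rep, in points

def select_polls(polls, today, window_days=60, min_grade="B-",
                 include_internal=False):
    """Keep polls that are recent enough, well graded, and not internal."""
    cutoff = today - timedelta(days=window_days)
    allowed = set(GRADES[: GRADES.index(min_grade) + 1])
    return [
        p for p in polls
        if p.pollster_grade in allowed
        and p.end_date >= cutoff
        and (include_internal or not p.internal)
    ]

# Made-up polls and an arbitrary reference date, for illustration only.
today = date(2018, 9, 1)
polls = [
    Poll("XX-01", "A-", date(2018, 8, 20), False, 4.0),   # kept
    Poll("XX-01", "C+", date(2018, 8, 25), False, -2.0),  # grade too low
    Poll("XX-02", "B-", date(2018, 6, 1), False, 1.0),    # too old
    Poll("XX-03", "B", date(2018, 8, 30), True, 9.0),     # internal
]
kept = select_polls(polls, today)
print([p.district for p in kept])
```

The window_days argument plays the same role as the twindow flag described further down.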
  •  Incumbents have an advantage of 5% (a little lower than in 2016)
  • A "generic ballot" of about 12% (a bit higher than the generic congressional ballot questions on most national polls)**
  • In the WTA model, Dems win about 233 seats (218 are required for a majority).
  • In the simulations, Dems win the House 97% of the time.
  • In the simulations, Dems win a median of 240 seats.
All very promising, though you should recall the models in 2016 which predicted a 99.9% chance of Dems winning the presidency.  You can, if you like, make some assumptions by use of flags in the URL:

  • ?p0=# -- Set the incumbency advantage (ignoring the best fit from the polls)
  • ?p2=# -- Set the GCB (generic congressional ballot) number
  • ?twindow=# -- Set the window (in days) of how far back you want to look at polls  
  • ?sigmaG=# -- Set the typical size (in %) of the "all polls could be off by a large number" parameter
So, for instance, this uses the same incumbency advantage as in 2016, assumes that the somewhat more modest generic ballot at 538 (8%) is correct, and predicts only a 62% win probability.  On the other hand, this only looks at polls over the last month, but assumes a much wider range in polling misses (5%; remember, that would be the average, and 2016, which was supposed to be a horrible mistake, was only ~3% off at the statewide level).  Supposing the polls nail it (1% average error, which also assumes nothing changes between now and election day, not something supported by history), Dems win ~100% of the time.
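If you want to combine several flags, a small helper like the one below can build the URL. The base address here is a placeholder, not the widget's real address, and joining flags with "&" is the standard query-string convention rather than something I have spelled out above. Only the four flag names come from the list.

```python
from urllib.parse import urlencode

# Placeholder address; substitute the real widget URL.
BASE_URL = "https://example.com/house-widget"

# The four flags listed in this post.
ALLOWED_FLAGS = {"p0", "p2", "twindow", "sigmaG"}

def widget_url(**flags):
    """Build a widget URL from keyword flags, rejecting unknown names."""
    unknown = set(flags) - ALLOWED_FLAGS
    if unknown:
        raise ValueError(f"unknown flags: {sorted(unknown)}")
    query = urlencode(flags)
    return f"{BASE_URL}?{query}" if query else BASE_URL

# Polls from the last month only, with a wider 5% systematic error.
print(widget_url(twindow=30, sigmaG=5))
# Assume the polls nail it: 1% systematic error.
print(widget_url(sigmaG=1))
```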

Point is, I've loaded it with reasonable, but not unimpeachable assumptions.  Play around and give me your thoughts.


* 538 took a look at the polling misses and, crunching the numbers, found that the average state polling error in 2016 was about 3%, roughly on par with the 2 point swing that other studies found in the last 2 weeks of the 2016 election (thanks, Comey). This is a systematic error, and we can't predict it.  For all we know, this year's congressional races will produce a systematic error toward the Dems. However, the model assumes that a typical range of systematic errors is about 3 points, and therefore that the 2016 election miss was on the high end.

** But think about it. What is a generic candidate?  You vote for a specific person in a specific district. Asking about that specific race is a much better way to get at the real predictive information.
