### How many votes do you need to win?

An interesting math election problem came to my attention recently:
How many votes do you need to get to win an election in which voters get N votes (and the top N candidates win) and in which there are M people on the ballot?
That was a bit of a mouthful, but the idea is pretty straightforward. We basically want to know how well you need to do to win an election, assuming there are no runoffs.  An example may help.

In Philadelphia, we are mostly a 1-party (Democratic) town and so many elections are basically settled in the primary.  For our City Council, for instance, 7 councilors are selected "At Large" which means that they represent and are elected by the entire city.  However, the At Large members are guaranteed to have 5 members from the majority party (Dems), and 2 from the minority (GOP), via a Plurality At-Large (or "Block Vote") voting scheme.  Dems can cast up to 5 votes, and may not cast two for the same candidate.  The top 5 vote-getters win, regardless of whether anyone has an absolute majority.  As a practical matter, people typically only vote for 2 or 3.

We have lots of elections like this in Philly, including for judges (which have their own complications) and the Commissioner race (2 slots, 2 votes for the Dems).  And, of course, there's the limiting case where you have 1 vote, and 1 nominee.

So let's say that you're a candidate for Commissioner, or Controller or At Large City Councilmember, and there are a total of, say, 8 people up for the job.  What fraction of the vote do you need to get in order to be one of the winners?

This obviously depends a fair amount on the distribution of the votes.  You might have 8 candidates for DA, but 1 or 2 of them may be dominant, or the votes may be split more evenly.  We want to get a sense of what does happen rather than what could happen.

There's one limiting case: If you have 2 candidates for 1 slot, you need 50%.  That's just a hard and fast rule.  In general, we could imagine that the more people there are in the race (larger $M$), the lower the vote fraction you might need.  After all, if the vote is split up more ways, it's probably easier to win.  Likewise, races with fewer "winners" (smaller $N$) will require a higher fraction of the vote to win.

In order to answer this phenomenologically, I took the last 4 Democratic primaries of municipal elections in Philly (going back a decade or so), and looked at how many candidates there were, and compared that to what fraction of votes would have gotten a hypothetical candidate (randomly replacing a real candidate) across the line.  In particular, I looked at:
• District Attorney, Controller (N=1)
• Commissioner (N=2)
• At Large City Council (N=5)
I then modeled the actual results as a second-order series in $N-1$ and $M-2$, with $f$ the percentage of the vote needed to win:

$$f=c_0+c_1\cdot (N-1)+c_2\cdot (M-2)+c_3 (N-1)^2+c_4 (M-2)^2+c_5 (N-1)\cdot (M-2)$$

With 12 races, I fit 6 variables, so clearly there's at least some fear of garbage-in, garbage-out.  Still, I found some pretty good results.  You'll note, by the way, that this fit means that $c_0$ should be 50%, since it corresponds to $N=1$ and $M=2$, where all other terms vanish.  The variables (for the record or if you'd like to plug in an election of your own) are:

• $c_0=$ 51.0 (hey! Pretty close to 50.  I'm not crazy!)
• $c_1=$-29.1
• $c_2=$-2.9
• $c_3=$4.2
• $c_4=$-0.05
• $c_5=$1.0
Thus, for instance, for a winner-take-all race ($N=1$) with $M$ candidates, every term containing $(N-1)$ vanishes, and dropping the small $c_4$ term we'd expect:

$$f\simeq 51-2.9\cdot (M-2)$$

To win a race with 4 candidates (at least from these types of elections), you might expect to need about 45% of the vote.  As expected, more hats in the ring mean you need to get slightly fewer votes to win.
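If you'd like to plug in an election of your own, here is a minimal Python sketch of the full fit, including the cross terms and the small $c_4$ term.  The coefficients are the ones quoted above; the function name `vote_share_needed` and the example races are just illustrative choices of mine, and the result is only as trustworthy as the 12-race fit behind it.

```python
# Fit coefficients quoted in the post, in percent of the vote.
C0, C1, C2, C3, C4, C5 = 51.0, -29.1, -2.9, 4.2, -0.05, 1.0


def vote_share_needed(n_seats, n_candidates):
    """Estimated percentage of the vote needed to win.

    n_seats      -- N, the number of winners (and votes per voter)
    n_candidates -- M, the number of people on the ballot
    """
    n = n_seats - 1        # the series is expanded around N = 1
    m = n_candidates - 2   # ... and around M = 2
    return C0 + C1 * n + C2 * m + C3 * n**2 + C4 * m**2 + C5 * n * m


if __name__ == "__main__":
    # Hypothetical races: (seats, candidates on the ballot).
    for seats, candidates in [(1, 2), (1, 4), (2, 8)]:
        share = vote_share_needed(seats, candidates)
        print(f"N={seats}, M={candidates}: about {share:.1f}% of the vote")
```

Keep in mind that the fit only saw $N=1$, $2$, and $5$, so values far outside the range of the input races are extrapolations.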

Does it work?  Well, you tell me.  The lines in the plot above are the theoretical results modeled from the data (the color tells you what type of election).  But that simply means that we've fit the data we put in, not that we've found some general rule for the universe.

As a minimal sanity check, I put in some Republican data: the last 3 At Large primaries for City Council (in which they get 2 seats).  The results are the 3 red stars in the plot above.  2 of the 3 are very good.  The 2011 results were off by about 8%, which in this range, is substantial.  In a better model, I'd include error bars, plus maybe something reflecting whether one or more candidates are incumbents.  For today, however, I figured some rules of thumb would be good enough.