I'm going to make an attempt at predicting the outcome of the US presidential election. This is the first time I've done this, and a lot of what I'm doing is an experiment, as far as I'm concerned.
Many other sites already aggregate polls, so instead of reinventing the wheel, I've decided to aggregate the aggregators. I use the state polling averages from FiveThirtyEight, Princeton Election Consortium, Votamatic, PollyVote and HuffPost Pollster. Some of these sites include third parties; others do not. For the sites that don't publish third-party averages, I use FiveThirtyEight's state averages for the third parties, then scale that site's two-party average proportionally to fit the remaining percentage of the vote.
For example, suppose PollyVote had a Republican:Democratic vote share of 52:48 for, say, Arizona, and FiveThirtyEight had Johnson at 10%. Only 90% of the vote is left after Johnson is accounted for, so my model divides that 90% between the two parties in the same 52:48 ratio. The Republicans get 52% of 90%, or 46.8%, and the Democrats get 48% of 90%, or 43.2%. It isn't perfect, but it allows for averaging across all the sites. More importantly, what matters in the model is the margin of the Democrats over the Republicans, not so much the specific percentages.
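The arithmetic in the Arizona example can be written as a small Python sketch. The function name `rescale_two_party` is my own invention for illustration, not something taken from the model itself.

```python
def rescale_two_party(rep_share, dem_share, third_party_share):
    """Scale a two-party split to fit the vote left after third parties.

    All arguments and return values are percentages (0 to 100).
    """
    remaining = 100.0 - third_party_share
    two_party_total = rep_share + dem_share
    # Keep the R:D ratio the same, but shrink both to the remaining vote.
    rep = rep_share / two_party_total * remaining
    dem = dem_share / two_party_total * remaining
    return rep, dem


# The Arizona example: a 52:48 split with Johnson at 10%.
rep, dem = rescale_two_party(52.0, 48.0, 10.0)
print(round(rep, 1), round(dem, 1))  # 46.8 43.2
print(round(dem - rep, 1))           # Democratic margin: -3.6
```

Note that the Democratic margin shrinks from -4 to -3.6 points after rescaling, which is the quantity the model actually uses.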
The model also adds uncertainty for the time remaining: it uses the average variance in the polls since 1980 between the current month and election day. As the election gets closer, this uncertainty will diminish. The deviation in the averages comes from the inherent variation in the state polls (where available), as well as the variability in the Democratic-Republican margin in each state over the last four elections. Finally, I add a measure of uncertainty in polling accuracy based on history. Generally, polls have been within +/- 2 percentage points of the winner's final vote share.
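How these separate sources of uncertainty get merged into one number is not spelled out here. One common approach, which is an assumption on my part and only valid if the sources are independent, is to add the variances and take the square root. The function name `combined_std` and the input values are purely illustrative.

```python
import math


def combined_std(*components):
    """Combine independent standard deviations by adding their variances."""
    return math.sqrt(sum(c ** 2 for c in components))


# Illustrative inputs only (not real figures from the model):
# a 3-point spread from one source and a 4-point spread from another.
print(combined_std(3.0, 4.0))  # 5.0
```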
I then run 1000 random simulations for each state, using the Democratic margin and a standard deviation that reflects all of these levels of variability, and determine the likelihood of the Democrat winning that state.
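A minimal sketch of that simulation step for a single state might look like this. It assumes the simulated margin is drawn from a normal distribution, which is my assumption rather than something stated above, and the name `dem_win_probability` is hypothetical.

```python
import random


def dem_win_probability(margin, std_dev, n_sims=1000, seed=None):
    """Estimate the chance the Democrat wins a state.

    margin:  Democratic lead over the Republican, in percentage points.
    std_dev: combined standard deviation for that state, in points.
    Assumes normally distributed outcomes (an illustrative assumption).
    """
    rng = random.Random(seed)
    # Count the simulations in which the Democrat finishes ahead.
    wins = sum(1 for _ in range(n_sims) if rng.gauss(margin, std_dev) > 0)
    return wins / n_sims


# A state where the Democrat trails by 3.6 points with a 5-point spread.
print(dem_win_probability(-3.6, 5.0, seed=42))
```

With only 1000 draws the result will wobble by a point or two from run to run, so fixing `seed` is useful when comparing one day's numbers with the next.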
In the map below, the three red categories are states that favor the Republicans: "Safe Republican" (the darkest red), "Likely Republican" and "Leaning Republican". The yellow states are considered toss-ups at this point. The three blue categories are "Safe Democrat" (the darkest blue), "Likely Democrat" and "Leaning Democrat".
If you hover your mouse over any state, it will show the state name, the number of electoral college votes in that state and the projected probability of the Democrats winning it.
The pie chart shows the likely electoral college totals predicted for each party.
Finally, there is a chart showing how Clinton's chances have changed since the peak of her post-convention bounce in August, as well as the most likely electoral college vote outcome. This is the mode, not the average: the electoral college vote total that occurs most frequently in the 1000 simulations.
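To make the difference between the mode and the average concrete, here is a rough sketch of how the most frequent electoral college total could be pulled out of a set of simulations. The state data is made up, the function names are hypothetical, and the sketch treats each state as independent and normally distributed, which are simplifying assumptions of mine.

```python
import random
from collections import Counter


def simulate_electoral_votes(states, n_sims=1000, seed=None):
    """Return the Democratic electoral vote total from each simulation.

    states: list of (electoral_votes, dem_margin, std_dev) tuples.
    Each state is drawn independently (an illustrative simplification).
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        total = sum(ev for ev, margin, sd in states
                    if rng.gauss(margin, sd) > 0)
        totals.append(total)
    return totals


def most_likely_total(totals):
    """Return the mode: the total that occurs most often."""
    return Counter(totals).most_common(1)[0][0]


# Made-up three-state example, not real polling data.
example_states = [(11, -3.6, 5.0), (29, 1.0, 5.0), (55, 20.0, 5.0)]
totals = simulate_electoral_votes(example_states, seed=42)
print(most_likely_total(totals), sum(totals) / len(totals))
```

The mode is always a total that some combination of states can actually produce, whereas the average can land on a number no real map would add up to.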
The day before the election, I will post a final prediction for each state including vote percentages, a measure of my certainty in my prediction for each state, and the overall national vote share as well.