You are describing how to evaluate polling methods. And I agree: you do this by comparing an actual election outcome (eg statewide vote totals) to the results of your polling method.
But I am not talking about polling methods, I am talking about Silver’s win probability. This is some proprietary method takes other people’s polls as input (Silver is not a pollster) and outputs a number, like 28%. There are many possible ways to combine the poll results, giving different win probabilities. How do we evaluate Silver’s method, separately from the polls?
I think the answer is basically the same: we compare it to an actual election outcome. Silver said Trump had a 28% win probability in 2016, which means he should win 28% of the time. The actual election outcome is that Trump won 100% of his 2016 elections. So as best as we can tell, Silver’s win probability was quite inaccurate.
Now, if we could rerun the 2016 election maybe his estimate would look better over multiple trials. But we can’t do that, all we can ever do is compare 28% to 100%.
You are describing how to evaluate polling methods. And I agree: you do this by comparing an actual election outcome (eg statewide vote totals) to the results of your polling method.
But I am not talking about polling methods, I am talking about Silver’s win probability. This is some proprietary method takes other people’s polls as input (Silver is not a pollster) and outputs a number, like 28%. There are many possible ways to combine the poll results, giving different win probabilities. How do we evaluate Silver’s method, separately from the polls?
I think the answer is basically the same: we compare it to an actual election outcome. Silver said Trump had a 28% win probability in 2016, which means he should win 28% of the time. The actual election outcome is that Trump won 100% of his 2016 elections. So as best as we can tell, Silver’s win probability was quite inaccurate.
Now, if we could rerun the 2016 election maybe his estimate would look better over multiple trials. But we can’t do that, all we can ever do is compare 28% to 100%.