Blogging Senate forecasts and results in the WA Senate re-election until officially declared.

Twitter: @AU_Truth_Seeker


Friday 30 August 2013

How to model Senate outcomes

Apologies for the length of this post - it ended up much longer than I thought it would.

There are a number of steps involved in forecasting the candidates to be elected to the Senate. This post briefly outlines what I have done. Note that "major parties" as used below means Labor, the Coalition and the Greens.

Be aware that any estimation of party votes is dependent on polls at any time. These flip around. So the assumed primaries for my estimates will change frequently and the expected outcomes will too.

Everything in this blog is accurate at time of publishing.



Step 1: Total major party percentage

With myriad candidates in the Senate, it is easily shown that the percentage of voters who vote for major parties in the Senate is lower than for the House of Representatives. Arguably, this is partly because some voters cannot find, or cannot be bothered looking for, their preferred party and just vote for any three-word slogan close enough to their political viewpoint. For 25-30+ tickets, it is reasonable to assume 85% of all votes are for the major parties. As the number of tickets gets smaller, perhaps 90% or more of all votes will be for major parties.
NSW Example
It is appropriate to use 85% major party votes for NSW.

Step 2: Individual major party vote

It is important at this step not to be biased but to base data on the actual vote at previous elections and adjust using an appropriate poll tracker. My preferred poll tracker is BludgerTrack. Adjustments should be made in three steps. Firstly, the forecast national swing for each party should be applied to the primary votes from the previous election. This provides the best estimate of the Greens vote. Secondly, for the Coalition and Labor, as these swings define the national 2PP swing, an adjustment needs to be made for the difference between the national and state 2PP swings. Thirdly, the major party votes are summed and then proportionally adjusted to get back to the target from step 1.
NSW Example
In 2010, the Greens polled 10.69%. BludgerTrack forecasts a -2% swing to the Greens. The best estimate of Green NSW Senate vote is 8.69%. (unadjusted for step 1)
In 2010, the Coalition polled 38.95% but is facing a national swing of +2.4% (primary). However, the NSW swing for the Coalition is 0.1% weaker than the national swing, so the 2013 NSW Senate vote needs to be adjusted by -0.1%. Estimate of Coalition vote = 38.95 + 2.4 - 0.1 = 41.25%. (unadjusted for step 1)
In 2010, Labor polled 36.54% but is facing a national swing of -1.2% (primary). However, the NSW swing for Labor is 0.1% stronger than the national swing, so the 2013 NSW Senate vote needs to be adjusted by +0.1%. Estimate of Labor vote = 36.54 - 1.2 + 0.1 = 35.44%. (unadjusted for step 1)
Applying the step 1 adjustment we get the following forecasts:
Labor: 35.28
Coalition: 41.07
Green: 8.65
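
As a cross-check, the NSW arithmetic above can be reproduced in a few lines. This is only a minimal Python sketch, not the Excel model itself; the function and variable names are made up for illustration, and the inputs are the figures quoted above.

```python
# 2010 NSW Senate primaries and the swings quoted above (percentage points).
PRIMARY_2010 = {"Labor": 36.54, "Coalition": 38.95, "Greens": 10.69}
NATIONAL_SWING = {"Labor": -1.2, "Coalition": 2.4, "Greens": -2.0}
# State-vs-national adjustment; none is applied to the Greens in the example.
STATE_ADJUSTMENT = {"Labor": 0.1, "Coalition": -0.1, "Greens": 0.0}
MAJOR_TARGET = 85.0  # step 1 assumption for NSW


def forecast_major_votes(primary, swing, state_adj, target):
    """Apply the swings, then rescale so the majors sum to the step 1 target."""
    raw = {p: primary[p] + swing[p] + state_adj[p] for p in primary}
    scale = target / sum(raw.values())
    return {p: round(vote * scale, 2) for p, vote in raw.items()}


forecast = forecast_major_votes(
    PRIMARY_2010, NATIONAL_SWING, STATE_ADJUSTMENT, MAJOR_TARGET
)
print(forecast)
```

The unadjusted estimates sum to 85.38%, so each is scaled by 85 / 85.38, which gives the three forecasts listed above.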

Step 3: Individual minor party vote

OK, here's where the educated guesses start. We need to assign the groups into categories. It is damn hard to ensure that it all adds up - the tendency is to think "this party will poll x and this party will poll y..." and you'll then find yourself with about 10% too much vote!
I suggest listing the parties in order of expected support, starting the strongest at a low number like 2% and working down, with every party assigned at least 0.05% (the floor for ungrouped independents and no-name parties).
The key source of data for estimates should be the previous election, usually adjusted downwards to allow for the higher number of candidates.
NSW Example
2%: Liberal Democratic Party (inflated by their "Group A" position)
1.5%: CDP, KAP, SXP
1%: Shooters and Fishers, One Nation
0.5%: DLP, Palmer, Stop the Greens, Democrats, Family First, Fishing and Lifestyle, No Carbon Tax, Pirate Party
0.2%: HEMP, Wikileaks, Smokers, Coal Seam Gas, Australian Independents, Drug Law Reform, Republicans
0.1%: Socialist Alliance, Socialist Equality Party
0.05%: 18 other parties
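
Since the hard part is making it all add up, a quick tally helps. The sketch below (Python, with an illustrative `MINOR_TIERS` list that is not part of the original spreadsheet) multiplies each tier's share by the number of parties in it, using the NSW tiers listed above.

```python
# (share per party in %, number of parties) for each NSW tier listed above.
MINOR_TIERS = [
    (2.0, 1),    # Liberal Democratic Party
    (1.5, 3),    # CDP, KAP, SXP
    (1.0, 2),    # Shooters and Fishers, One Nation
    (0.5, 8),
    (0.2, 7),
    (0.1, 2),
    (0.05, 18),  # all other parties
]

minor_total = sum(share * count for share, count in MINOR_TIERS)
minor_parties = sum(count for _, count in MINOR_TIERS)
print(f"{minor_parties} minor tickets share {minor_total:.2f}% of the vote")
```

The tiers come to 15%, which is what is left once the major parties take the 85% assumed in step 1.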

Step 4: Party vote variations

In order to conduct Monte Carlo analysis, you need to specify the degree of variability of each party's vote. For simplicity, I assume a linear variation around the mean as assumed above. The higher the party's vote, the greater the certainty about it. 
Monte Carlo analysis is necessary as the order of candidate elimination is fragile - very little change in a vote may change the order of elimination or election, so variability is required to allow a more likely scenario to be identified. Monte Carlo analysis is standard practice in many financial modelling applications.

I assume the following variations as a proportion of each party's vote:
If a party has up to 1%, the variation is +/-50% of its vote
If a party has up to 5%, the variation is +/-30% of its vote
If a party has up to 10%, the variation is +/-25% of its vote
If a party has up to 20%, the variation is +/-20% of its vote
If a party exceeds 20%, the variation is +/-15% of its vote
NSW Example:
Forecast LNP vote is 41%, forecast LNP variation is +/-6.15%
Forecast Socialist Alliance vote is 0.1%, variation is +/-0.05%
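
The tiers can be written as a small lookup. This is a hedged Python sketch: the function names are illustrative, and treating each "up to" bound as inclusive is an assumption, since the post does not say which tier a party on exactly 1% or 5% falls into.

```python
def variation_fraction(vote_pct):
    """Return the +/- variation as a fraction of the party's own vote.

    Assumption: each 'up to' bound is inclusive (e.g. exactly 1% gets 50%).
    """
    tiers = ((1, 0.50), (5, 0.30), (10, 0.25), (20, 0.20))
    for upper_bound, fraction in tiers:
        if vote_pct <= upper_bound:
            return fraction
    return 0.15  # parties above 20%


def variation_points(vote_pct):
    """Return the +/- variation in percentage points of the total vote."""
    return vote_pct * variation_fraction(vote_pct)


for party, vote in (("LNP", 41.0), ("Socialist Alliance", 0.1)):
    print(f"{party}: {vote}% +/- {variation_points(vote):.2f}")
```

This reproduces the two NSW examples: +/-6.15 points for a 41% vote and +/-0.05 points for a 0.1% vote.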

Step 5: Normalise to 100%

After applying random variation to all tickets, the sum will no longer add to 100%. It is necessary to proportionally reset the total to 100% by scaling all forecast votes up or down. In most scenarios, this step is minimal and does not result in tickets moving outside the range of variation specified above.
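
One simulated scenario (steps 4 and 5 together) might look like the Python sketch below. It is not the Excel model: the names are illustrative, the draw is assumed to be uniform within each party's range (one reading of "linear variation"), the tier bounds are assumed inclusive, and the minor tickets are given placeholder labels rather than real party names.

```python
import random


def variation_fraction(vote_pct):
    """Step 4 tiers; inclusive 'up to' bounds are an assumption."""
    for upper_bound, fraction in ((1, 0.50), (5, 0.30), (10, 0.25), (20, 0.20)):
        if vote_pct <= upper_bound:
            return fraction
    return 0.15


def simulate_votes(means, rng):
    """Draw one scenario: perturb every ticket, then rescale to 100%."""
    drawn = {}
    for party, mean in means.items():
        spread = variation_fraction(mean)
        # Assumed uniform draw between -spread and +spread of the party's vote.
        drawn[party] = mean * (1 + rng.uniform(-spread, spread))
    total = sum(drawn.values())
    return {party: 100 * vote / total for party, vote in drawn.items()}


# NSW inputs from steps 2 and 3; minor tickets carry placeholder labels.
means = {"Coalition": 41.07, "Labor": 35.28, "Greens": 8.65}
minor_tiers = [(2.0, 1), (1.5, 3), (1.0, 2), (0.5, 8), (0.2, 7), (0.1, 2), (0.05, 18)]
for share, count in minor_tiers:
    for i in range(count):
        means[f"minor {share}% #{i + 1}"] = share

scenario = simulate_votes(means, random.Random(2013))
print(round(sum(scenario.values()), 6))
```

Because the perturbations are symmetric around each mean, the pre-scaling total tends to stay near 100%, which is why the rescaling is usually small.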

Step 6: Press the button

Wait one minute for 1000 Monte Carlo simulations to be run in Microsoft Excel, based on the preferences in officially registered group ticket votes.
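
To get a feel for how precise 1000 runs are, the standard binomial approximation can be applied to each estimated probability. The sketch below is in Python; the function name is illustrative, and it assumes the runs are independent and uses the usual normal approximation, which is rough for probabilities very close to 0% or 100%.

```python
import math


def monte_carlo_margin(p_hat, runs, z=1.96):
    """Approximate 95% margin of error for a probability estimated from runs.

    Assumes independent runs and the normal approximation to the binomial.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / runs)


for p_hat in (0.5, 0.78, 0.16):
    margin = monte_carlo_margin(p_hat, runs=1000)
    print(f"estimate {p_hat:.0%}: +/- {margin:.1%}")
```

At worst (an estimate of 50%), 1000 runs give roughly +/-3 percentage points of simulation noise. This is separate from any error in the input assumptions themselves.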

Step 7: Publish the results

With the revised NSW numbers, here are the likelihoods of each party winning seats, listed roughly from left to right:
Left: expected value 2.96 seats
Labor: 2 seats: 100% likely, 3 seats 15% likely
Greens: 78% likelihood of election
Democrats: 2%
SXP: 0.2%

Right: expected value 3.04 seats
Coalition: 2 seats guaranteed, 3 seats 78% likely
Liberal Democrats: 16% likelihood of election
KAP: 5.5%
Shooters and Fishers: 4.3%
One Nation: 0.4%
Building Australia: 0.2%
Stable Population: 0.2%
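
The expected seat counts are simply the sum of the per-seat probabilities on each side. The Python sketch below (dictionary and function names are illustrative) uses the rounded probabilities listed above, so its totals agree with the published 2.96 and 3.04 only approximately.

```python
# Probability that each seat is won, taken from the rounded figures above.
LEFT = {
    "Labor seat 1": 1.0,
    "Labor seat 2": 1.0,
    "Labor seat 3": 0.15,
    "Greens": 0.78,
    "Democrats": 0.02,
    "SXP": 0.002,
}
RIGHT = {
    "Coalition seat 1": 1.0,
    "Coalition seat 2": 1.0,
    "Coalition seat 3": 0.78,
    "Liberal Democrats": 0.16,
    "KAP": 0.055,
    "Shooters and Fishers": 0.043,
    "One Nation": 0.004,
    "Building Australia": 0.002,
    "Stable Population": 0.002,
}


def expected_seats(seat_probabilities):
    """Expected number of seats is the sum of the per-seat win probabilities."""
    return sum(seat_probabilities.values())


left, right = expected_seats(LEFT), expected_seats(RIGHT)
print(f"Left {left:.3f}, Right {right:.3f}, total {left + right:.3f}")
```

The two sides should total six seats, the number being filled; the small shortfall here comes from rounding in the listed percentages.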

Comments welcome
1. Can you suggest any improvement to my forecast votes?
2. Can you suggest any improvement to my forecast party variation?
3. Any general comments?
If you wish to suggest any variations to forecast votes or forecast variation, I will be happy to remodel these scenarios as suggested, and publish the results below.





10 comments:

  1. Nice blog.

    The one odd thing... you have 1% for One Nation, despite Pauline Hanson running? And lower once you normalise to 100% I assume. I would have thought she would get at least 2% and maybe more... stealing votes from other micro-rights such as CDP, shooters & DLP.

    It would also be interesting to consider what would happen if there was a swing in either direction between the majors over the next week. If the Liberals get another 1-2% from Labor, does that dramatically change the chance of a 4th right senator, or is it inconsequential?

  2. Hi Muppet,

    You are correct in surmising that any further swing over the coming week will change the outcomes. But note that up until this point I've used a wide range for major party votes - +/-15% (which translates to +/-6% for 40% support).

    As this week goes on, I'll refine my estimates and publish them here and use much narrower major party estimates. But estimates for minor parties will continue to be quite large, due to their unknown nature.

    Regarding Pauline Hanson, I think 1% is reasonable. In my step 1 above, I submit that all minor parties are competing for 15% of the vote. This needs to be enough for KAP, PUP, CDP, SXP, LDP, Shooters, etc. A strong vote for PH will rely on people either trying like crazy to find her on the ballot (she's on the extreme right fringe, lol) or happening to stumble upon her. In the NSW 2011 upper house election there were only 16 groups. Here there are about 45 groups, with 42 of them competing for the same 15% that the minor parties drew in NSW 2011. Also, there has been the emergence of Katter and Palmer, whose voters are ideologically similar to Hanson's. So, she may well poll >1%, but I doubt she'll get 2%.

  3. This is excellent - I was thinking about doing this myself because I hadn't seen any of the quantification of outcome probabilities... But I wasn't looking forward to writing Matlab code for senate preference flows.

    A couple of comments:
    1. When are you planning to put up a forecast of outcomes for all states?
    2. Have you checked dependence on the number of MC samples? Is 1000 enough to get an ~equilibrium set of outcomes? (Not sure
    3. I can't help but wonder if your method slightly under-predicts potential for upsets from high first-preference votes for new or micro parties... (If I were to code this, I would have assigned the total major party vote as a monte carlo variable, and then do the 'normalise to 100%' calculation to only the minor party first preferences, while also allowing greater total variation in each minor party.).

  4. This is amazing. Is there any chance you would consider open sourcing your model?

  5. @trendy
    Thanks for your comments.
    1. For all 6 states I have provided commentary on Will Bowe's Poll Bludger site - check it out. I will consolidate them here with revised up to date poll data as this week goes on. 1-2 States a day. As I use bludger track data my numbers are jumping around a bit as new poll data is incorporated into the aggregate number.

    2. 1000 is reasonable. I think it's as reasonable as pollsters polling 1000 punters and getting a +/-3% moe. It takes my excel VBA about 1 minute to run 1000 repetitions so it's reasonable.

    3. I think the model is great, but I acknowledge the estimates are imperfect. How can anyone possibly guess how much LDP will poll in the NSW Senate? But we know the ALP vote to +/-2%. You'll be pleased to hear the latest iterations of my inputs assume only +/-5% vote variance for ALP and LNP (so E(x)=0.4 --> x~[0.38, 0.42]).
    But the micro party vote will be much more uncertain, at +/-50%.
    By doing it this way I go 90% of the way to what you're suggesting while still allowing minimal variance for majors.



    Follow me on Twitter if you want to know when my next post is @truthseekerAU. I am employed full time in a non political job and have a busy week so will only be doing substantive updates after hours.

  6. With respect to X, I never like anyone using my models (same goes for my day job too). In the limited time I've had to write it I have not been able to neaten it up for publication, and an unaware user can easily accidentally break it. But happy to run scenarios for a given set of inputs if time allows.
    Email address domain is gmail.com and the first part is "theoriginaltruthseeker"

    Or Twitter @truthseekerAU

  7. It's very similar to what I put together for the Tasmanian Senate. Haven't bothered to input and run the others at this stage, because well, others have done that.

    As to the algorithm, assuming you have your distribution & quotas correct, the assumptions seem a reasonable starting point. I've gone with more variance than you did, throughout the range and especially with the unknown micros. If you use +/- 100% for those, you can give them a touch more weighting and get a better seeding.

    Interestingly, simply doing an arithmetic average of the preferences allocated to each party provides a very good indicator as to which minor parties have secured prime position in the snowball stakes. In Tassie, that is Australian Independent Party and Family First, with daylight third. AIP is unlikely to get sufficient first preferences to build a start, leaving FF with the running.



  8. Agree, the simple averages do match up rather well with outcomes regarding micros.

    I have plugged many of my inputs into the ABC calculator and get the same output on each occasion. (Mine doesn't have the same rounding limitations, so is arguably more accurate! :-)

    I think you have overestimated the majors and underestimated PUP, KAP, SXP and WKP. All others look about right. I would work at your bottom line threshold for the ALP and LNP, adjusting the minor parties accordingly, working your way back up. Cannot see One Nation polling above 1% but think PUP and possibly KAP could (they suck oxygen from parties such as One Nation). WKP I think now will poll 1.2% +/- 0.4%.

    In your South Australian I think you have underestimated Xen who will have a quota in his own right...

    Replies
    1. My revised SA numbers resolve the Xen discrepancy. Although as it's uncertain, I'll be modelling it with a higher variance.

      In Vic, I'm giving WKP 0.45-1.35% - a wide range. But he still gets elected 0/1000 times.
