A Question of Maths
By Driusan (Tue Aug 26, 2008 at 03:10:41 PM EST) (all tags)
With the Canadian and USian elections looming, I'm trying to figure out a way to do realtime online polling which:
1. Displays results of the question "Who do you intend to vote for?" and margin of error in realtime, rather than after being analyzed by a statistician and reported on by the media.
2. Lets voters update their vote if they change their mind
3. Makes it impossible to link a user to who they voted for.
The goal is to make it possible for a pundit or interested third party to say "How are the results looking now?" whenever they want, rather than whenever Angus Reid wants. Stats is far from my best math, so I'm not sure if the idea I just came up with will produce statistically relevant results.

gzt? Help?

Let's say we only allow a voter to update their vote once every Δt. We record a timestamp of when a user last voted.

In the record of the vote/results we don't include the user, but we include another timestamp, truncated to the nearest Δt. The goal is to set Δt as small as possible such that we can't identify who cast a specific ballot with any certainty.

When we do our analysis, we base it only on those recorded votes that fall within a Δt of some voter's most recent vote.

Assuming that the people who update their votes "often" are distributed evenly across all parties, will this produce results that mean anything?
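
Here is a rough sketch of how the scheme above might look in code. It is only one reading of the idea: `DELTA_T`, `bucket`, `Poll`, and the in-memory storage are all made up for illustration, and "within a Δt" is interpreted as "falls in the same Δt bucket".

```python
from collections import Counter

DELTA_T = 3600  # bucket width in seconds; a hypothetical value, not a recommendation


def bucket(ts: int) -> int:
    """Truncate a Unix timestamp down to the start of its DELTA_T bucket."""
    return ts - ts % DELTA_T


class Poll:
    def __init__(self):
        # user -> exact time of their last vote (the choice is NOT stored here)
        self.last_voted = {}
        # (truncated timestamp, choice) pairs (the user is NOT stored here)
        self.ballots = []

    def vote(self, user: str, choice: str, now: int) -> bool:
        """Record a vote unless the user already voted less than DELTA_T ago."""
        last = self.last_voted.get(user)
        if last is not None and now - last < DELTA_T:
            return False
        self.last_voted[user] = now
        self.ballots.append((bucket(now), choice))
        return True

    def tally(self) -> Counter:
        """Count ballots whose bucket contains some voter's most recent vote."""
        live = {bucket(ts) for ts in self.last_voted.values()}
        return Counter(choice for b, choice in self.ballots if b in live)
```

Under this reading, a superseded ballot drops out of the tally only if nobody else's latest vote lands in the same bucket. With many voters per bucket, old ballots would keep being counted, so the result is an approximation at best.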

A Question of Maths | 18 comments (18 topical, 0 hidden) | Trackback
"whom" </nt> by johnny (2.00 / 0) #1 Tue Aug 26, 2008 at 04:18:00 PM EST
Something about bells? by Driusan (4.00 / 1) #2 Tue Aug 26, 2008 at 05:42:42 PM EST

--
Vive le Montréal libre.
[ Parent ]
Ask not! by Vulch (4.00 / 1) #10 Wed Aug 27, 2008 at 05:51:45 AM EST

[ Parent ]
Can't see it working by TheophileEscargot (4.00 / 1) #3 Tue Aug 26, 2008 at 08:27:30 PM EST
Even with a telephone or door-to-door poll, they need to weight the results demographically to get meaningful results.

An online poll is going to be skewed towards the young, whereas real voters are skewed towards the old.

Not to mention non-American voters, or what happens when the site is mentioned on DailyKos or LittleGreenFootballs, or automatic vote-rigging...
--
It is unlikely that the good of a snail should reside in its shell: so is it likely that the good of a man should?

also, vote-changing: by gzt (2.00 / 0) #7 Wed Aug 27, 2008 at 04:57:33 AM EST
Well, first, an internet poll is self-selecting. That will make the results unreliable in an unknown way. But adding the ability to switch votes makes it doubly so, because the people who change are only the ones who care enough to go back to the site and change their vote. More casual changers, meh. Part of the point of polling is that it's a snapshot of a particular moment.

[ Parent ]
Realistically speaking by Driusan (2.00 / 0) #12 Wed Aug 27, 2008 at 02:12:21 PM EST
Polls are used more for projecting what will happen at the, erm, polls, than to take a snapshot of what things were like at a particular moment. My thought is that since we're assuming the people who care enough to go back and change their vote are evenly distributed across parties, it, statistically speaking, won't make any difference to the overall results.

--
Vive le Montréal libre.
[ Parent ]
by asking who they would vote for today by gzt (2.00 / 0) #16 Wed Aug 27, 2008 at 02:50:35 PM EST
It's a snapshot. They don't do any spiffy projection into the future, they just look and see, oh, 60% said they'd vote for this guy.

[ Parent ]
Ahem. by Breaker (4.00 / 2) #4 Tue Aug 26, 2008 at 10:44:39 PM EST
Maths.  Short for Mathematics.

Wrong. by komet (4.00 / 2) #8 Wed Aug 27, 2008 at 05:03:35 AM EST
It actually stands for Mathematical Anti Telharsic Harfatum Septomin.

--
<ni> komet: You are functionally illiterate as regards trashy erotica.
[ Parent ]
he's right by gzt (4.00 / 1) #9 Wed Aug 27, 2008 at 05:04:12 AM EST
trust me, i have a degree in maths.

[ Parent ]
Um. by Driusan (2.00 / 0) #14 Wed Aug 27, 2008 at 02:31:35 PM EST
I was referring to both statistics?

I.. yeah, I got nothing to justify that typo.

--
Vive le Montréal libre.

[ Parent ]
Probably don't totally get this, but by herbert (2.00 / 0) #5 Tue Aug 26, 2008 at 11:58:53 PM EST
How about if you store votes against a hash of the user's username and password?  Then you could remove the user's previous vote when they entered a new one, but you couldn't identify a user's vote from their username.

You would still fuzz the timestamps if you wanted to avoid the possibility of correlating them with ISP/server logs.

I suppose this way you can still prove who someone voted for if you can beat their password out of them.  Is this what you're trying to avoid?

You can brute force that easily. by Driusan (2.00 / 0) #11 Wed Aug 27, 2008 at 02:07:18 PM EST
There's only, realistically, 5 parties in Canada. Two in the States. Those 5 options don't change. That means that if you want to know who person A voted for, you just hash the username/password+vote for all 5 options and it's going to be one of them.

You also need to be able to say "How many people voted for x?" How do you do that if all you have is a hash of their vote?
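
To make the brute-force point concrete, here is a minimal sketch. It assumes the vote is hashed together with the credentials using SHA-256 and a colon-separated format, and that the attacker already knows the username and password; the party labels and function names are placeholders, not anything from a real system.

```python
import hashlib

# Placeholder labels standing in for the five options on the ballot.
PARTIES = ["A", "B", "C", "D", "E"]


def vote_hash(username: str, password: str, vote: str) -> str:
    """Hash credentials and vote together (assumed format, for illustration)."""
    data = f"{username}:{password}:{vote}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def recover_vote(username: str, password: str, stored: str):
    """Try every ballot option; with so few options this takes no time at all."""
    for party in PARTIES:
        if vote_hash(username, password, party) == stored:
            return party
    return None  # credentials were wrong, or the vote is not in PARTIES
```

The hash hides nothing about the vote itself once the rest of the input is known, because the attacker only has to try as many candidates as there are parties.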

--
Vive le Montréal libre.

[ Parent ]
I didn't mean hash the vote itself by herbert (2.00 / 0) #18 Thu Aug 28, 2008 at 12:14:27 AM EST
What I mean is: person votes for the first time.  Get a username and password from them, hash these together to give a hash XYZ, and store a row in your votes table:

XYZ, 03:59pm, Democrat

Later on if they change the vote, put in another row:

XYZ, 04:12pm, Republican

Then if you want the most up to date results, count only the newest vote for each voter.  If you want to know what the results were at 4pm, you can find that too.

And the important bit: the passwords are not kept on the server anywhere.  They are accepted from the user, used to calculate the hash, then discarded.
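
A small sketch of the table described above, assuming SHA-256 over a colon-separated username and password and an in-memory list instead of a real votes table. The names `voter_id`, `cast`, and `results` are invented for the example.

```python
import hashlib
from collections import Counter
from datetime import datetime


def voter_id(username: str, password: str) -> str:
    """Derive the row key; the password itself is never stored."""
    return hashlib.sha256(f"{username}:{password}".encode("utf-8")).hexdigest()


def cast(rows: list, username: str, password: str, when: datetime, choice: str) -> None:
    """Append a (hash, timestamp, choice) row; older rows are kept as history."""
    rows.append((voter_id(username, password), when, choice))


def results(rows: list, as_of: datetime = None) -> Counter:
    """Count only the newest vote per hash, optionally as of a past moment."""
    latest = {}
    for vid, when, choice in sorted(rows, key=lambda r: r[1]):
        if as_of is None or when <= as_of:
            latest[vid] = choice  # later rows overwrite earlier ones
    return Counter(latest.values())


rows = []
cast(rows, "someone", "secret", datetime(2008, 8, 27, 15, 59), "Democrat")
cast(rows, "someone", "secret", datetime(2008, 8, 27, 16, 12), "Republican")
print(results(rows))                                          # newest vote only
print(results(rows, as_of=datetime(2008, 8, 27, 16, 0)))      # state at 4pm
```

A plain fast hash like this can still be attacked by guessing weak passwords for a known username, so it should be read as an illustration of the row layout rather than a vetted design.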

[ Parent ]
well, the biggest problem is sampling. by gzt (4.00 / 1) #6 Wed Aug 27, 2008 at 02:47:53 AM EST
The benefit of Angus Reid is that it's a fairly random sample, and where it deviates from that, they have fairly good ways of knowing what is up. They can get a large enough sample that it's going to be significant and small enough that they can do it multiple times over the course of weeks without even having to consider ever running into the same person twice. People changing who they vote for is a fairly unimportant consideration, all things considered.
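
For a sense of scale, here is the textbook margin of error for a proportion. It assumes a simple random sample, which is exactly what a self-selected online poll lacks, so it is a sketch of what "large enough" means for a proper poll and not a measure of an internet poll's accuracy. The function name and the example sample size are illustrative.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation interval for a proportion.

    p: observed share for one option (0..1)
    n: sample size, assumed to be a simple random sample
    z: critical value; 1.96 corresponds to roughly 95% confidence
    """
    return z * math.sqrt(p * (1 - p) / n)


# A 50% share in a random sample of 1000 gives about +/- 3.1 points.
print(round(100 * margin_of_error(0.5, 1000), 1))  # prints 3.1
```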

I'd argue that.. by Driusan (2.00 / 0) #13 Wed Aug 27, 2008 at 02:29:24 PM EST
Over the course of a campaign, people changing who they plan to vote for is a fairly important consideration.

Point taken about the internet being self-selecting in terms of demographics. That might be an insurmountable obstacle, unless I can handwave enough to convince both myself and the political pundits who would be interested in such a system that that's a good thing given that higher income people are more likely to vote in an election, last I heard.

I could probably convince the pundits, but I'm not sure that I can wave my hands vigorously enough to convince myself.

--
Vive le Montréal libre.

[ Parent ]
Well, yes, it's fairly important in one sense by gzt (2.00 / 0) #15 Wed Aug 27, 2008 at 02:47:45 PM EST
Otherwise there'd be no point to repeated polling. But my point is that if you get a representative sample of voters for each, say, weekly poll, who cares if a handful of people from last week's poll changed their minds already when you've got a new poll right here? It's important because otherwise the demographics wouldn't change except for those undecided folks (but most polls don't let you get away with saying undecided, so that's moot). But it's not important because every new sample is representative and you're getting good enough sample sizes to say something with confidence each time.

[ Parent ]
Except by Driusan (2.00 / 0) #17 Wed Aug 27, 2008 at 03:38:15 PM EST
The new poll comes out every week or so, which for anyone actively following the election feels like eternity. Repeat polling is slow. I want to know the answer to the question "as things presently stand, what are the odds that I'll be living in a fascist dictatorship after the vote?" in realtime with some degree of confidence, not the question "How did policy platform x announced at time y affect voter preferences?"

--
Vive le Montréal libre.
[ Parent ]