Guaranteeing accuracy and precision

Discussions in this blog on precision error estimation via $\ell_1$ minimization have made it clear that the technique is only good for recovering precision, not accuracy. This post will argue that accuracy can also be guaranteed if all the detectors have a greater than one-half probability of being correct.

By “guaranteed”, I mean that I have a high probability of being correct and that, furthermore, I can estimate this probability with high confidence. The idea is not new to me and follows a well-known result: if all detectors have $P_{\text{true}} > 1/2$ then, given enough of them, you can guarantee that simple majority voting will be correct at whatever level you want, for example, ninety-nine per cent of the time.

The idea follows from the following formula

$$P_{\text{voting correct}} = \sum_{i=\lfloor n/2 \rfloor + 1}^{n} \binom{n}{i} \, p^{i} (1-p)^{n-i},$$

where $p$ is the probability of each system detecting the right answer (which the formula assumes is the same for all of them) and $n$ is the number of detectors. The formula also assumes that the detectors are uncorrelated. This is an important point and one I will return to in future posts. For the moment, I will assume detector independence to make the discussion simpler. The technique I am proposing, however, does not require it.
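
To make the formula concrete, here is a minimal Python sketch that evaluates it directly. It assumes, as the formula does, that the detectors are independent and share the same $p$. The function name `majority_vote_probability` is my own, and the sum starts at the smallest integer strictly greater than $n/2$, so for an even number of detectors a tie counts as a wrong answer.

```python
from math import comb


def majority_vote_probability(n, p):
    """Probability that a strict majority of n independent detectors,
    each correct with probability p, gives the right answer."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # Smallest number of correct detectors that forms a strict majority.
    first_majority = n // 2 + 1
    return sum(
        comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(first_majority, n + 1)
    )


if __name__ == "__main__":
    # Five detectors at a few illustrative values of p.
    for p in (0.4, 0.5, 0.6, 0.72, 0.9):
        print(f"p = {p:.2f} -> voting correct = {majority_vote_probability(5, p):.4f}")
```

For a single detector the function simply returns $p$, and for an odd number of detectors at $p = 1/2$ it returns one-half, which are two quick sanity checks on the counting.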

The probability of simple voting being correct shown in the formula above comes from counting the sequences where more than half the detectors give you the right answer. My main argument is that if you know the detectors are better than one-half individually, you can do quite well given enough of them.
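
As a rough illustration of what "enough of them" means, the sketch below searches for the smallest odd number of detectors whose majority vote reaches a target probability. It relies on the same assumptions as the formula (independent detectors, common $p$). The names `min_detectors` and `max_n` are hypothetical, and the cap on `max_n` is an arbitrary choice that keeps this naive floating-point evaluation of the binomial terms from overflowing.

```python
from math import comb


def majority_vote_probability(n, p):
    """Strict-majority success probability for n independent detectors."""
    return sum(
        comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(n // 2 + 1, n + 1)
    )


def min_detectors(p, target, max_n=999):
    """Smallest odd n <= max_n whose majority vote is correct with
    probability >= target, or None if no such n is found.

    Only odd n are tried so that ties cannot occur.
    """
    if not 0.5 < p <= 1.0:
        raise ValueError("voting only helps when p > 1/2")
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    for n in range(1, max_n + 1, 2):
        if majority_vote_probability(n, p) >= target:
            return n
    return None


if __name__ == "__main__":
    for p in (0.6, 0.7, 0.8, 0.9):
        print(f"p = {p:.1f}: detectors needed for 99% = {min_detectors(p, 0.99)}")
```

The closer the individual detectors are to one-half, the more of them the search will ask for, which is the practical price of the guarantee.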

The following plot shows the probability of being correct for five detectors as a function of their individual ability to figure out the correct answer.

[Figure: Probability of majority voting being correct for five detectors]

The dashed line shows the probability that majority voting gives you the right choice. Note that for $p_{\text{single detector correct}} < 1/2$, it pays to ditch voting and just use the output of a single detector. Note also that there is a point of maximum relative gain. For five detectors this occurs around $p = 0.72$, which gets you a 0.86 probability of being correct.
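
The point of maximum relative gain can be checked numerically. The sketch below assumes that "relative gain" means the ratio of the voting probability to the single-detector probability $p$; that reading is my assumption, and other definitions (the plain difference, for instance) peak at a different $p$. The function names and the grid resolution are illustrative choices only.

```python
from math import comb


def majority_vote_probability(n, p):
    """Strict-majority success probability for n independent detectors."""
    return sum(
        comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(n // 2 + 1, n + 1)
    )


def max_relative_gain(n, steps=1000):
    """Grid search over p in [0.5, 1] for the largest ratio
    (voting probability) / (single-detector probability).

    Returns (best_p, voting probability at best_p, ratio at best_p).
    """
    best_p, best_ratio = 0.5, 0.0
    for k in range(steps + 1):
        p = 0.5 + 0.5 * k / steps
        ratio = majority_vote_probability(n, p) / p
        if ratio > best_ratio:
            best_p, best_ratio = p, ratio
    return best_p, majority_vote_probability(n, best_p), best_ratio


if __name__ == "__main__":
    p_star, p_vote, ratio = max_relative_gain(5)
    print(f"best p = {p_star:.3f}, voting correct = {p_vote:.3f}, ratio = {ratio:.3f}")
```

Under this ratio definition, the search for five detectors lands close to the 0.72 and 0.86 figures quoted above.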

Future posts will discuss how we can leverage this simple idea in situations where we have no ground truth and are therefore incapable of estimating our correctness rate. Hint: it involves turning the autonomous difference equations used for the mapping application of precision error into majority voting equations.
