Back in December, Howard Schultz and I wrote a paper for the IEEE Computer Vision and Pattern Recognition 2008 conference. The paper, by the way, was rejected by the reviewers (this should form an interesting article at some future time). In the paper we calculate the horizontal decorrelation length for a collection of maps using a variogram technique.
The plots were not coming out right, and Howard pointed out that I had not de-meaned the data. In particular, I had not subtracted the mean from the precision error. That is, the precision error of each model can have a nonzero mean.
For some reason, I had thought that this mean error could be solved for exactly with the precision error equations. It cannot. Given $m$ models, the precision error equations give you only $m-1$ equations, one fewer than the $m$ unknown means. So the mean value also has to be recovered with $\ell_1$-minimization.
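To see why $m-1$ equations leave one degree of freedom, here is a toy sketch in Python. It rests on assumptions that are mine and not taken from the paper: that the $m-1$ equations fix the differences between the means of consecutive models, and that the $\ell_1$ objective is the sum of the absolute values of the means. The function name `min_l1_means` is hypothetical. Under those assumptions every solution is a fixed set of offsets plus one common shift, and the shift that minimizes the $\ell_1$ norm is minus the median of the offsets.

```python
from statistics import median


def min_l1_means(differences):
    """Recover m means from m-1 consecutive differences (toy assumption).

    differences[i] is assumed to equal mu[i] - mu[i+1].
    Returns the solution with the smallest sum of |mu[i]|.
    """
    # Build one particular solution by pinning the first mean at zero.
    offsets = [0.0]
    for d in differences:
        offsets.append(offsets[-1] - d)

    # Every other solution is offsets[i] + t for a single free shift t.
    # sum(|offsets[i] + t|) is minimized at t = -median(offsets).
    shift = -median(offsets)
    return [c + shift for c in offsets]


if __name__ == "__main__":
    # Three models, two difference equations.
    print(min_l1_means([1.0, 2.0]))
```

The differences alone never determine the common shift, so some extra criterion is needed to choose it. The real precision error equations may have a different structure than this chain of differences.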
How should one interpret this mean precision error? In the case of exams, the mean precision error for a question could be interpreted as the difficulty of the question. Hard questions will be answered correctly by few people; easy ones will be answered correctly by everyone.
This is what I observe in the introductory Physics class exam I have been analyzing. The hardest questions have the smallest mean error. The easiest ones have the largest mean error.
The correct way of defining the covariance matrix involves subtracting this mean. That is, the entries should be of the form $\langle \delta_i \delta_j \rangle - \langle \delta_i \rangle \langle \delta_j \rangle$ and not just plain $\langle \delta_i \delta_j \rangle$ as I have been writing in previous posts.
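A minimal Python sketch of the difference between the two definitions follows. It assumes the errors are available as a list of rows, with `errors[i][k]` the error $\delta_i$ of model $i$ on sample $k$, and that the angle brackets denote a plain average over samples. The function name and data layout are illustrative, not taken from the paper.

```python
def second_moments_and_covariance(errors):
    """Return (raw second moments, mean-subtracted covariance).

    errors[i][k] is the error of model i on sample k (assumed layout).
    Averages are plain sample means over k.
    """
    m = len(errors)
    n = len(errors[0])
    means = [sum(row) / n for row in errors]

    raw = [[0.0] * m for _ in range(m)]
    cov = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            # <delta_i delta_j>: what the earlier posts used.
            raw[i][j] = sum(a * b for a, b in zip(errors[i], errors[j])) / n
            # <delta_i delta_j> - <delta_i><delta_j>: the corrected entry.
            cov[i][j] = raw[i][j] - means[i] * means[j]
    return raw, cov


if __name__ == "__main__":
    errors = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
    raw, cov = second_moments_and_covariance(errors)
    print(raw)
    print(cov)
```

When every model has zero mean error the two matrices coincide, which is why the missing subtraction only shows up once the means are nonzero.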