Archive for April, 2008

Not every measurement is perfect

Figure: Decay of precision error estimate variance (question 6).

Just to show that not all questions behave as nicely as question 9 in the previous post, here is the plot for question 6 in the same exam. The fit is not as good as for question 9. This is expected: there is no reason why the precision error should decay with a perfect exponential behavior. Nonetheless, it still shows a similar decay constant, about six questions. Remember to click on the image to get the larger image in a zoom pane.

Minimum number of questions revisited

To show off the installation of FancyZoom (a trick I learned while visiting the excellent Language Log), I present a graph of the percentage variation in the mean square precision error as a function of the number of questions used to compute it. The image looks small, but you can now click on it to obtain a zoomed-in version. Try it!

Figure: Variance of precision error estimates for question 9 as a function of the number of questions used.

Note how good the fit is to a shifted exponential function of the form:
a + b*exp(-c(n_q - 3)).
The measurements are the small dots at n_q = {4, 6, 7, 10, 12}. The fitted values are a = 0.06, b = 0.2, and c = 0.43. The variable c is the decay constant for the variability in the estimate. In particular, if you calculate its inverse 1/c you get the number of questions beyond three that will give you less than 33% variability in the estimate. This turns out to be about 2 questions. So ten or twelve questions should be enough for this group of students.
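
For readers who want to play with the numbers, here is a minimal sketch that evaluates the fitted curve at the measured question counts and computes 1/c. It assumes the exponent is negative, that is a + b*exp(-c(n_q - 3)), since c is described as a decay constant. The function and variable names are my own, and the values printed are points on the fitted curve, not the measurements themselves.

```python
import math

# Fitted values quoted in the post. Names are hypothetical.
A, B, C = 0.06, 0.2, 0.43
N_OFFSET = 3  # the shift inside the exponent


def fitted_variability(n_q: float) -> float:
    """Shifted exponential a + b*exp(-c*(n_q - 3)); negative exponent assumed."""
    return A + B * math.exp(-C * (n_q - N_OFFSET))


# Inverse of the decay constant, in units of questions.
decay_questions = 1.0 / C

if __name__ == "__main__":
    for n_q in (4, 6, 7, 10, 12):
        print(f"n_q = {n_q:2d}   fitted value = {fitted_variability(n_q):.3f}")
    print(f"1/c = {decay_questions:.2f} questions")
```

As n_q grows, the curve flattens out toward the floor a, which is why adding questions past ten or twelve buys very little.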

Once again, this suggests that teachers are asking too many questions in their multiple choice exams.

ICML accepts precision error via L1 minimization paper

Our technical report on how to recover precision error estimates with L1-minimization has been accepted by the 2008 International Conference on Machine Learning.

The paper originally got three anonymous reviews. Two were positive, one strongly negative. In our response to the reviews, we agreed with the general criticism by the reviewers that one experimental demonstration is not enough. In our precision error papers so far, we have only been using one dataset — aerial photographs from the Twenty-Nine Palms region in California. So we are going to include some results from North Carolina forest data to show that our technique works for all sorts of images.

Readers of previous posts may note that besides maps, the precision error has been recovered for questions in a multiple-choice question (MCQ) exam. It would be nice to include this in our ICML paper, but the title of the paper is “Autonomous geometric precision error estimation in low-level computer vision tasks” so it seems incongruous to do so.

The paper was submitted in early January. Afterwards, we realized that our precision error technique for elevation errors in maps applies to any set of models that make scalar predictions about multiple entities. We are now working on a draft for a Science magazine article that will combine the examples from maps and exams to illustrate the wide applicability of our technique.

Precision error for parse trees

The precision error equations require that “ground truth” cancel out. It is easy to see what that means for elevations in a map. What does it mean for parse trees in a natural language processing task like sentence parsing?
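
For scalar predictions such as elevations, the cancellation can be sketched in a few lines. This is only an illustration of the cancellation itself, not the precision error equations: every number below is made up, and the names `predict` and `pairwise_difference` are hypothetical. Each model's prediction is taken to be the truth plus that model's own error, so the difference between two models depends only on their errors.

```python
# All values are invented for illustration; none come from the map or exam data.
truth = [120.0, 95.5, 143.25, 101.75]   # unknown in practice
errors_a = [1.5, -0.5, 0.25, -2.0]      # model A's errors (hypothetical)
errors_b = [-0.75, 1.0, 0.5, 0.25]      # model B's errors (hypothetical)


def predict(truth_values, errors):
    """Prediction = ground truth + the model's own error."""
    return [t + e for t, e in zip(truth_values, errors)]


def pairwise_difference(pred_1, pred_2):
    """Entity-by-entity difference between two models' predictions."""
    return [p1 - p2 for p1, p2 in zip(pred_1, pred_2)]


# Differences of predictions versus differences of errors.
observed = pairwise_difference(predict(truth, errors_a), predict(truth, errors_b))
from_errors = pairwise_difference(errors_a, errors_b)

# Changing the ground truth leaves the differences untouched.
shifted_truth = [t + 1000.0 for t in truth]
observed_shifted = pairwise_difference(
    predict(shifted_truth, errors_a), predict(shifted_truth, errors_b)
)

if __name__ == "__main__":
    print(observed)
    print(from_errors)
    print(observed_shifted)
```

The open question for parse trees is what plays the role of the `+` and `-` used here.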

One way to define the distance between two trees is to count the total number of reverse operations needed to bring them back to a common ancestor. Is that number equal to the number one would get by comparing everything to the “true” parsing? That is, is the observed parse prediction’s distance equal to the true parse distance plus the distance created by the error transformations?

Subtraction makes sense to me in the context of trees: you take everything after the common ancestor. What is addition of parse trees? The union of all edges and vertices. Parse trees are graphs after all.

This addition and subtraction of graphs means that we can use the precision error equations. Parse trees are added and subtracted. In the end, a score is assigned to the difference by counting the number of operations it would take to collapse the resulting graph to disconnected single ancestors.
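
Here is a rough sketch of how this might look in code, under several assumptions of mine. A parse tree is stored as a set of (parent, child) edges, with each node named by its category and word span. Addition is set union and subtraction is set difference. The score charges one operation per edge, on the assumption that deleting an edge is the unit "collapse" operation. The two example parses are the textbook prepositional-phrase attachment ambiguity for "I saw the man with the telescope", not output from any real parser.

```python
from typing import FrozenSet, Tuple

Edge = Tuple[str, str]        # (parent node, child node)
ParseGraph = FrozenSet[Edge]  # a parse tree stored as its edge set


def add(g1: ParseGraph, g2: ParseGraph) -> ParseGraph:
    """Addition of parse trees: the union of all edges."""
    return g1 | g2


def subtract(g1: ParseGraph, g2: ParseGraph) -> ParseGraph:
    """Subtraction: the edges of g1 that g2 does not share."""
    return g1 - g2


def score(g: ParseGraph) -> int:
    """Assumed scoring: one operation per edge deleted until no edges remain."""
    return len(g)


def distance(g1: ParseGraph, g2: ParseGraph) -> int:
    """Operations needed to strip both trees back to their shared part."""
    return score(subtract(g1, g2)) + score(subtract(g2, g1))


# Shared structure of both readings of "I saw the man with the telescope".
shared = {("S[0:7]", "NP[0:1]"), ("S[0:7]", "VP[1:7]"), ("VP[1:7]", "V[1:2]")}

# Reading 1: the prepositional phrase attaches to the verb phrase.
vp_attach: ParseGraph = frozenset(
    shared | {("VP[1:7]", "NP[2:4]"), ("VP[1:7]", "PP[4:7]")}
)

# Reading 2: the prepositional phrase attaches to the noun phrase.
np_attach: ParseGraph = frozenset(
    shared
    | {("VP[1:7]", "NP[2:7]"), ("NP[2:7]", "NP[2:4]"), ("NP[2:7]", "PP[4:7]")}
)

if __name__ == "__main__":
    print("edges in sum:", score(add(vp_attach, np_attach)))
    print("distance:", distance(vp_attach, np_attach))
```

Whether this edge-counting score makes the "true" parse cancel out the way elevations do is exactly what would need testing against real parser output.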

How do I get a bunch of parsing models to test this idea out?