Archive for the 'Error Theory' Category

Autonomous horizontal correlation length in DEM data

The “Swiss paper” that I discussed in an earlier post solved the problem of estimating vertical precision error. I used a set of difference equations that range over pairs {l,m}, where l and m are integers from 1 to the number of observations. The equations look like differences of simple averages. My purpose in using them is to cancel out the true value and be left with the error in each measurement. Surprisingly, the set of independent equations you get from considering all possible equations of this type can be turned into a well-determined linear algebra problem for the entries in a particular sparse covariance matrix. This allowed me to measure the vertical uncertainty in a composite Digital Elevation Model (DEM) without knowing ground truth. I currently interpret this as meaning that I have an estimate of how good my DEM is. I could still be off by a scale, rotation, and translation.

But vertical uncertainty is only the first of two important numbers for the quality of a map. The other is the horizontal resolution. How fine grained are the readings in the map? Another way to capture this resolution is to ask what the horizontal correlation length is: how far apart do two measurements have to be before they are de-correlated with each other? One way to study this is with the variogram. I had never heard of a ‘variogram’ until a couple of years ago. It essentially measures the spatial correlation of data by taking the average of the squared difference of a function at two points separated by a lag L:
$E[(f(x) - f(x+L))^2]$
One typical behavior of this variogram function is that it starts at zero and rises to a plateau in an exponential fashion:
$r\,(1 - e^{-L/\lambda})$
The horizontal correlation length is defined to be the lag at which the variogram reaches about 63% (that is, 1 − 1/e) of its final value. In the exponential rise curve this happens when L=λ.
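To make the definition concrete, here is a minimal sketch in plain Python. It is not the autonomous difference-equation method discussed in these posts; it only shows the textbook variogram calculation on a one-dimensional profile. The function names are my own, the plateau value (the sill) is assumed to be supplied by the caller, and lags are measured in postings.

```python
import math

def empirical_variogram(values, max_lag):
    """Return [(lag, mean squared difference)] for lag = 1..max_lag."""
    result = []
    for lag in range(1, max_lag + 1):
        diffs = [(values[i + lag] - values[i]) ** 2
                 for i in range(len(values) - lag)]
        result.append((lag, sum(diffs) / len(diffs)))
    return result

def exponential_model(lag, sill, corr_len):
    """Exponential rise curve: sill * (1 - exp(-lag / corr_len))."""
    return sill * (1.0 - math.exp(-lag / corr_len))

def correlation_length(variogram, sill):
    """Lag where the variogram first reaches (1 - 1/e) of the sill.

    Uses linear interpolation between neighbouring lags and assumes the
    variogram is zero at lag 0. Returns None if the level is never reached.
    """
    target = (1.0 - math.exp(-1.0)) * sill
    prev_lag, prev_gamma = 0.0, 0.0
    for lag, gamma in variogram:
        if gamma >= target:
            frac = (target - prev_gamma) / (gamma - prev_gamma)
            return prev_lag + frac * (lag - prev_lag)
        prev_lag, prev_gamma = lag, gamma
    return None

# Sanity check on an ideal exponential curve with a 4-posting correlation length.
curve = [(lag, exponential_model(lag, sill=2.0, corr_len=4.0))
         for lag in range(1, 21)]
print(correlation_length(curve, sill=2.0))
```

On the ideal curve the recovered length should come out at essentially 4 postings. On real data the variogram is noisy, so fitting the exponential model to all lags would probably be more robust than reading off the first crossing.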

I have had to put off the calculation of the correlation length until today. I was astounded to get almost textbook-like exponential rise curves. The autonomous difference equations work for horizontal correlation lengths also! I found a correlation length of four postings for our Twenty-Nine Palms dataset. Howard says this is very good resolution. What surprises me is that this could even be done.

Computation in series/parallel: Estimating the ideal throughput speed in uncertain traffic conditions

One of my favorite games to play when I drive is to try to estimate the ideal throughput speed — the speed at which you can go through traffic without having to change your speed. This involves a couple of behavioral changes: you have to allow more space than normal between you and the car in front, you have to be aware of upcoming red lights, etc. Over the years I have found that some drivers behind me get really annoyed over the amount of space in front of me. They race around and drive as fast as they can to the red light ahead whereupon they have to immediately brake. Sometimes I get to go past these drivers since I approach the green light moving while they have to race me again from a dead stop.

It occurred to me the other day that if the driver behind me noticed that I was doing this, they could, in turn, engage in the same estimation game. Their estimate, however, would be better than mine since I am already smoothing out the flow in front of them. This computational chain could continue behind them, each time getting easier and easier to estimate the ideal throughput speed no matter how uncertain the traffic was in front of us. Is this a computation in series or in parallel? We are doing it at the same time so it is in parallel, but each driver depends on the estimate in front of them so it is in series.

Autonomous estimation of the shape of the landscape

In an earlier post, I asserted that it was possible to get a precise map from photographs that only required a translation and rotation. These operations are not enough. You also need a scale change. This is the well-known relative orientation problem in photogrammetry.

However, the conclusion still remains that it is operationally possible to precisely estimate the shape of the terrain without knowing anything about the actual locations of your imaging sensor. If you couple the photographs with a GPS receiver signal, you would then considerably narrow the scale change needed to overlay your constructed map on the actual shape of the landscape. In other words, a lot of GPS measurements would get you close to the true scale needed to overlay the map. Let me explain.

One can well imagine a noisy GPS receiver that makes you think that two locations of the aerial camera are closer than they actually are. This would result in the constructed map being at a smaller scale than the real world. Likewise, the GPS readings could make you think that the cameras were farther apart when the photographs were taken. This would inflate your map scale. But in either case, the correct scale would be close to the inferred scale.

Now imagine that you consider more and more photographs with their corresponding GPS readings. Since GPS readings are unbiased (one of their great virtues), it would be extremely unlikely in a probabilistic sense that your inferred scale would be far from the real world scale.
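A toy simulation can illustrate the argument. This is only a sketch under strong simplifying assumptions of my own: the camera positions lie along a single line, the reconstructed map gives those positions exactly in arbitrary model units, and the GPS noise is zero-mean Gaussian. The function names, the true scale of 2.0, the offset, and the 5-unit noise level are all made up for illustration.

```python
import random

def fit_scale(model_positions, gps_positions):
    """Least-squares slope of GPS readings against model coordinates (1-D)."""
    n = len(model_positions)
    mean_x = sum(model_positions) / n
    mean_g = sum(gps_positions) / n
    sxx = sum((x - mean_x) ** 2 for x in model_positions)
    sxg = sum((x - mean_x) * (g - mean_g)
              for x, g in zip(model_positions, gps_positions))
    return sxg / sxx

def simulate(n_photos, true_scale=2.0, offset=100.0, gps_sigma=5.0, seed=0):
    """Infer the map scale from n_photos noisy but unbiased GPS readings."""
    rng = random.Random(seed)
    # Camera positions in model units, spread along a hypothetical flight line.
    xs = [rng.uniform(0.0, 1000.0) for _ in range(n_photos)]
    # GPS reading = scaled and shifted position plus zero-mean noise.
    gs = [true_scale * x + offset + rng.gauss(0.0, gps_sigma) for x in xs]
    return fit_scale(xs, gs)

for n in (3, 30, 300):
    print(n, simulate(n))
```

Under these assumptions the inferred scale should settle closer and closer to the true value as photographs are added, because the zero-mean noise averages out in the least-squares fit.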

Users can take your map and easily overlay it on another map by performing three operations: translation, rotation, and a small scale change. This is an extremely easy thing to do in comparison to trying to get absolute accuracy from the get go!

The random binary detector with the lowest error cost variation

Last night I did some calculations to see which random detector has the lowest error cost variation. That is, given that the detector will be deployed in an uncertain environment, which random detector should you use to minimize its error cost variation? The math is simple so I will show it all in this post.

As in previous posts, I am restricting myself to the simplest case: a binary classification problem. We have two classes, call them A and B. The variability of the testing environment is codified by the single parameter $f_A$, the frequency of class A. The binary random detector is also described by a single parameter, $d_A$, the classification rate for class A.

The cost of misclassifying class A as class B will be denoted by $C_{AB}$, and the cost of misclassifying class B as class A by $C_{BA}$. The error cost or loss function in this simple example is then equal to
$L = C_{AB} f_A (1 - d_A) + C_{BA} (1 - f_A) d_A$.

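What random detector setting $d_A$ minimizes the magnitude of $\partial L / \partial f_A$, the loss variation under a changing environment? Calculating the derivative, $\partial L / \partial f_A = C_{AB}(1 - d_A) - C_{BA} d_A$, one quickly obtains that zero variation in the cost is attainable when
$d_A = \frac{C_{AB}}{C_{AB} + C_{BA}}$.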
Note that this is not the most accurate detector setting for a particular environment, $f_A$.
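The result is easy to check numerically. The sketch below evaluates the loss function from this post at the zero-variation setting for several environments. The function names and the example costs (3 for misclassifying A as B, 1 for misclassifying B as A) are chosen only for illustration.

```python
def expected_loss(f_a, d_a, c_ab, c_ba):
    """L = C_AB * f_A * (1 - d_A) + C_BA * (1 - f_A) * d_A."""
    return c_ab * f_a * (1.0 - d_a) + c_ba * (1.0 - f_a) * d_a

def stable_setting(c_ab, c_ba):
    """Detector setting d_A at which the loss does not depend on f_A."""
    return c_ab / (c_ab + c_ba)

c_ab, c_ba = 3.0, 1.0                 # illustrative costs only
d_a = stable_setting(c_ab, c_ba)      # 0.75 for these costs

# Sweep the environment: the loss stays the same for every class frequency.
for f_a in (0.0, 0.25, 0.5, 1.0):
    print(f_a, expected_loss(f_a, d_a, c_ab, c_ba))
```

With these costs the loss is 0.75 for every value of $f_A$. Because the loss is linear in $d_A$, a detector tuned to one known $f_A$ would sit at $d_A = 0$ or $d_A = 1$ and do better there, but its cost would then swing as the environment changed.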

The strategic advantage of stabilizing errors in uncertain environments

This weekend I was talking to a friend from graduate school about how to stabilize errors in uncertain environments by sacrificing accuracy, the subject of the previous two posts. During the conversation, I realized another advantage of stabilizing errors: it makes it easier to modify your detector/classifier in an adversarial situation.

The “arms race” game is a common one in conflict situations. You have some technological advantage over an opponent. The opponent adapts and changes their strategy. This is different from the issue of a noisy environment. Here the signal used by your detector is intentionally modified to circumvent its performance level. The enemy camouflages their tanks better. New strategies are deployed. All of which leads to an increase in the error rate of your detector.

A user confronted with this situation would want to slow down the increase in their error rate so that it stays below their learning rate. Such a user would be willing to sacrifice accuracy for the ability to adapt to an adversary. There is a strategic advantage to stabilizing errors.