Archive for the 'Compressed Sensing' Category

What is more important: precision or accuracy?

I have just uploaded our autonomous precision error estimation Swiss conference paper to the new UMass/Amherst Digital Repository. I work at the Aerial Imaging and Remote Sensing Lab at UMass/Amherst. For years, Howard Schultz has been doubling the number of Digital Elevation Models (DEMs) his Terrest system makes by using the fact that computer stereo matching algorithms are not symmetric in their inputs. The paper above proves mathematically that this is indeed correct and possible if the correlation between DEMs is sparse, i.e. only a few of the DEMs are highly correlated.

We are now writing a proposal to further this work by connecting it with compressed sensing ideas as we briefly hint at the end of the paper above. This has got us thinking about the difference between precision and accuracy in measurement errors. Precision is the width of your error bars, accuracy is where the centroid of the measurement is located relative to the ‘true’ value. So if you had to choose between a precise and an accurate map, which one would you choose?

A precise map is one that captures the shape of the world very well, it has a lot of details. An accurate map is one that tells you where objects are located in the real world. An accurate but imprecise map is located correctly but it is very fuzzy. The landscape looks melted. A precise but inaccurate map is located wrong but has lots of detail — you tried to map a desert patch and the map says you mapped Paris (there is a caveat to this characterization — horizontal correlation — that I don’t want to get into in this post). The practical significance of this difference is enormous. A precise but inaccurate map is just a rigid body transformation and a scale change of the real world (3 + 3 + 1 = 7 unknown parameters) that can be fixed by measuring 3 ground control points (3 + 3 + 3 = 9 measurements). An imprecise but accurate map would require thousands of measurements to recover the detail lost in the fuzzy estimate. Therefore, precise maps are cheaper to make than accurate ones since measuring ground control points is a time consuming expensive task.

Furthermore, for some tasks, precision is all you really care about. For example, scientist Andrea Laliberte at the Jornada Experimental Range in south-central New Mexico is interested in classifying invasive species in the desert. For this task you need precision, not accuracy. Of course, if you wanted to bomb a particular shrub, you would want accuracy. My point is that precision is sometimes good enough and therefore you can sacrifice accuracy. Does this sound familiar?

Compressed sensing, underdetermined linear systems, and autonomous error estimation

I am co-writing a proposal to the National Science Foundation to apply ideas from compressed sensing to the problem of autonomously estimating mapping errors. This has got me thinking about this quote from Dave Donoho’s paper For Most Large Underdetermined Systems of Linear Equations, the minimal l 1 -norm solution is also the sparsest solution.:

Many situations in science and technology call for solutions to undetermined systems of equations, i.e. systems of linear equations with fewer equations than unknowns. Examples in array signal processing, inverse problems, and genomic data analysis all come to mind. However,any sensible person working in such fields would have been taught to agree with the statement: “you have a system of linear equations with fewer equations than unknowns. There are infinitely many solutions.” And indeed, they would have been taught well. However, the intuition imbued by that teaching would be misleading.

On closer inspection, many of the applications ask for sparse solutions of such systems, i.e. solutions with few nonzero elements; the interpretation being that we are sure that ‘relatively few’ of the candidate sources, pixels, or genes are turned ‘on’, we just don’t know a priori which ones those are. Finding sparse solutions to such systems would better match the real underlying situation. It would also in many cases have important practical benefits, i.e. allowing us to install fewer antenna elements, make fewer measurements, store less data, or investigate fewer genes.

We have been taught to throw up our hands whenever a problem is labeled as undeterminate — it has fewer equations than unknowns. But this is too pessimistic. What indeterminacy really means is something more like an iron-clad guarantee. To say that a linear system Ax=y is undeterminate means that we cannot solve for a unique x given any y. In other words, the null space of the linear operator A has a non-zero dimension. But here is the crucial caveat that underlies compressed sensing as stated by Donoho above –most natural signals are sparse. We do not get to see all possible y outputs. The space of all facial images is much smaller than the space of all possible images. This is precisely the point of manifold learning techniques in machine learning. So indeterminacy just means that we do not a 100% guarantee. But in a very practical sense, this does not matter when considering many natural signals. We can recover the input signal in spite of undeterminacy.

Examples of this phenomena are experienced by computer power users all the time. Text compression algorithms, which are lossless, cannot be a 100% percent successful. It can be proven that for some input string the compression will fail: the “compressed” text will be larger in size than the uncompressed one. However, when was the last time someone using gzip saw this happen? It does not happen because natural language text does not span the space of all possible character strings. A thousand monkeys pounding on keyboards produce mostly gibberish. This sparsity of natural language means that we can implement practical text compression algorithms that will compress text with an effective rate of 100%.

Rediscovering the usefulness of Euler’s formula

I have started to work thru Anton Deitmar’s “A First Course in Harmonic Analysis”. This is background reading for my perusal of the Compressed Sensing literature. On the first chapter he wants to prove the convergence in the L 2 -norm of Fourier series for periodic functions. To do this he asserts (offhandedly to my untrained eyes) the equality
k=1 ncos(2 πkx)=sin((2 n+1 )πx)2 sin(πx)1 2 .

It is embarrassing how long it has taken me to prove this on my own. In the end, Euler’s formula,
e iθ=cosθ+isinθ,
ended up being the right thing to use since it turns the proof into checking the equality of polynomials. My proof is inductive. First prove the n=1 case,
cos(2 πx)=sin(3 πx)2 sin(πx)1 2 ,
by rewriting it as,
sin(πx)cos(2 πx)=1 2 (sin(3 πx)sin(πx))
(e iθe iθ2 i)(e i2 θ+e i2 θ2 )=1 2 (e i3 θe i3 θ2 ie iθe iθ2 i),
the demonstration of the equality becomes a polynomial check. Induction to the case n+1 by using the same trick with Euler’s formula leads to the general proof of the formula.