I’m teaching myself HTML5 to create a WordPress theme that is friendly to mathematics and to illustrations like plots and maps within a blog entry. I’ve started with the WordPress theme framework Toolbox. As you can see, the site is very plain now.
My first task is to fix the broken math equations that you can see below (the ones enclosed by dollar sign symbols). These are magically turned into mathematical equations by MathJax. I then plan to try out different fonts, starting with Cambria Math.
Unpacking and Euclid’s Geometry
Engaged once more in the bittersweet act of unpacking books after a move, I hit upon Oliver Byrne’s 19th-century refashioning of Euclid’s Elements. The first two lines of its preface read:
The arts and sciences have become so extensive, that to facilitate their acquirement is of as much importance as to extend their boundaries. Illustration, if it does not shorten the time of study, will at least make it more agreeable.
Byrne illustrates his point (pun intended!) with a rendition of Euclid’s Elements that is a visual delight. In the first six books (out of thirteen), all mathematical symbols for angles, triangles, and lines are turned into icons. A yellow pie wedge represents angle A, say, with the angle the wedge subtends showing a particular value for that angle. So with yellow, red, and blue colors along with solid or dashed lines he turns all of Euclid’s proofs into a visual set of equations. You can get a taste for it here.
If I have time later today, I’ll try to mock up an example using Mathematica.
Forensic science, entropy and the law of being there
You always leave something behind and you always take something from where you were – the law of being there. I learned this law sitting in the stacks of the Johnson annex of the Boston Public Library. I was taking a summer class on chemistry at Copley Square High School. The gimmick in the course was that it used forensic science to teach chemistry and other basic sciences. So I decided to read more about forensic criminology. One book stated that the fundamental law of the field was that a criminal always leaves something behind and always takes something from the scene of the crime. Its obvious veracity struck me immediately. It seems trivial in retrospect.
But the “law of being there” is not just a trivial observation. It is another manifestation of the second law of thermodynamics – the entropy of a closed system will either stay constant or increase. In the case of forensic science this entropy arrow means that physical processes will mix two bodies of matter. Our shoes have mud from all the puddles we stepped in today. A portion of our footprint may still remain in some of those puddles. Perhaps even a few molecules from the soles of our shoes are to be found there as well.
And why do things mix? Because there are many more ways of being mixed than of being separate. In a world where Avogadro’s number is roughly $10^{24} \text{mol}^{-1}$, there are going to be vastly more ways of being mixed than of staying separate. If Avogadro’s number were something like 10 inverse moles, then we would not need to clean so much!
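To put rough numbers on this counting argument, here is a minimal sketch under a toy model that is my own assumption, not something from the forensic literature: two kinds of particles, n of each, placed on 2n lattice sites. The number of arrangements is the binomial coefficient C(2n, n), while only two of them are fully separated (all of one kind on the left, or all on the right). The function name `log10_mixed_ways` is made up for this illustration.

```python
import math

def log10_mixed_ways(n: float) -> float:
    """log10 of C(2n, n): ways to place n A-particles on 2n lattice sites."""
    # lgamma(k + 1) equals ln(k!), so we never build astronomically large integers.
    ln_ways = math.lgamma(2 * n + 1) - 2 * math.lgamma(n + 1)
    return ln_ways / math.log(10)

# Hypothetical world with only 10 particles of each kind.
print(math.comb(20, 10))       # 184756 arrangements, only 2 of them fully separated

# Our world, with roughly 10^24 particles of each kind.
print(log10_mixed_ways(1e24))  # about 6e23, i.e. roughly 10^(6e23) arrangements
```

In the small world, stumbling on a separated arrangement is merely unlikely. In ours, the exponent itself is of the order of Avogadro’s number, so it never happens in practice.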
[Addendum] The comments above came about because I started reading “The Killer of Little Shepherds: A True Crime Story and the Birth of Forensic Science” by Douglas Starr. I just reached chapter 10 – “Never without a trace” – and he mentions the “Locard Exchange Principle”: every contact leaves a trace. This was first enunciated by Edmond Locard. He was one of the many students of the forensic medical pioneer Lacassagne, who figures prominently in Starr’s book. I like the pithiness of “every contact leaves a trace”. It seems to be universal in its applicability, not only to the physical world (as embodied in the 2nd law for physical processes) but also to our interior lives. So now I can say “the law of being there = Locard’s exchange principle = every contact leaves a trace = entropy increases”.
“Wrong” again
Reading David Freedman’s “Wrong” is making me feel dirty. As a scientist, that is. He has collected a whole menagerie of ways that experts can go wrong. From cops to doctors, business gurus to scientists, he spares no one. The two chapters that are discomforting for me are “The trouble with scientists, Part I” and “The trouble with scientists, Part II”. He focuses on “surrogate measurements” – the use of one thing to measure another one. We cannot test new drugs on humans, so we test them on mice. We cannot tell what you are thinking, so we take an fMRI scan to see where blood is flowing in your brain. One devastating example of how this can go wrong is the drug Avastin.
Avastin shrinks cancer tumors. That was what it was designed to do, and it does it well. Shrink the tumor, cancer cured. That was the theory, at least. One measurement, tumor size, is a surrogate for another one – cancer prognosis. In this case, a negative surrogate, if you will – decreasing one (tumor size) increases the other one (cancer survival). Trouble is, it decreases tumors but it does not increase survival. And it increases other risk factors (heart attack, bowel perforations).
Freedman then goes on to discuss how the 6 billion a year that is being spent in bioinformatics to link genes to disease has yielded so few tangible medical benefits – an observation recently mentioned in connection with the tenth anniversary of sequencing the human genome. Again, surrogate measurements are to blame: the presence of genetic variants in diseased individuals is taken to measure the likelihood of the disease and is itself taken to be an explanatory cause. You get the disease because you have the defective genes.
I am trying to think how this relates to probability and Bayes’ Theorem. Is it that we assume that if event $b$ follows event $a$, the converse must be true? Or, in mathematical terms:
$$ p(b | a) \approx p(a | b)$$
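One way to make the question concrete, as a sketch with made-up numbers that do not come from Freedman’s book: Bayes’ Theorem ties the two conditional probabilities together through the base rates of the two events,
$$ p(a | b) = \frac{p(b | a) \, p(a)}{p(b)} $$
so the approximation above can only hold when $p(a) \approx p(b)$. Suppose, hypothetically, that $p(a) = 0.01$, $p(b) = 0.10$ and $p(b | a) = 0.9$. Then
$$ p(a | b) = \frac{0.9 \times 0.01}{0.10} = 0.09 $$
which is ten times smaller than $p(b | a)$. Treating the two as interchangeable amounts to ignoring the base rates.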
This notion of surrogate measurements also occurs in my field, machine learning. One salient example with which I am familiar was mentioned in 2003 at a talk by an NSA director at MITRE. He was discussing the large funding effort given to large vocabulary continuous speech recognizers (the LVCSR program) in the 1990s. NSA scientists were convinced that if we could word spot, we could make better filters for security-related communications. Select the telephone calls that mention words like “bomb”, etc., and you get a higher proportion of messages that are related to your security concerns. The LVCSR program (in which I participated during my years at Dragon Systems) had the natural arc of these research initiatives. Rapid gains were observed at the beginning, but eventually all systems saturated to the same level of performance.
The funders wanted to pull the plug when this happened. The program directors responded by carrying out a cheating experiment. They concocted a test set where they had perfect transcriptions of a set of telephone conversations and measured how well they could pick out the threatening ones. To their surprise, word spotting sucked as a detector of high-importance messages. The ability to solve one problem (speech recognition) was not the answer to the real problem (spot the threatening calls). Even if you had perfect recognition, which we did not at that time, you still would not be able to significantly increase your chance of picking out the communications with high security relevance.
Lest you conclude from this story that the LVCSR program directors should have conducted this cheating experiment at the beginning, before they spent the money, consider the incredible economic benefits that have resulted from our cheap access to very good continuous speech recognizers. To paraphrase something I once heard from Fred Jelinek – research is hard and frequently useless, but the alternative is worse. By the way, the exact quote was “I hate work, but the alternative is worse.”
MathJax and GreaseMonkey
Another geeky trick for you lovers of math on the web. Using GreaseMonkey and a small snippet of JavaScript from the MathJax site, you can render MathML on any site with MathJax.
You can visit a site like Wikipedia and lift the LaTeX or MathML source of any equation in the articles for instant inclusion in your own documents!
MathJax
I’ve installed the MathJax AJAX script for rendering MathML over at Data Engines. This site still uses itex2MML, which has been reliable. One of the bells and whistles in MathJax is that it allows you to lift LaTeX code directly from the post to put into a LaTeX document. Or you can get the MathML output and drop the equations into Word. Pretty geeky stuff!
Unsupervised Inference
I am now one of the principals in Data Engines Corporation. The blog postings related to my research work will now appear mostly there with occasional cultural and philosophical musings related to them here. This blog will continue as the site for personal reflections on my readings.
The big news for us is that we are finally going public with algorithms that can carry out unsupervised inference in the labeling and ranking tasks. Check out our work at dataengines.com if you are interested in this sort of machine learning geeky stuff.
Scroogenomics
I’m sending out a birthday card with a cash gift. It feels strange to give cash. We seem to attach a “tacky” factor to doing so. Leave it to economist Joel Waldfogel to write a book (“Scroogenomics”) on the 20 to 30 percent waste in value that occurs each Christmas season. He argues that cash is actually the best gift we can give someone in most cases.
The gift recipient knows best what they want, and giving cash allows them to get the most value for that expenditure. In other words, spending may or may not create value. When I guess what the other person may like, I open the opportunity for destroying the value of the exchange. Unless I know the recipient extremely well or can anticipate their needs perfectly, my gift may have lower value to them than the money spent on the gift. Giving cash allows the recipient to select what they most value as a gift, hence no gift can do better than cash in creating that maximum value. So if we must spend to give gifts, let’s make it most beneficial by giving cash.
I feel better already!
The Data Deluge: We are better at searching than organizing
We are awash in data. Computers and cheap memory storage make it possible for us to collect 1/4 of a gigabyte of information for every woman, man and child on the planet. Most of it is unused or unusable. We may be relatively good at searching this data but we suck at organizing it. Google is much better at finding than at organizing the data it processes. Digital pictures lie in our computers unorganized because we do not have the time to label every picture in which grandma appears.
On the usefulness of glorious failures
Jon Stewart’s comments on the usefulness of the Apollo missions:
It was that fateful day in July that we planted the Stars and Stripes in the lunar surface, officially claiming the moon as America’s space Puerto Rico. It was all ours. It was the culmination of a dream. … It took us ten years, astronauts’ lives, billions of dollars, and all we did is hit a f***ing golf ball?
have made me think about how exploration is a good unto itself even when it fails. Indeed, it seems that the unintended consequences or discoveries of such explorations are good enough to justify our urge to carry them out.
I offer as my first example of unintended consequences a single photo – the famous Earthrise captured by the Apollo 8 mission on its fourth orbit around the Moon. As recounted in Robert Poole’s Earthrise: How Man First Saw the Earth, it was mission commander Frank Borman who first noticed the Earth rising above the gray moonscape and pointed it out to William Anders, the mission scientific crew member. During the previous three lunar orbits, Anders had been busy photographing the lunar surface for possible landing sites. Anders handed the camera with black and white film he was using to Borman but then proceeded to warn him not to take any pictures – “Hey, don’t take that, it’s not scheduled.” Quickly reversing himself, Anders picked up a roll of color film and took the now iconic photo. The cultural influence of this photo is richly detailed in Poole’s book. I’ll mention one instance that greatly influenced me: it was used as the front cover image for the Last Whole Earth Catalog. As others have pointed out, it is ironic that not one of the people who envisioned space travel ever thought that its most important result would be the perspective we would gain on our home planet. Norman Cousins stated during a 1975 Congressional hearing on the future of space travel: “what was most significant about the lunar voyage was not that men set foot on the Moon, but that they set eye on the Earth.”
My second example of the unintended consequences of exploration, and of how failures can sometimes be glorious, is Columbus’ voyage to the Americas. Wishing to reach the East Indies, and believing that the Earth was half as big as it really is, he hit upon the Americas and to his last days did not realize what he had done. We do not know what we will find when we search out from the safety of our known world, and that is why we should.