<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: A simple example of how to stabilize classification errors in uncertain environments by sacrificing accuracy</title>
	<atom:link href="http://www.corrada.com/blog/2007/10/03/a-simple-example-of-how-to-stabilize-classification-errors-in-uncertain-environments-by-sacrificing-accuracy/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.corrada.com/blog/2007/10/03/a-simple-example-of-how-to-stabilize-classification-errors-in-uncertain-environments-by-sacrificing-accuracy/</link>
	<description>Randomness, entropy, pattern matching, maps, geometry, knots, and scientific readings</description>
	<lastBuildDate>Mon, 26 Apr 2010 20:47:41 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: Andres</title>
		<link>http://www.corrada.com/blog/2007/10/03/a-simple-example-of-how-to-stabilize-classification-errors-in-uncertain-environments-by-sacrificing-accuracy/comment-page-1/#comment-40</link>
		<dc:creator>Andres</dc:creator>
		<pubDate>Mon, 08 Oct 2007 12:42:05 +0000</pubDate>
		<guid isPermaLink="false">http://www.corrada.com/blog/?p=13#comment-40</guid>
		<description>
I agree, Will. As you say, the question really is what metric is most relevant to the task at hand. My rhetorical excesses are meant to counteract the obsession with highest accuracy that I encounter every day at work.

Since I am using random detectors, the area under the ROC curve (AUC) is fixed for all of my examples. As you say, clients want us to reduce the AUC. But not always. I have a funny story about this from when I worked at Dragon Systems.

Dragon participated in a series of speaker identification competitions sponsored by the NSA in the 1990s. I was lead developer for a system that won 6 out of 9 testing conditions. By ‘won’, I mean we got the lowest AUC of all the entries, including those from research powerhouses like IBM and BBN. George Doddington had stated he would give 25 dollars for the ‘best’ system in each testing condition. When he got up to award his prizes during the conference with the competition participants, I started licking my chops over the 150 dollars I was going to get. George then proceeded to award all the prizes to other systems because his criterion was not lowest AUC! He chose the systems that set their operating point at their lowest cost! That is, even though we had fielded a system that could attain the lowest cost for 6 of the 9 conditions, we had not set it to operate at that lowest cost.

The lesson to me: users sometimes want to minimize costs with what they have.

</description>
		<content:encoded><![CDATA[<p>I agree, Will. As you say, the question really is what metric is most relevant to the task at hand. My rhetorical excesses are meant to counteract the obsession with highest accuracy that I encounter every day at work.</p>
<p>Since I am using random detectors, the area under the ROC curve (AUC) is fixed for all of my examples. As you say, clients want us to reduce the AUC. But not always. I have a funny story about this from when I worked at Dragon Systems.</p>
<p>Dragon participated in a series of speaker identification competitions sponsored by the NSA in the 1990s. I was lead developer for a system that won 6 out of 9 testing conditions. By ‘won’, I mean we got the lowest AUC of all the entries, including those from research powerhouses like IBM and BBN. George Doddington had stated he would give 25 dollars for the ‘best’ system in each testing condition. When he got up to award his prizes during the conference with the competition participants, I started licking my chops over the 150 dollars I was going to get. George then proceeded to award all the prizes to other systems because his criterion was not lowest AUC! He chose the systems that set their operating point at their lowest cost! That is, even though we had fielded a system that could attain the lowest cost for 6 of the 9 conditions, we had not set it to operate at that lowest cost.</p>
<p>The lesson to me: users sometimes want to minimize costs with what they have.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Will Dwinnell</title>
		<link>http://www.corrada.com/blog/2007/10/03/a-simple-example-of-how-to-stabilize-classification-errors-in-uncertain-environments-by-sacrificing-accuracy/comment-page-1/#comment-39</link>
		<dc:creator>Will Dwinnell</dc:creator>
		<pubDate>Wed, 03 Oct 2007 19:50:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.corrada.com/blog/?p=13#comment-39</guid>
		<description>
I think the question at hand is: “Which performance metric is most relevant?” Observation-level accuracy is known to suffer from exactly the peculiarity which you describe. My experience in practice, though, is that many clients want probability outputs (as opposed to class label outputs) and need to maximize class separation (measured by, for instance, AUC).

</description>
		<content:encoded><![CDATA[<p>I think the question at hand is: “Which performance metric is most relevant?” Observation-level accuracy is known to suffer from exactly the peculiarity which you describe. My experience in practice, though, is that many clients want probability outputs (as opposed to class label outputs) and need to maximize class separation (measured by, for instance, AUC).</p>
]]></content:encoded>
	</item>
</channel>
</rss>

