<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Learning from your mistakes &#8212; Some ideas</title>
	<atom:link href="http://blog.afterthedeadline.com/2009/11/23/learning-from-your-mistakes-some-ideas/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.afterthedeadline.com/2009/11/23/learning-from-your-mistakes-some-ideas/</link>
	<description>Natural language processing blog.</description>
	<lastBuildDate>Tue, 10 Aug 2010 21:54:09 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: Can we use crowd sourcing to improve AtD? &#171; After the Deadline</title>
		<link>http://blog.afterthedeadline.com/2009/11/23/learning-from-your-mistakes-some-ideas/#comment-1213</link>
		<dc:creator>Can we use crowd sourcing to improve AtD? &#171; After the Deadline</dc:creator>
		<pubDate>Thu, 27 May 2010 18:02:43 +0000</pubDate>
		<guid isPermaLink="false">http://blog.afterthedeadline.com/?p=312#comment-1213</guid>
		<description>[...] to improve&#160;AtD?  Posted in Uncategorized by rsmudge on May 27, 2010   I&#8217;ve written about learning from AtD use in the past. The main ideas I had back then were to bring more data into AtD&#8217;s corpus and [...]</description>
		<content:encoded><![CDATA[<p>[...] to improve&nbsp;AtD?  Posted in Uncategorized by rsmudge on May 27, 2010   I&#8217;ve written about learning from AtD use in the past. The main ideas I had back then were to bring more data into AtD&#8217;s corpus and [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt</title>
		<link>http://blog.afterthedeadline.com/2009/11/23/learning-from-your-mistakes-some-ideas/#comment-489</link>
		<dc:creator>Matt</dc:creator>
		<pubDate>Tue, 01 Dec 2009 03:04:05 +0000</pubDate>
		<guid isPermaLink="false">http://blog.afterthedeadline.com/?p=312#comment-489</guid>
		<description>Human spot-check it.</description>
		<content:encoded><![CDATA[<p>Human spot-check it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: rsmudge</title>
		<link>http://blog.afterthedeadline.com/2009/11/23/learning-from-your-mistakes-some-ideas/#comment-488</link>
		<dc:creator>rsmudge</dc:creator>
		<pubDate>Tue, 01 Dec 2009 01:38:44 +0000</pubDate>
		<guid isPermaLink="false">http://blog.afterthedeadline.com/?p=312#comment-488</guid>
		<description>The trick is deciding what to do with the information when the system gets it. If we know a rule failed in a certain case, how should we feed this to the system so it benefits? One possibility is to add the sentence with that context into the corpora; that would at least aid the statistical filtering and improve the precision.

I haven&#039;t done any research into mining rules or developing an algorithmic approach to improving the grammar checker recall, but I think this would be an interesting area to work in. 

In case someone wants to pick it up, AtD has two data sets for evaluating the grammar checker. They&#039;re data/tests/grammar_gutenberg.txt and data/tests/grammar_wikipedia.txt.  These two files are merely phrases taken from the Wikipedia common grammar errors list merged with context from the sources in their name. There is also a script bin/testgr.sh that tests the precision and recall of the AtD grammar checker using these sources.

Before bringing user data in, I think using these datasets to measure the effectiveness of such an approach would make sense.</description>
		<content:encoded><![CDATA[<p>The trick is deciding what to do with the information when the system gets it. If we know a rule failed in a certain case, how should we feed this to the system so it benefits? One possibility is to add the sentence with that context into the corpora; that would at least aid the statistical filtering and improve the precision.</p>
<p>I haven&#8217;t done any research into mining rules or developing an algorithmic approach to improving the grammar checker recall, but I think this would be an interesting area to work in. </p>
<p>In case someone wants to pick it up, AtD has two data sets for evaluating the grammar checker. They&#8217;re data/tests/grammar_gutenberg.txt and data/tests/grammar_wikipedia.txt.  These two files are merely phrases taken from the Wikipedia common grammar errors list merged with context from the sources in their name. There is also a script bin/testgr.sh that tests the precision and recall of the AtD grammar checker using these sources.</p>
<p>Before bringing user data in, I think using these datasets to measure the effectiveness of such an approach would make sense.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Chris</title>
		<link>http://blog.afterthedeadline.com/2009/11/23/learning-from-your-mistakes-some-ideas/#comment-487</link>
		<dc:creator>Chris</dc:creator>
		<pubDate>Tue, 01 Dec 2009 01:28:00 +0000</pubDate>
		<guid isPermaLink="false">http://blog.afterthedeadline.com/?p=312#comment-487</guid>
		<description>I&#039;m wondering what could be done to increase the recall or precision of the grammar checker?  It seems if you gave users the ability to tag grammar corrections as &quot;false positive&quot; then you could build up a corpus of sentences where the rules failed.  With this data you could train your rules to improve precision.

Conversely, you could give users the ability to flag grammar mistakes or, more likely, sentences as having one or more grammatical errors.  In other words, you could build up a corpus of &quot;false negatives&quot; allowing you to improve recall.  

What are your thoughts on this?</description>
		<content:encoded><![CDATA[<p>I&#8217;m wondering what could be done to increase the recall or precision of the grammar checker?  It seems if you gave users the ability to tag grammar corrections as &#8220;false positive&#8221; then you could build up a corpus of sentences where the rules failed.  With this data you could train your rules to improve precision.</p>
<p>Conversely, you could give users the ability to flag grammar mistakes or, more likely, sentences as having one or more grammatical errors.  In other words, you could build up a corpus of &#8220;false negatives&#8221; allowing you to improve recall.  </p>
<p>What are your thoughts on this?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt</title>
		<link>http://blog.afterthedeadline.com/2009/11/23/learning-from-your-mistakes-some-ideas/#comment-478</link>
		<dc:creator>Matt</dc:creator>
		<pubDate>Mon, 23 Nov 2009 23:05:17 +0000</pubDate>
		<guid isPermaLink="false">http://blog.afterthedeadline.com/?p=312#comment-478</guid>
		<description>There&#039;s no reason to have everything be 100% algorithmic; a hybrid approach with certain human-edited rules (like for &quot;alot&quot;) is probably the best bet long-term.</description>
		<content:encoded><![CDATA[<p>There&#8217;s no reason to have everything be 100% algorithmic; a hybrid approach with certain human-edited rules (like for &#8220;alot&#8221;) is probably the best bet long-term.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
