One year of After the Deadline
It’s been an exciting year. On 21 Jul 09, I started with Automattic. Matt and I had worked out the deal several weeks earlier. We announced the acquisition of After the Deadline in Sept 09 and also made AtD available on WordPress.com.
I remember I was a little nervous about going live on WordPress.com. AtD is written in my language Sleep. I’ve used Sleep for a lot of things but not for the back-end of a web-scale project before. I was afraid of a memory leak or a freak concurrency issue. Fortunately, neither of these issues came up.
Open Source NLP R&D
Shortly after that, we open sourced the After the Deadline service. This is something that will take time to have its impact, but make no mistake, it’s significant.
Using statistics to provide better proofreading is nothing new. Researchers pursued the topic in the 90s and during the earlier part of the last decade. Production tools are starting to use statistical language models to provide smarter suggestions and even correct harder errors like misused homophones. Microsoft Word 2007 has a contextual spell checker that looks for misused words. Microsoft Research is developing ESL Assistant, a tool that uses a statistical language model to filter incorrect grammar suggestions. There are also new tools like Ginger and Ghotit that use statistical techniques to deliver smarter results for writers with learning disabilities. I believe cheap and powerful hardware, lots of available data, and persistent internet connectivity made these smarter, data-driven writing tools practical for production use. We’re riding the same wave of “now possible”.
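To make the idea of a statistical language model catching a misused homophone concrete, here is a minimal sketch in Python. It is a toy illustration, not After the Deadline's actual code: the corpus, the confusion set, and the function names (train_bigrams, score, suggest) are all made up for this example, and it uses a simple add-one smoothed bigram model where a real system would train on far more data.

```python
from collections import Counter

# Toy training corpus (an assumption for illustration only).
CORPUS = [
    "i want to go to the park",
    "they left their coats at the park",
    "put it over there by the door",
    "their house is over there",
    "we went to their house",
]

# Hypothetical confusion set: words that are easily misused for one another.
CONFUSION_SET = {"their", "there"}


def train_bigrams(sentences):
    """Count unigrams and bigrams, padding each sentence with boundary tokens."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def score(prev, word, nxt, unigrams, bigrams):
    """Add-one smoothed P(word | prev) * P(nxt | word)."""
    vocab_size = len(unigrams)
    p_left = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
    p_right = (bigrams[(word, nxt)] + 1) / (unigrams[word] + vocab_size)
    return p_left * p_right


def suggest(tokens, i, unigrams, bigrams):
    """Return the confusion-set member that best fits the context at position i.

    Words outside the confusion set are returned unchanged.
    """
    word = tokens[i]
    if word not in CONFUSION_SET:
        return word
    prev = tokens[i - 1] if i > 0 else "<s>"
    nxt = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
    # sorted() keeps the result deterministic if two candidates tie.
    return max(
        sorted(CONFUSION_SET),
        key=lambda cand: score(prev, cand, nxt, unigrams, bigrams),
    )


unigrams, bigrams = train_bigrams(CORPUS)
tokens = "we went to there house".split()
print(suggest(tokens, 3, unigrams, bigrams))
```

The point of the sketch is that the spelling of "there" is fine on its own; only the surrounding words reveal that "their" is the better fit. More data and a longer context window are what turn this from a toy into something useful.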
I’m excited about After the Deadline’s place in this period of change. After the Deadline is simultaneously a production system and a research system. The code is available for researchers and students to tinker with and learn from. Let’s not forget, this also means that you can run your own AtD server and add AtD to your application.
Recently, this project produced its first academic paper. On Sunday (6 Jun 10), I will present After the Deadline at the Workshop on Computational Linguistics and Writing, taking place at the 2010 conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT).
After the Deadline went from one to five languages in the past year. We’ve released preliminary support for French, German, Portuguese, and Spanish. We offer contextual spell checking in these languages. We also use our language model to make the LanguageTool grammar checker smarter. There is still much work to be done to bring our misused word detection to more languages.
At WordCamp NYC, someone approached me with “I love After the Deadline but I always forget to run it”. He suggested we add a feature to automatically proofread posts on submit. No good idea should get lost, so I posted this to the ideas page. Later, I received an email from Mohammad Jangda, who offered to implement this feature. I first made his patch live on WordPress.com. Without an announcement, 500 people were enabling it each day. Over time, auto-proofread doubled the use of After the Deadline on WordPress.com. This same feature has made it into our other platforms as well.
Our wish is to see AtD help people write better in as many places as possible. We put a lot of effort into making high-quality plugins, so it’s nice when we get help. Gautam Gupta is a great example of such help. He created After the Deadline for bbPress. He and I release updates around the same time and he usually beats me to the punch. My favorite is when he announced AtD/bbPress with support for French, German, Portuguese, and Spanish before I had an updated WordPress plugin out the door.
As I mentioned in the last paragraph, After the Deadline is now available in a lot more places. We have stable plugins for jQuery and TinyMCE. The AtD Core library has allowed us to reuse the protocol parsing and error highlighting logic in many projects.
We now have After the Deadline for Firefox and Google Chrome. I’m amazed at how well these add-ons work. I didn’t believe they were possible. Mitcho Erlewine took on the initial challenge and worked with us to make After the Deadline for Firefox a reality.
We continue to experiment with other applications too. Who knows where you might see AtD next?
Lots of Proofreading
Last month, our AtD servers processed 3.5 million blog posts, emails, tweets, status updates, and who knows what else.
That’s a lot of proofreading. Not bad for a first year.