Can we use crowdsourcing to improve AtD?
I’ve written about learning from AtD use in the past. The main ideas I had back then were to bring more data into AtD’s corpus and analyze ignored phrases to find gaps in AtD’s dictionary. I put some time into these ideas but the initial results didn’t look too promising, so I backed off.
Recently, the operator of Online Rechtschreibprüfung 24/7 contacted me. His website offers German spell and grammar checking services (with a beta version using AtD). Neat stuff. Being the nice guy that he is, he is also giving back. His users have the option of marking a flagged spelling error as a false positive. He has collected this data and made it available to improve the German After the Deadline dictionary.
Here are some ideas:
- Add a “Not Misspelled?” menu item for spelling errors. This could collect a list of words that are candidates to be added to AtD’s dictionary, similar to what Online Rechtschreibprüfung 24/7 does.
- Add a “Not an error?” menu item for grammar and style errors. This could collect the error and the context around it.
- Add a “Better suggestion?” menu item for grammar, style, and misused word errors. Here you could input a better suggestion for an error.
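The feedback from these three menu items could be collected as simple structured records. Here's a rough sketch of what that might look like; the field names and the `make_feedback` helper are my own invention, not part of AtD:

```python
import json

def make_feedback(kind, text, context="", suggestion=""):
    """Build one crowd-sourced feedback record as a JSON string.

    kind: "not_misspelled", "not_an_error", or "better_suggestion"
    text: the word or phrase the checker flagged
    context: surrounding text (useful for grammar/style errors)
    suggestion: a user-supplied replacement, if any
    """
    record = {"kind": kind, "text": text}
    if context:
        record["context"] = context
    if suggestion:
        record["suggestion"] = suggestion
    return json.dumps(record)

# "Not Misspelled?" -- a candidate word for the dictionary
print(make_feedback("not_misspelled", "Rechtschreibpruefung"))

# "Not an error?" -- the flagged error plus the context around it
print(make_feedback("not_an_error", "their", context="over their by the tree"))

# "Better suggestion?" -- a user-supplied replacement
print(make_feedback("better_suggestion", "less", suggestion="fewer"))
```

Each record is small and self-describing, so they could be batched up and reviewed later, much like the false-positive list from Online Rechtschreibprüfung 24/7.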
These three things are pretty trivial to do. I'd also like to find a way to let users highlight errors AtD doesn't catch: maybe an option to highlight some text, click "Suggest an Error," and complete a short survey about what type of error the text contains. I don't have any ideas for magically learning from these suggestions; right now I'd have to analyze each one and develop rules to catch these errors.
What are your thoughts?