User talk:Sambit/GSoC proposal 2017: Odia and English


feedback

"Bootstrapped new language pair(odi.eng) with existing eng monodix." – the language code is "ori", no? https://en.wikipedia.org/wiki/ISO_639:ori

On pairs with English: we're normally skeptical about them, since English has so much data that corpus-based methods work very well and it's very difficult to get higher quality than Google etc. In this case things are different, though, since there's no Odia in Google. I'd still mention in the proposal how the Odia morphology+tagger that you're making will also be good groundwork for future translators with Odia to/from related languages (of which there are many to pick from :)).

You haven't mentioned anything about what resources you plan on using – what have you found so far? (grammars, dictionaries, corpora, any existing NLP stuff at all?)