Difference between revisions of "User talk:Sambit/GSoC proposal 2017: Odia and English"

From Apertium

Latest revision as of 17:31, 5 April 2017

feedback

"Bootstrapped new language pair(odi.eng) with existing eng monodix." – the language code is "ori", no? https://en.wikipedia.org/wiki/ISO_639:ori

On pairs with English: We're normally skeptical about these, since English has so much data that corpus-based methods work very well, and it's very difficult to get higher quality than Google etc. In this case things are different, since there's no Odia in Google. I'd still mention in the proposal how the Odia morphology and tagger that you're making will also be good groundwork for future translators between Odia and related languages (of which there are many to pick from :)).

You haven't mentioned anything about what resources you plan on using – what have you found so far? (grammars, dictionaries, corpora, any existing NLP stuff at all?)