Difference between revisions of "Task ideas for Google Code-in/Add lexical-select rules"

From Apertium
(Created page with '# select a language pair that is already set up for lexical selection, ideally such that the source language is a language you know (L₂) and the target language a language you …')
 
(regression test)
 
(One intermediate revision by one other user not shown)
Line 1: Line 1:
# select a language pair that is already set up for lexical selection, ideally such that the source language is a language you know (L₂) and the target language a language you use every day (L₁).
+
# '''Select a language pair''' that is already set up for lexical selection, ideally such that the source language is a language you know (L₂) and the target language a language you use every day (L₁).
# Install Apertium locally from the Subversion repository; install the language pair; make sure that it works and/or get [http://wiki.apertium.org/wiki/Apertium_VirtualBox Apertium VirtualBox] and update, check out & compile the language pair.
+
# '''Install Apertium''' through your package manager; install the language pair from [https://github.com/apertium/ GitHub]; make sure that it works.
# Using a large enough corpus of the source language (e.g. plain text taken from Wikipedia, newspapers, literature, etc.), detect cases of inadequate lexical choice, that is, the translation is grammatical but the translation selected for one word is not correct (because the source word is polysemous or has more than one meaning).
+
# Using a large enough corpus of the source language (e.g. plain text taken from Wikipedia, newspapers, literature, etc.), '''detect cases of inadequate lexical choice''', that is, cases where the translation is grammatical but the translation selected for one word is not correct (because the source word is polysemous or has more than one meaning).
# Add entries to the bilingual dictionary if needed and write 10 lexical selection rules that select the correct translation in the relevant context.
+
# Add entries to the bilingual dictionary if needed and '''write a lexical selection rule''' that selects the correct translation in the relevant context. You'll want to write 10 rules in all.
# Compile and test again.
+
# '''Compile and test''' again to make sure your work did what it was supposed to.
  +
# '''Add a regression test''' to the appropriate wiki page, so if you're doing Kyrgyz->English, add a test to [[English_and_Kyrgyz/Transfer_tests]].
# Submit a patch to your mentor (or commit it if you have already gained developer access).
 
  +
# '''Submit a pull request''' on GitHub and provide your mentor with its URL.
   
 
[[Category:Tasks for Google Code-in|Add lexical-selection rules]]
 

Latest revision as of 21:39, 15 December 2019

  1. Select a language pair that is already set up for lexical selection, ideally such that the source language is a language you know (L₂) and the target language a language you use every day (L₁).
  2. Install Apertium through your package manager; install the language pair from GitHub; make sure that it works.
  3. Using a large enough corpus of the source language (e.g. plain text taken from Wikipedia, newspapers, literature, etc.), detect cases of inadequate lexical choice, that is, cases where the translation is grammatical but the translation selected for one word is not correct (because the source word is polysemous or has more than one meaning).
  4. Add entries to the bilingual dictionary if needed and write a lexical selection rule that selects the correct translation in the relevant context. You'll want to write 10 rules in all.
  5. Compile and test again to make sure your work did what it was supposed to.
  6. Add a regression test to the appropriate wiki page, so if you're doing Kyrgyz->English, add a test to English_and_Kyrgyz/Transfer_tests.
  7. Submit a pull request on GitHub and provide your mentor with its URL.
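
The "compile and test" and "regression test" steps above can be partly automated. What follows is only a minimal sketch, not an official Apertium tool: the function names `apertium_translate` and `failing_cases` are hypothetical, and it assumes that the `apertium` command is on your PATH, that the compiled language pair sits in the directory you pass in, and that the mode name (for example `kir-eng`) matches your pair. Each case pairs a source sentence with the translation you expect once your new rule applies.

```python
import subprocess
from typing import Callable, Iterable, List, Tuple


def apertium_translate(text: str, mode: str, pair_dir: str = ".") -> str:
    """Translate text by piping it through the apertium command-line tool.

    Assumption: `mode` (e.g. "kir-eng") and `pair_dir` match a language
    pair that you have already compiled locally.
    """
    result = subprocess.run(
        ["apertium", "-d", pair_dir, mode],
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def failing_cases(
    cases: Iterable[Tuple[str, str]],
    translate: Callable[[str], str],
) -> List[Tuple[str, str, str]]:
    """Return (source, expected, actual) for every case that does not match."""
    failures = []
    for source, expected in cases:
        actual = translate(source).strip()
        if actual != expected:
            failures.append((source, expected, actual))
    return failures
```

To use it, build a list of `(source, expected)` pairs from the sentences where you found a wrong lexical choice, then call `failing_cases(cases, lambda s: apertium_translate(s, "kir-eng"))` before and after adding your rules. Passing the translator in as a function keeps the check independent of a local Apertium install, so it can also be tried out with a stand-in translator.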