Difference between revisions of "User:Ilnar.salimzyan/GSoC2014/Application"


Revision as of 19:08, 16 April 2014

You can find my proposal for GSoC 2014 here:

Post-application period

  • work on the 'James and Mary' translation
    • get rid of the debugging symbols
    • get the baseline WER
  • get permission to use one of the modern government-funded Tatar-Russian dictionaries under a free license and digitize it, or fall back to one of the public-domain dictionaries and scan that
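  • read documentation on chunking-based transfer and papers describing other Apertium pairs for distant languages
    • Chunking, Chunking: A full example (both already read), sme-nob paper, eus-eng paper, eng-kaz paper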
  • acceptance tests for an Apertium MT system are: regression tests on the wiki, a corpus test (WER and number of [*@#] errors), and testvoc. Unit testing an Apertium MT system means testing its modules (modes). Figure out how to unit test each module.
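
The two corpus-test numbers mentioned above (WER and the count of [*@#] errors) can be sketched in a few lines of Python. This is only an illustration, not the tooling Apertium itself ships: the function names `wer` and `count_debug_symbols` are made up for this sketch, WER is taken as word-level edit distance divided by the reference length, and the symbols are assumed to follow the usual Apertium convention of prefixing a problematic token (`*` unknown word, `@` failed lexical transfer, `#` failed generation).

```python
import re
from collections import Counter


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length.

    Hypothetical helper for illustration; tokenization is plain whitespace
    splitting, which is an assumption of this sketch.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


def count_debug_symbols(translation: str) -> Counter:
    """Count tokens that start with one of the debugging symbols * @ #.

    Only a symbol at the start of a whitespace-delimited token is counted,
    so a stray '@' inside a word is ignored.
    """
    return Counter(re.findall(r"(?<!\S)[*@#](?=\S)", translation))


if __name__ == "__main__":
    print(wer("James and Mary went home", "James and Mary go home"))
    print(count_debug_symbols("*James and #Mary<np> saw @it"))
```

Running both functions over the same test corpus before and after a change gives a baseline to compare against, which is the purpose of the "get the baseline WER" item above.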