Apertium-kaz-tat/paper


Revision as of 05:24, 18 April 2013

We're submitting a paper on apertium-kaz-tat to MT Summit 2013. DEADLINE: APRIL 20.

TODO

Ideal benchmarks:

  • document rules in the rlx with example sentences
  • grow the disambiguation rule set in -kaz from ~40 rules to about 100-150

Ilnar

  • Development corpus (lots and lots of text)
    • Work on increasing coverage (via lexc) and trimmed coverage (via dix) to 90%
    • Work on making sure testvoc passes
    • Add rules: disambiguation (CG), lexical selection, and transfer.
  • Test corpus (about 10 pages; don't base rules on this text!)
  • Paper
    • Add affiliation to paper
    • Help JNW come up with some more contrastive stuff (see below / FIXME: Ilnars in paper)
    • Find some exemplary bidix entries for figure 2.
    • New example for table 3 (maybe Kazakh equivalent of original sentence)

Fran

  • Delegate out error-rate testing tasks
  • new version of Table 2

JNW

  • Work on last few issues in -tat twol
  • Write up background
  • Contrastive analysis of Kazakh and Tatar
    • phonological differences (a generalised summary, 2 or 3 small specific examples)
    • orthographical differences (a generalised summary, 1 or 2 small specific examples)
    • lexical and morphological differences (2 or 3 specific examples)
    • morphotactic differences (2 or 3 specific examples)
    • syntactic differences (2 or 3 specific examples)
  • Coverage stuff
    • divide corpora into 10 pieces and run coverage for each to get stddev
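A rough sketch of the "10 pieces" coverage task, in case it helps whoever picks it up. It assumes the corpus has already been run through the analyser and is in the Apertium stream format, where an unknown word comes out as ^word/*word$. The function names (token_known_flags, split_into_pieces, coverage_stats) are made up for this note, and escaped characters inside lexical units are ignored, so treat it as a starting point rather than the actual evaluation script.

```python
import re
import statistics

# Apertium stream format: a lexical unit looks like ^surface/analysis1/analysis2$;
# an unknown word is emitted as ^surface/*surface$.
# Assumption: escaped characters such as \/ or \$ inside a unit are not handled here.
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def token_known_flags(analysed_text):
    """Return one boolean per lexical unit: True if it received an analysis."""
    return [
        not match.group(2).startswith("*")
        for match in LEXICAL_UNIT.finditer(analysed_text)
    ]


def split_into_pieces(items, n_pieces=10):
    """Split a list into n_pieces contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), n_pieces)
    pieces, start = [], 0
    for i in range(n_pieces):
        end = start + size + (1 if i < extra else 0)
        pieces.append(items[start:end])
        start = end
    return pieces


def coverage_stats(analysed_text, n_pieces=10):
    """Return (mean, sample stddev, per-piece coverages) over n_pieces chunks."""
    flags = token_known_flags(analysed_text)
    if len(flags) < n_pieces:
        raise ValueError("corpus has fewer tokens than pieces")
    coverages = [sum(piece) / len(piece) for piece in split_into_pieces(flags, n_pieces)]
    return statistics.mean(coverages), statistics.stdev(coverages), coverages
```

Feeding it the analysed text of each corpus gives a mean coverage and a standard deviation that could be reported side by side. Splitting by token count (rather than by line or by document) is a choice made here for simplicity.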

Over-all

Crossed off: 1, 2, 3, 3.1, 3.2, 3.3, 4, 4.1, 4.2, References
Still open: 3.4, 4.3, 4.4, 4.5, 5, 5.1, 6, Acknowledgements