Difference between revisions of "User:Fpetkovski"

Revision as of 12:36, 27 July 2012

Try generating a training corpus from a monolingual SL (source-language) corpus:
- Оваа лабавост на регулативите се одразува врз третманот на уапсените корисници на дрога. (English: "This laxity in the regulations is reflected in the treatment of arrested drug users.")
  - Run through lexical transfer mk-en-biltrans
  - Run through apertium-lex-tools/scripts/biltrans-to-multitrans.py
  - Run through the rest of the pipeline from apertium-transfer -b onwards
  - Run through apertium-lex-learner/irstlm-ranker
- This will give:
  - SL:TL selection possibilities
  - probabilities from the TL language model for each selection
- Select a subset for training where one translation has a substantially higher proportion of the probability mass than the rest.
- Work out how to decide what "substantially" should mean, i.e. which probability-mass threshold to use.
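
The selection step above can be sketched roughly as follows. This is a minimal illustration, not the actual apertium-lex-learner code: the input format (a list of SL items, each with candidate TL translations and a language-model probability) is an assumption, as are the function names `dominant_share`, `select_training` and `retained_per_threshold`. If the ranker emits log-probabilities rather than probabilities, they would need to be exponentiated first.

```python
def dominant_share(probs):
    """Return (index of the best candidate, its share of the total mass).

    Returns (None, 0.0) when there is no usable probability mass.
    """
    total = sum(probs)
    if total <= 0:
        return None, 0.0
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best, probs[best] / total


def select_training(examples, threshold):
    """Keep examples whose best translation holds at least `threshold`
    of the probability mass.

    `examples` is assumed to be a list of (sl_item, candidates) pairs,
    where candidates is a list of (tl_translation, probability) pairs.
    """
    selected = []
    for sl_item, candidates in examples:
        probs = [p for _, p in candidates]
        best, share = dominant_share(probs)
        if best is not None and share >= threshold:
            selected.append((sl_item, candidates[best][0], share))
    return selected


def retained_per_threshold(examples, thresholds):
    """Count how many examples survive at each candidate threshold."""
    return {t: len(select_training(examples, t)) for t in thresholds}
```

One possible way to approach the "substantially" question is to call `retained_per_threshold` over a range of thresholds, see how much training data each one keeps, and then compare the resulting models on a held-out dev set.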

Improve the current method:
- Split the test corpus in two (dev, test)
  - Rerun the experiments and evaluate on the test corpus
  - Look at dev corpus to see what kind of patterns there are in lines that aren't getting matched
- Look at combining the 1-feature and 2-feature models, using the 1-feature model as a backoff.
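
The backoff idea in the last item could look something like the sketch below. The notes do not say what the features are or how the models are stored, so everything here is an assumption: each training example is taken to be an SL word, a pair of context features and the chosen TL translation, both models are plain count tables, and the names `train_counts` and `choose_translation` are hypothetical.

```python
from collections import Counter, defaultdict


def train_counts(examples):
    """Build count tables for a 2-feature and a 1-feature model.

    `examples` is assumed to be an iterable of
    (sl_word, (feature1, feature2), tl_word) triples.
    """
    model2 = defaultdict(Counter)  # keyed on (sl_word, (f1, f2))
    model1 = defaultdict(Counter)  # keyed on (sl_word, f)
    for sl_word, features, tl_word in examples:
        model2[(sl_word, tuple(features))][tl_word] += 1
        for feature in features:
            model1[(sl_word, feature)][tl_word] += 1
    return model2, model1


def choose_translation(sl_word, features, model2, model1, default=None):
    """Prefer the 2-feature model; back off to the 1-feature model."""
    key2 = (sl_word, tuple(features))
    if key2 in model2:
        return model2[key2].most_common(1)[0][0]
    # Back off: try each feature on its own, in the order given.
    for feature in features:
        key1 = (sl_word, feature)
        if key1 in model1:
            return model1[key1].most_common(1)[0][0]
    # Neither model has seen this context.
    return default
```

The more specific model is consulted first because it should be more precise when it fires; the 1-feature model only fills in contexts the 2-feature model has never seen, which may help with the unmatched lines in the dev corpus.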

@@ Line 10: / Line 10: @@
 *** Run through lexical transfer <code>mk-en-biltrans</code>
 *** Run through <code>apertium-lex-tools/scripts/biltrans-to-multitrans.py</code>
+*** Run through the rest of the pipeline from <code>apertium-transfer -b</code> onwards
 *** Run through <code>apertium-lex-learner/irstlm-ranker</code>
 ** This will give: