Difference between revisions of "User:Fpetkovski"

Revision as of 19:20, 1 August 2012

Reports

Lexical feature transfer - First report

Lexical feature transfer - Second report

TODO

Try generating corpus from monolingual SL corpus:
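- Оваа лабавост на регулативите се одразува врз третманот на уапсените корисници на дрога. (English: "This laxity of the regulations is reflected in the treatment of arrested drug users.")
  - ~~Run through lexical transfer mk-en-biltrans~~
  - ~~Run through apertium-lex-tools/scripts/biltrans-to-multitrans.py~~
  - ~~Run through the rest of the pipeline from apertium-transfer -b onwards~~
  - Run through apertium-lex-learner/irstlm-ranker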
- This will give:
  - SL:TL selection possibilities
  - probabilities from the TL language model for each selection
- Select a subset for training where one translation has a substantially higher proportion of the probability mass than the rest.
- Work out how to decide what "substantially" should mean, that is, which threshold to use.
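
The selection step in the list above can be sketched as follows. This is a minimal illustration, not the actual `irstlm-ranker` output handling: it assumes the ranker yields, for each sentence, a list of candidate translations with natural-log language-model scores. The function names `probability_mass` and `select_for_training`, the input layout, and the threshold of 0.7 are all hypothetical.

```python
import math


def probability_mass(log_scores):
    """Normalise natural-log scores into a distribution that sums to 1."""
    top = max(log_scores)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(s - top) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]


def select_for_training(candidates, threshold=0.7):
    """Keep sentences whose best candidate holds at least `threshold` of the mass.

    `candidates` maps a sentence id to a list of (translation, log_score)
    pairs (assumed layout). Returns {sentence_id: (best_translation, mass)}.
    """
    selected = {}
    for sent_id, options in candidates.items():
        if not options:
            continue
        masses = probability_mass([score for _, score in options])
        best = max(range(len(options)), key=lambda i: masses[i])
        if masses[best] >= threshold:
            selected[sent_id] = (options[best][0], masses[best])
    return selected


# Hypothetical scores: sentence 1 has a clear winner, sentence 2 does not.
example = {
    1: [("drug users", -2.0), ("drug consumers", -6.0)],
    2: [("treatment", -3.0), ("handling", -3.1)],
}
selected = select_for_training(example)
```

One way to approach the open question of what "substantially" should be is to sweep `threshold` over a range of values and compare the resulting systems on a held-out dev set.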

Improve current method:

- ~~Split test corpus in two (dev, test)~~
  - ~~Rerun the experiments and check with test corpus~~
  - ~~Look at dev corpus to see what kind of patterns there are in lines that aren't getting matched~~
- ~~Look at combining the 1-feature with the 2-feature model as backoff.~~
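
The backoff idea in the last item can be sketched roughly as below. The page does not describe how the 1-feature and 2-feature models are stored, so this assumes, purely for illustration, that each model is a dictionary keyed by the source word plus one or two context features. The function `choose_translation` and all sample entries are hypothetical placeholders.

```python
def choose_translation(sl_word, context, two_feature, one_feature, default):
    """Pick a translation for `sl_word`, backing off from specific to general.

    Returns a (translation, source) pair, where `source` names the model
    that produced the decision.
    """
    ctx = list(context)
    # 1. Most specific: any ordered pair of context features seen in training.
    for i in range(len(ctx)):
        for j in range(i + 1, len(ctx)):
            key = (sl_word, ctx[i], ctx[j])
            if key in two_feature:
                return two_feature[key], "2-feature"
    # 2. Back off to a single context feature.
    for feat in ctx:
        key = (sl_word, feat)
        if key in one_feature:
            return one_feature[key], "1-feature"
    # 3. Nothing matched: fall back to the default translation.
    return default[sl_word], "default"


# Placeholder models (not real lexical data).
two_feature = {("w", "f1", "f2"): "t_specific"}
one_feature = {("w", "f3"): "t_general"}
default = {"w": "t_default"}

r_two = choose_translation("w", ["f1", "f2"], two_feature, one_feature, default)
r_one = choose_translation("w", ["f3", "f9"], two_feature, one_feature, default)
r_def = choose_translation("w", ["f8"], two_feature, one_feature, default)
```

Returning the name of the model that fired makes it easy to count, on the dev corpus, how many lines are handled by each level and how many fall through to the default.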

Evaluation
- Try paired bootstrap resampling between the best system and the default translation for both WER and BLEU.
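
A minimal sketch of paired bootstrap resampling, assuming per-sentence scores are already available for both systems on the same test sentences. The function name `paired_bootstrap` and its parameters are illustrative. For WER, the per-sentence scores can be edit counts: because both systems share the same references, comparing summed edit counts on a resample is equivalent to comparing corpus-level WER. BLEU is a corpus-level metric, so a faithful version would recompute BLEU from the resampled sentences' n-gram statistics rather than summing sentence scores.

```python
import random


def paired_bootstrap(scores_a, scores_b, n_samples=1000, seed=0,
                     lower_is_better=False):
    """Return the fraction of resamples in which system A beats system B.

    `scores_a[i]` and `scores_b[i]` must refer to the same test sentence.
    Each resample draws sentence indices with replacement and applies the
    same indices to both systems, which is what makes the test paired.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same sentences")
    rng = random.Random(seed)
    n = len(scores_a)
    wins = 0
    for _ in range(n_samples):
        idx = [rng.randrange(n) for _ in range(n)]
        total_a = sum(scores_a[i] for i in idx)
        total_b = sum(scores_b[i] for i in idx)
        if lower_is_better:
            wins += total_a < total_b
        else:
            wins += total_a > total_b
    return wins / n_samples


# Hypothetical per-sentence edit counts (lower is better, as for WER).
best_system = [0, 0, 0, 0]
default_system = [1, 1, 1, 1]
win_rate = paired_bootstrap(best_system, default_system, lower_is_better=True)
```

A win rate of at least 0.95 is commonly read as the first system being better at the 95% confidence level.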

@@ Line 8: / Line 8: @@
 * '''Try generating corpus from monolingual SL corpus:'''
 ** Оваа лабавост на регулативите се одразува врз третманот на уапсените корисници на дрога.
-*** Run through lexical transfer <code>mk-en-biltrans</code>
+<s>*** Run through lexical transfer <code>mk-en-biltrans</code>
 *** Run through <code>apertium-lex-tools/scripts/biltrans-to-multitrans.py</code>
-*** Run through the rest of the pipeline from <code>apertium-transfer -b</code> onwards
+*** Run through the rest of the pipeline from <code>apertium-transfer -b</code> onwards</s>
 *** Run through <code>apertium-lex-learner/irstlm-ranker</code>
 ** This will give:
@@ Line 19: / Line 19: @@
 * '''Improve current method:'''
-** Split test corpus in two (dev, test)
+<s>** Split test corpus in two (dev, test)
 *** Rerun the experiments and check with test corpus
 *** Look at dev corpus to see what kind of patterns there are in lines that aren't getting matched
 ** Look at combining the 1-feature with the 2-feature model as backoff.
+</s>
 * '''Evaluation'''
 ** Try pair bootstrap resampling between best system and default translation for both WER and BLEU.
