Difference between revisions of "User:Fpetkovski"
Revision as of 19:20, 1 August 2012
Reports
TODO
- Try generating corpus from monolingual SL corpus:
  - Оваа лабавост на регулативите се одразува врз третманот на уапсените корисници на дрога. (English: "This laxity in the regulations is reflected in the treatment of arrested drug users.")
    - <s>Run through lexical transfer mk-en-biltrans</s>
    - <s>Run through apertium-lex-tools/scripts/biltrans-to-multitrans.py</s>
    - <s>Run through the rest of the pipeline from apertium-transfer -b onwards</s>
    - Run through apertium-lex-learner/irstlm-ranker
  - This will give:
    - SL:TL selection possibilities
    - probabilities from the TL language model for each selection
  - Select a subset for training where one translation has a substantially higher proportion of the probability mass than the rest.
    - Work out how to decide what "substantially" should be.
- Improve current method:
  - <s>Split test corpus in two (dev, test)</s>
    - <s>Rerun the experiments and check with test corpus</s>
    - <s>Look at dev corpus to see what kind of patterns there are in lines that aren't getting matched</s>
  - <s>Look at combining the 1-feature with the 2-feature model as backoff.</s>
- Evaluation
  - Try paired bootstrap resampling between best system and default translation for both WER and BLEU.
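
The Evaluation item could be approached roughly as follows. This is only a minimal sketch of paired bootstrap resampling, not the procedure actually used here: the function names (`edit_distance`, `corpus_wer`, `paired_bootstrap`), the default of 1000 resamples, and the fixed seed are all assumptions made for illustration. It takes the outputs of two systems on the same test sentences, repeatedly resamples sentence indices with replacement, and reports the fraction of resamples in which the first system scores better.

```python
import random


def edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        cur = [i]
        for j, r in enumerate(ref, start=1):
            cur.append(min(prev[j] + 1,               # drop hypothesis word h
                           cur[j - 1] + 1,            # add reference word r
                           prev[j - 1] + (h != r)))   # substitute or match
        prev = cur
    return prev[-1]


def corpus_wer(hyps, refs):
    """Corpus-level WER: total word edits divided by total reference words."""
    errors = sum(edit_distance(h.split(), r.split()) for h, r in zip(hyps, refs))
    words = sum(len(r.split()) for r in refs)
    return errors / words if words else 0.0


def paired_bootstrap(sys_a, sys_b, refs, metric,
                     lower_is_better=True, n_samples=1000, seed=0):
    """Fraction of resampled test sets on which sys_a beats sys_b.

    Both systems are scored on the same resampled sentence indices,
    which is what makes the comparison paired.
    """
    assert len(sys_a) == len(sys_b) == len(refs)
    rng = random.Random(seed)
    n = len(refs)
    wins_a = 0
    for _ in range(n_samples):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        sample_refs = [refs[i] for i in idx]
        score_a = metric([sys_a[i] for i in idx], sample_refs)
        score_b = metric([sys_b[i] for i in idx], sample_refs)
        a_better = score_a < score_b if lower_is_better else score_a > score_b
        if a_better:
            wins_a += 1
    return wins_a / n_samples


if __name__ == "__main__":
    # Toy data, invented purely to show the call shape.
    refs = ["the cat sat", "a dog ran", "birds fly high"]
    best = ["the cat sat", "a dog ran", "birds fly low"]
    default = ["the cat sits", "a dog run", "bird fly low"]
    print(paired_bootstrap(best, default, refs, corpus_wer))
```

For BLEU, the same function could be called with any corpus-level BLEU implementation passed as `metric` and `lower_is_better=False`. A win fraction of 0.95 or more is the usual reading of "better at the 95% level", though that threshold is a convention and not something fixed by this page.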