Difference between revisions of "User:Fpetkovski"
Fpetkovski (talk | contribs) |
Fpetkovski (talk | contribs) (Replaced content with 'GSOC 2013 Application - Improving the lexical selection module GSOC 2012 Report') |
Line 1: | Line 1:
[[GSOC 2013 Application - Improving the lexical selection module]]
− | ==Documentation / HOWTO==
− | * [[Corpus based preposition selection - HOWTO]]
− | * [[Building a pseudo-parallel corpus]]
− | ==Reports==
− | * [[Lexical feature transfer - First report]]
− |
− | * [[Lexical feature transfer - Second report]]
− |
− | ==TODO==
− |
− | * '''Try generating a corpus from a monolingual SL corpus:'''
− | ** Оваа лабавост на регулативите се одразува врз третманот на уапсените корисници на дрога. (English: "This laxity of the regulations is reflected in the treatment of arrested drug users.")
− | <s>*** Run through lexical transfer <code>mk-en-biltrans</code>
− | *** Run through <code>apertium-lex-tools/scripts/biltrans-to-multitrans.py</code>
− | *** Run through the rest of the pipeline from <code>apertium-transfer -b</code> onwards
− | *** Run through <code>apertium-lex-learner/irstlm-ranker</code></s>
− | ** This will give:
− | <s>*** SL:TL selection possibilities
− | *** probabilities from the TL language model for each selection</s>
− | ** Select a subset for training where one translation has a substantially higher proportion of the probability mass than the rest.
− | ** Work out what threshold "substantially" should correspond to.
− | |||
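The subset selection described in the TODO item above could be sketched roughly as follows. This is only an illustration under stated assumptions: the input layout (a dict mapping each SL sentence to a list of scored TL candidates), the function name <code>select_confident</code>, and the default threshold of 0.8 are all hypothetical, and the scores are assumed to be non-negative probabilities rather than log-probabilities. It does not reflect the actual output format of <code>irstlm-ranker</code>.

```python
from typing import Dict, List, Tuple


def select_confident(
    candidates: Dict[str, List[Tuple[str, float]]],
    min_share: float = 0.8,  # hypothetical value for "substantially"
) -> Dict[str, str]:
    """Keep SL sentences whose best TL candidate holds at least
    `min_share` of the total probability mass over all candidates."""
    selected: Dict[str, str] = {}
    for sl_sentence, scored in candidates.items():
        total = sum(score for _, score in scored)
        if total <= 0:
            # No usable scores for this sentence; skip it.
            continue
        best_tl, best_score = max(scored, key=lambda pair: pair[1])
        if best_score / total >= min_share:
            selected[sl_sentence] = best_tl
    return selected
```

Sweeping <code>min_share</code> over a range of values and checking the size and quality of the resulting training subset on held-out data would be one way to approach the open question of what "substantially" should mean.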
− | * '''Improve current method:'''
− | <s>** Split test corpus in two (dev, test)
− | *** Rerun the experiments and check with test corpus
− | *** Look at dev corpus to see what kind of patterns there are in lines that aren't getting matched
− | ** Look at combining the 1-feature with the 2-feature model as backoff.
− | </s>
− | * '''Evaluation'''
− | ** Try paired bootstrap resampling between the best system and the default translation for both WER and BLEU.
− | |||
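A minimal sketch of the paired bootstrap resampling mentioned in the evaluation item above, assuming per-sentence scores are already available for both systems on the same test sentences. The function name <code>paired_bootstrap</code> and all parameters are hypothetical. Note that WER is lower-is-better, and that corpus-level BLEU is not an average of sentence-level scores, so for BLEU the <code>aggregate</code> argument would have to be replaced by something that recomputes the corpus score from the resampled sentences.

```python
import random
from statistics import mean
from typing import Callable, Sequence


def paired_bootstrap(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    n_resamples: int = 1000,
    higher_is_better: bool = True,
    aggregate: Callable[[Sequence[float]], float] = mean,
    seed: int = 0,
) -> float:
    """Return the fraction of resamples in which system A beats system B.

    Both systems are scored on the same resampled sentence indices,
    which is what makes the test "paired". Ties count as non-wins.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same sentences")
    n = len(scores_a)
    if n == 0:
        raise ValueError("need at least one sentence")
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_resamples):
        # Sample sentence indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        a = aggregate([scores_a[i] for i in idx])
        b = aggregate([scores_b[i] for i in idx])
        if (a > b) if higher_is_better else (a < b):
            wins += 1
    return wins / n_resamples
```

For WER, call it with <code>higher_is_better=False</code>. A win fraction of, say, 0.95 or more is commonly read as the difference being significant at roughly that confidence level.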
− | * '''Check the bidix entries that were added automatically'''
− |
− | [[Category:Users|Fpetkovski]] |