Difference between revisions of "User:Fpetkovski"
[[GSOC 2013
[[/GSOC 2013 Report]]
==Documentation / HOWTO==
* [[Corpus based preposition selection - HOWTO]]
* [[Building a pseudo-parallel corpus]]

==Reports==
* [[Lexical feature transfer - First report]]

[[/GSOC 2013 Application - Improving the lexical selection module]]

* [[Lexical feature transfer - Second report]]

==TODO==
* '''Try generating corpus from monolingual SL corpus:'''
** Оваа лабавост на регулативите се одразува врз третманот на уапсените корисници на дрога. (English: "This laxity of the regulations is reflected in the treatment of arrested drug users.")
*** <s>Run through lexical transfer <code>mk-en-biltrans</code></s>
*** <s>Run through <code>apertium-lex-tools/scripts/biltrans-to-multitrans.py</code></s>
*** <s>Run through the rest of the pipeline from <code>apertium-transfer -b</code> onwards</s>
*** <s>Run through <code>apertium-lex-learner/irstlm-ranker</code></s>
** This will give:
*** <s>SL:TL selection possibilities</s>
*** <s>probabilities from the TL language model for each selection</s>
** Select a subset for training where one translation has a substantially higher share of the probability mass than the rest.
** Work out what threshold "substantially" should correspond to.
* '''Improve current method:'''
** <s>Split test corpus in two (dev, test)</s>
*** <s>Rerun the experiments and check with test corpus</s>
*** <s>Look at dev corpus to see what kind of patterns there are in lines that aren't getting matched</s>
** <s>Look at combining the 1-feature with the 2-feature model as backoff.</s>
* '''Evaluation'''
** Try paired bootstrap resampling between the best system and the default translation for both WER and BLEU.
* '''Check the bidix entries that were added automatically'''
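
The subset selection from the first TODO item could be sketched roughly as follows. This is only an illustration, not the project's code: the input format (a dict mapping each candidate TL translation to a non-negative language-model score), the function names <code>normalise</code> and <code>select_confident</code>, the example scores and the threshold of 0.7 are all assumptions made here.

```python
def normalise(scores):
    """Turn non-negative LM scores into fractions of the total probability mass."""
    total = sum(scores.values())
    if total <= 0:
        return {}
    return {tl: s / total for tl, s in scores.items()}


def select_confident(candidates, threshold=0.7):
    """Return (best_translation, share) if one candidate holds at least
    `threshold` of the probability mass, otherwise None.

    `candidates` maps each TL candidate to a non-negative score
    (hypothetical format; the real ranker output may differ).
    """
    shares = normalise(candidates)
    if not shares:
        return None
    best = max(shares, key=shares.get)
    if shares[best] >= threshold:
        return best, shares[best]
    return None


# Made-up scores for two ambiguous SL words, for illustration only.
examples = {
    "word_a": {"treatment": 0.8, "processing": 0.1, "handling": 0.1},
    "word_b": {"laxity": 0.4, "looseness": 0.35, "slackness": 0.25},
}
selected = {sl: select_confident(c) for sl, c in examples.items()}
# Keep only the words where one translation clearly dominates.
training = {sl: r for sl, r in selected.items() if r is not None}
```

One possible way to decide what "substantially" means would be to sweep <code>threshold</code> over a few values and compare the resulting systems on the dev corpus.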
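
The paired bootstrap resampling mentioned under Evaluation could look something like the sketch below, shown for WER. It assumes per-sentence statistics are already available as (edit count, reference length) pairs; the function names and the numbers are invented for illustration and do not come from the experiments.

```python
import random


def corpus_wer(stats):
    """Corpus WER from per-sentence (edit_count, reference_length) pairs."""
    edits = sum(e for e, _ in stats)
    words = sum(n for _, n in stats)
    return edits / words


def paired_bootstrap(stats_a, stats_b, metric, samples=1000, seed=0):
    """Fraction of resampled test sets on which system A scores lower than B.

    Both systems are scored on the same resampled sentence indices, which is
    what makes the comparison paired. Lower is treated as better (as for WER).
    """
    assert len(stats_a) == len(stats_b)
    rng = random.Random(seed)
    n = len(stats_a)
    wins = 0
    for _ in range(samples):
        # Draw n sentence indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        if metric([stats_a[i] for i in idx]) < metric([stats_b[i] for i in idx]):
            wins += 1
    return wins / samples


# Made-up per-sentence statistics, for illustration only.
best_system = [(1, 10), (0, 8), (2, 12), (1, 9), (0, 7)]
default_system = [(2, 10), (1, 8), (3, 12), (2, 9), (1, 7)]
win_rate = paired_bootstrap(best_system, default_system, corpus_wer)
```

For BLEU the same resampling loop would apply, but higher scores are better, so the comparison would need to be flipped and the metric computed from per-sentence n-gram counts instead of edit counts.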
[[Category:Users|Fpetkovski]]