Difference between revisions of "User:Fpetkovski"
Fpetkovski (talk | contribs) |
Revision as of 12:34, 27 July 2012
Reports
TODO
- Try generating corpus from monolingual SL corpus:
  - Оваа лабавост на регулативите се одразува врз третманот на уапсените корисници на дрога. (English: "This laxity in the regulations is reflected in the treatment of arrested drug users.")
    - Run through lexical transfer (-biltrans)
    - Run through apertium-lex-tools/scripts/biltrans-to-multitrans.py
    - Run through apertium-lex-learner/irstlm-ranker
  - This will give:
    - SL:TL selection possibilities
    - probabilities from the TL language model for each selection
  - Select a subset for training where one translation has a substantially higher proportion of the probability mass than the rest.
  - Work out what threshold "substantially" should correspond to.
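The subset selection described above could be sketched roughly as follows. This is only an illustration, not the actual apertium-lex-learner code: it assumes the ranker output has already been parsed into (translation, log-probability) pairs per sentence, and the names `probability_shares`, `select_for_training` and the default threshold of 0.8 are hypothetical. Choosing the real threshold is exactly the open question in the last item.

```python
import math


def probability_shares(log_probs):
    """Normalise natural-log probabilities into shares that sum to 1."""
    peak = max(log_probs)  # subtract the maximum for numerical stability
    weights = [math.exp(lp - peak) for lp in log_probs]
    total = sum(weights)
    return [w / total for w in weights]


def select_for_training(candidates, threshold=0.8):
    """Keep sentences where one translation dominates the probability mass.

    candidates: dict mapping a sentence id to a list of
                (translation, log_prob) pairs (hypothetical input format).
    threshold:  hypothetical cut-off for "substantially higher".
    Returns a dict mapping sentence id to (best translation, its share).
    """
    selected = {}
    for sent_id, options in candidates.items():
        if len(options) < 2:
            continue  # only one candidate, so there is nothing to select
        shares = probability_shares([lp for _, lp in options])
        best = max(range(len(options)), key=shares.__getitem__)
        if shares[best] >= threshold:
            selected[sent_id] = (options[best][0], shares[best])
    return selected
```

Sweeping `threshold` over a range and checking how many sentences survive, and how accurate the kept selections are on held-out data, would be one way to approach the question of what "substantially" should mean.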
- Improve current method:
  - Split test corpus in two (dev, test)
  - Rerun the experiments and check with test corpus
  - Look at dev corpus to see what kind of patterns there are in lines that aren't getting matched
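A minimal sketch of the dev/test split mentioned above, assuming the test corpus is available as a list of lines (or aligned line pairs). The function name `split_dev_test`, the 50/50 default and the fixed seed are assumptions made for illustration; the seed only serves to keep the split reproducible between reruns of the experiments.

```python
import random


def split_dev_test(lines, dev_fraction=0.5, seed=0):
    """Shuffle the corpus deterministically and cut it into dev and test parts."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)  # private RNG, global state untouched
    cut = int(len(shuffled) * dev_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Error analysis would then be done on the dev half only, so that the test half stays unseen for the final check.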