Difference between revisions of "User:Fpetkovski"

[[GSOC 2013 Application - Improving the lexical selection module]]
[[/GSOC 2013 Report]]
[[GSOC 2013 Report]]
==Documentation / HOWTO==
* [[Corpus based preposition selection - HOWTO]]
* [[Building a pseudo-parallel corpus]]


==Reports==
* [[Lexical feature transfer - First report]]


[[/GSOC 2013 Application - Improving the lexical selection module]]
* [[Lexical feature transfer - Second report]]


[[/GSOC 2012 Report]]
==TODO==

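* '''Try generating a corpus from a monolingual SL corpus:'''
** Оваа лабавост на регулативите се одразува врз третманот на уапсените корисници на дрога. (English: "This laxity in the regulations is reflected in the treatment of arrested drug users.")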
<s>*** Run through lexical transfer <code>mk-en-biltrans</code>
*** Run through <code>apertium-lex-tools/scripts/biltrans-to-multitrans.py</code>
*** Run through the rest of the pipeline from <code>apertium-transfer -b</code> onwards
*** Run through <code>apertium-lex-learner/irstlm-ranker</code></s>
** This will give:
<s>*** SL:TL selection possibilities
*** probabilities from the TL language model for each selection</s>
** Select a subset for training where one translation has a substantially higher proportion of the probability mass than the rest.
** Work out what threshold "substantially" should correspond to.
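A minimal sketch of the subset selection step, under stated assumptions: the ranker output format is not documented here, so the code assumes each ambiguous SL word comes with a list of <code>(translation, probability)</code> pairs holding non-negative probabilities (not log-probabilities). The function name <code>select_confident</code> and the default threshold of 0.8 are hypothetical; the threshold is exactly the "substantially" value that still has to be tuned.

```python
def select_confident(candidates, threshold=0.8):
    """Return the top translation if it holds at least `threshold` of the
    total probability mass, otherwise None.

    `candidates` is an assumed format: a list of (translation, probability)
    pairs with non-negative probabilities from the TL language model.
    """
    total = sum(score for _, score in candidates)
    if total <= 0:
        # Nothing to normalise against (empty list or all-zero scores).
        return None
    best, best_score = max(candidates, key=lambda c: c[1])
    share = best_score / total  # proportion of the mass held by the winner
    return best if share >= threshold else None


# Hypothetical usage: keep only the instances with a clear winner.
instances = [
    [("drug", 0.9), ("narcotic", 0.1)],
    [("user", 0.5), ("consumer", 0.5)],
]
training = [t for t in map(select_confident, instances) if t is not None]
```

Sweeping <code>threshold</code> over a range and checking results on held-out data would be one way to decide what "substantially" should mean.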

* '''Improve current method:'''
<s>** Split test corpus in two (dev, test)
*** Rerun the experiments and check with test corpus
*** Look at dev corpus to see what kind of patterns there are in lines that aren't getting matched
** Look at combining the 1-feature with the 2-feature model as backoff.
</s>
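A rough sketch of what using the 1-feature model as backoff for the 2-feature model could look like. Everything here is an assumption made for illustration: the notes do not say what the features are or how the models are stored, so the code treats both models as plain dictionaries keyed on the SL word plus one or two hypothetical feature values, with a default translation as the last resort.

```python
def choose_translation(sl_word, features, two_feature, one_feature, default):
    """Pick a translation, backing off from the more specific model.

    `features` is a pair (f1, f2) of hypothetical context feature values.
    `two_feature` maps (sl_word, f1, f2) -> translation,
    `one_feature` maps (sl_word, f1) -> translation,
    `default` maps sl_word -> default translation.
    """
    f1, f2 = features
    key2 = (sl_word, f1, f2)
    if key2 in two_feature:
        return two_feature[key2]      # most specific evidence
    key1 = (sl_word, f1)
    if key1 in one_feature:
        return one_feature[key1]      # back off to a single feature
    return default[sl_word]           # no rule matched


# Hypothetical toy models.
two = {("на", "NOUN", "DET"): "of"}
one = {("на", "VERB"): "to"}
dflt = {"на": "on"}
```

With these toy models, a context matched by the 2-feature table wins, a context seen only in the 1-feature table falls back to it, and anything else gets the default translation.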
* '''Evaluation'''
** Try paired bootstrap resampling between the best system and the default translation, for both WER and BLEU.
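A minimal sketch of paired bootstrap resampling for WER, assuming per-sentence edit counts for each system and per-sentence reference lengths (all positive) have already been computed. The function name and parameters are hypothetical. For BLEU the same resampling loop applies, but the per-sentence n-gram match counts and lengths would have to be summed over each resample before computing the corpus-level score; that part is not shown.

```python
import random


def paired_bootstrap_wer(errors_a, errors_b, ref_lengths, n_samples=1000, seed=0):
    """Return the fraction of resampled test sets on which system A has a
    strictly lower WER than system B.

    errors_a, errors_b: per-sentence edit counts for the two systems.
    ref_lengths: per-sentence reference lengths in words (assumed > 0).
    """
    assert len(errors_a) == len(errors_b) == len(ref_lengths)
    rng = random.Random(seed)
    n = len(ref_lengths)
    wins_a = 0
    for _ in range(n_samples):
        # Draw n sentence indices with replacement; the same indices are
        # used for both systems, which is what makes the test "paired".
        idx = [rng.randrange(n) for _ in range(n)]
        total_len = sum(ref_lengths[i] for i in idx)
        wer_a = sum(errors_a[i] for i in idx) / total_len
        wer_b = sum(errors_b[i] for i in idx) / total_len
        if wer_a < wer_b:
            wins_a += 1
    return wins_a / n_samples
```

A value near 1.0 suggests system A is reliably better on this test set; a value near 0.5 suggests the difference may be noise.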

* '''Check the bidix entries that were added automatically'''

[[Category:Users|Fpetkovski]]

Latest revision as of 10:59, 30 May 2013