Difference between revisions of "User:Fpetkovski/GSOC 2012 Report"

Latest revision as of 11:18, 9 February 2015

Documentation / HOWTO

Reports

Lexical feature transfer - First report

Lexical feature transfer - Second report

TODO

Try generating a corpus from a monolingual SL corpus:
- Оваа лабавост на регулативите се одразува врз третманот на уапсените корисници на дрога. (English: "This laxity of the regulations is reflected in the treatment of arrested drug users.")
  - Run through lexical transfer mk-en-biltrans
  - ~~Run through apertium-lex-learner/irstlm-ranker~~
- This will give:
  - SL:TL selection possibilities
  - ~~probabilities from the TL language model for each selection~~
- Select a subset for training where one translation has a substantially higher proportion of the probability mass than the rest.
- Work out what threshold "substantially" should correspond to.
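The subset selection described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the TL language model scores are taken to be natural-log probabilities, the names `normalise` and `select_training_examples` are hypothetical, the 0.7 threshold is an arbitrary placeholder for "substantially", and the candidate scores are made up.

```python
import math


def normalise(log_probs):
    """Convert TL language-model log-probabilities into shares of probability mass."""
    top = max(log_probs.values())
    # Subtract the maximum before exponentiating to avoid underflow.
    weights = {tl: math.exp(lp - top) for tl, lp in log_probs.items()}
    total = sum(weights.values())
    return {tl: w / total for tl, w in weights.items()}


def select_training_examples(candidates, threshold=0.7):
    """Keep SL items whose best TL option holds at least `threshold` of the mass."""
    selected = []
    for sl, log_probs in candidates:
        shares = normalise(log_probs)
        best_tl, best_share = max(shares.items(), key=lambda kv: kv[1])
        if best_share >= threshold:
            selected.append((sl, best_tl, best_share))
    return selected


# Hypothetical SL items and scores, for illustration only.
candidates = [
    ("sl_word_1", {"tl_a": -10.0, "tl_b": -13.0, "tl_c": -14.0}),  # one clear winner
    ("sl_word_2", {"tl_a": -9.0, "tl_b": -9.2, "tl_c": -9.5}),     # mass is spread out
]
selected = select_training_examples(candidates)
```

Sweeping `threshold` over a range and checking the resulting training-set size and quality on held-out data would be one way to approach the question of what "substantially" should mean.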

Improve current method:
- Split the test corpus in two (dev, test)
- Rerun the experiments and check with the test corpus
- Look at the dev corpus to see what kind of patterns there are in lines that aren't getting matched
- Look at combining the 1-feature with the 2-feature model as backoff.
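One possible reading of the backoff idea above, as a minimal sketch: try the more specific 2-feature model first, fall back to the 1-feature model, and finally to the default translation. The lookup-table representation, the function name `choose_translation`, and all keys and values below are hypothetical; the report does not specify how the models are stored or what the features are.

```python
def choose_translation(sl_word, features, two_feature_model, one_feature_model, default):
    """Pick a TL translation, backing off from the 2-feature to the 1-feature model.

    `features` is assumed to be a pair of context features (for example,
    neighbouring lemmas); this is an assumption made for illustration.
    """
    key2 = (sl_word, features[0], features[1])
    if key2 in two_feature_model:
        return two_feature_model[key2], "2-feature"
    key1 = (sl_word, features[0])
    if key1 in one_feature_model:
        return one_feature_model[key1], "1-feature"
    return default[sl_word], "default"


# Hypothetical models, for illustration only.
two_feature_model = {("sl_prep", "feat_x", "feat_y"): "tl_a"}
one_feature_model = {("sl_prep", "feat_x"): "tl_b"}
default = {"sl_prep": "tl_c"}

exact = choose_translation("sl_prep", ("feat_x", "feat_y"),
                           two_feature_model, one_feature_model, default)
backed_off = choose_translation("sl_prep", ("feat_x", "feat_z"),
                                two_feature_model, one_feature_model, default)
fallback = choose_translation("sl_prep", ("feat_q", "feat_z"),
                              two_feature_model, one_feature_model, default)
```

Returning the name of the model that fired makes it easy to count, on the dev corpus, how often each level of the backoff chain is actually used.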

Evaluation
- Try paired bootstrap resampling between the best system and the default translation for both WER and BLEU.
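A minimal sketch of paired bootstrap resampling for WER, assuming per-sentence edit-error counts and reference lengths are already available for both systems on the same test sentences. The function names and the numbers below are hypothetical. For BLEU the same resampling loop would apply, but the corpus-level score would have to be recomputed from per-sentence n-gram statistics, which is not shown here.

```python
import random


def corpus_wer(stats, indices):
    """Corpus-level WER over the chosen sentences; stats[i] = (errors, ref_words)."""
    errors = sum(stats[i][0] for i in indices)
    ref_words = sum(stats[i][1] for i in indices)
    return errors / ref_words


def paired_bootstrap(stats_a, stats_b, samples=1000, seed=0):
    """Fraction of resampled test sets on which system A has lower WER than B."""
    assert len(stats_a) == len(stats_b), "both systems must cover the same sentences"
    rng = random.Random(seed)
    n = len(stats_a)
    wins = 0
    for _ in range(samples):
        # Draw n sentence indices with replacement; use the same draw for both systems.
        indices = [rng.randrange(n) for _ in range(n)]
        if corpus_wer(stats_a, indices) < corpus_wer(stats_b, indices):
            wins += 1
    return wins / samples


# Hypothetical per-sentence (errors, reference length) pairs, for illustration only.
best_system = [(1, 10), (0, 8), (2, 12)]
default_translation = [(3, 10), (1, 8), (4, 12)]
win_rate = paired_bootstrap(best_system, default_translation)
```

A win rate close to 1.0 (for example, at least 0.95) is the usual informal criterion for calling the difference between the two systems significant.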

Check the bidix entries that were added automatically

@@ Line 2: / Line 2: @@
 * [[Corpus based preposition selection - HOWTO]]
 * [[Building a pseudo-parallel corpus]]
+* [[Ideas for Google Summer of Code/Corpus-based lexicalised feature transfer]]
 ==Reports==
