Muki987: /* Words and expressions */

2009-05-20T07:56:18Z


Muki987: New page: ==Status== Lextor exists, but it is not turned on. It was found that using Lextor did not provide an improvement in translation quality above the 'baseline' of just choosing the most frequ...

2009-05-20T07:55:57Z


==Status==
Lextor exists, but it is not turned on.
It was found that using Lextor did not provide an improvement in
translation quality above the 'baseline' of just choosing the most
frequent or general translation. This is why it is turned off.
==Approaches==
There are many approaches to lexical selection. We are studying them and
hope to implement something in the future, although we don't have
anything concrete planned for now.

I think Felipe was thinking of something based on hierarchical decision
lists, but he might be able to offer a more in-depth reply.

==Words and expressions==
Properly translated words and expressions are the key to
translation quality, so I think the modules that handle
word and expression recognition and translation
are the key to the final quality.

Take, for example, how a human translator handles the word coach.
The word can mean a trainer or a means of travel (a vehicle).
If the context (the sentence) in which coach
appears contains words indicating that the text
is about travelling (horses, motors, trains, way,
and the like), it is likely that we are talking about travel;
if the words indicate sport or a working environment
(training, leading people, success, and the like),
we are talking about a trainer.
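
The cue-word idea above can be sketched in a few lines of Python. This is only an illustration of the reasoning in this comment, not a description of how lextor works; the cue lists and the names `CUES`, `score_senses` and `guess_sense` are made up for the example.

```python
# Hypothetical cue words for the two senses of "coach"; not taken from lextor.
CUES = {
    "vehicle": {"horses", "motors", "trains", "way", "travel", "road"},
    "trainer": {"training", "team", "success", "sport", "players"},
}

def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    return [w.strip(".,;:!?\"'()").lower() for w in text.split()]

def score_senses(words, cues=CUES):
    """Count how many context words match each sense's cue set."""
    return {sense: sum(1 for w in words if w in cue_set)
            for sense, cue_set in cues.items()}

def guess_sense(sentence, cues=CUES):
    """Return the sense with the most cue hits, or None on a tie or no hits."""
    scores = score_senses(tokenize(sentence), cues)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if ranked[0][1] == 0:
        return None  # no cue word found in the sentence
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # the sentence does not favour one sense
    return ranked[0][0]

print(guess_sense("The coach was pulled by four horses along the way."))
print(guess_sense("The coach praised the team after training."))
```

A sentence with no cue words at all yields `None`, which is the case where a wider context is needed.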

If it is not possible to find the meaning using just the
sentence, the whole text must be considered.
If all else fails, then we must fall back to the
statistically most likely meaning.
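
The fallback order described here (sentence first, then the whole text, then the most frequent meaning) could look roughly like the sketch below. The cue lists, the choice of "trainer" as the most frequent sense, and the function names are all assumptions made for illustration; none of this is taken from lextor.

```python
import re

# Hypothetical cue words and default sense, chosen only for this example.
CUES = {
    "vehicle": {"horses", "motors", "trains", "way", "road"},
    "trainer": {"training", "team", "success", "sport"},
}
MOST_FREQUENT = "trainer"  # assumed baseline; a real value would come from corpus counts

def best_sense(text, cues):
    """Return the single sense with the most cue hits in text, else None."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = {s: sum(1 for w in words if w in c) for s, c in cues.items()}
    top = max(counts.values())
    winners = [s for s, n in counts.items() if n == top]
    return winners[0] if top > 0 and len(winners) == 1 else None

def disambiguate(sentence, document, cues=CUES, default=MOST_FREQUENT):
    """Try the sentence, then the whole document, then the default sense."""
    for context in (sentence, document):
        sense = best_sense(context, cues)
        if sense is not None:
            return sense
    return default

doc = "The coach arrived late. The horses were tired from the road."
print(disambiguate("The coach arrived late.", doc))
```

Here the sentence alone gives no evidence, so the decision is made from the whole text; with no evidence anywhere, the assumed default is returned.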

In fact, the above is the only way I can imagine
that works. Therefore I am really curious how lextor
is actually intended to work, using real example words like
coach and real example texts. The better things are
documented, the better the chance that we get a working
solution. The wiki is a very good means of documentation,
and I suggest adding more thorough documentation
on word disambiguation, with understandable step-by-step
examples.

It is possible that lextor's way is viable
and it just needs much larger corpora than the ones we used to
give it for training. Maybe the minimum corpus size
and coverage need to be specified.

You write "run all possible disambiguations".
What does that mean in the case of coach?
Take texts where coach is a trainer and
remember all the words, then take texts where coach is
a means of travel and note all the words.
Finally, take the text to translate and
check all the words in it (at first just the sentence)
to see which version it is more similar to?
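
To make the question concrete, here is a minimal sketch of the procedure as I understand it: collect word counts from example texts for each sense, then see which collection the new sentence resembles more. The tiny example texts, the similarity measure and the names `build_profiles` and `closest_sense` are my own assumptions, not lextor's actual method.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def build_profiles(labelled_texts):
    """Map each sense to a Counter of the words seen in its example texts."""
    return {sense: Counter(w for t in texts for w in tokenize(t))
            for sense, texts in labelled_texts.items()}

def closest_sense(sentence, profiles):
    """Pick the sense whose word profile is most similar to the sentence."""
    words = set(tokenize(sentence))
    scores = {}
    for sense, profile in profiles.items():
        total = sum(profile.values())
        # Words seen with every sense (e.g. "the", "coach") carry no evidence.
        scores[sense] = sum(profile[w] / total for w in words
                            if not all(w in p for p in profiles.values()))
    best = max(scores, key=scores.get)  # on a tie, the first sense listed wins
    return best if scores[best] > 0 else None

# Hypothetical, hand-made training texts; a real system would need large corpora.
examples = {
    "trainer": ["The coach trained the team for the match",
                "Our coach led the players to success"],
    "vehicle": ["The coach drove along the road to the station",
                "Horses pulled the coach through the town"],
}
profiles = build_profiles(examples)
print(closest_sense("The players thanked their coach after the match", profiles))
```

With only four training sentences the profiles are far too sparse to be useful, which is exactly why corpus size and coverage matter.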

However, the above approach needs large aligned
corpora of both languages, and I cannot imagine
anything else that works. Hence my
request to clarify such questions.

Talk:Lextor - Revision history
