User:Muki987/ambiguity

From Apertium
Revision as of 12:04, 20 June 2009 by Francis Tyers (talk | contribs) (New page: ===TYERS: DON'T YOU UNDERSTAND THE WORD PRIVATE?=== ===THIS IS MY PRIVATE PAGE!!! PLEASE DO NO VANDALIZE (EDIT) THIS PAGE=== ===TYERS: DON'T YOU UNDERSTAND THE WORD PRIVATE?=== ===This ...)

TYERS: DON'T YOU UNDERSTAND THE WORD PRIVATE?

THIS IS MY PRIVATE PAGE!!! PLEASE DO NOT VANDALIZE (EDIT) THIS PAGE

This is not the Fickipedia

MAKE A PAGE IN YOUR ADDRESS AREA, IF YOU ARE INTERESTED IN THE SUBJECT

Word ambiguity, where a word of a given type has two or more meanings (for example race as a sporting event and race as a sociological concept), is at present not handled in Apertium.

Some examples of ambiguous words are:

  • race (sociological race or sporting race)
  • spirit (alcohol or morale)
  • coach (trainer or means of transport)
  • dull (boring, as of an event, or blunt, as of a knife)
  • fall (the season, a human's fall, or a waterfall in the landscape)

A suggestion for ambiguity handling:

1. The dictionary contains all meanings of a word, ordered from most probable to least probable.

2. We find the intended meaning by looking at the lemmas in the neighbourhood of the word. For race_match, related words are for example match, bet, run, sport. For race_sociology, related words are for example discrimination, hatred, human. Related words are stored as lower-case lemmas, and matching must also be done on the lemma form.

3. The algorithm looks like this:

  • An ambiguous word is found.
  • If the sentence contains related words, select the corresponding meaning (checking meanings from most to least probable).
  • If the sentence contains no related words, select the most probable meaning.

4. The dictionary looks like this:

  • <word>race__match</word><related_words>match,bet,run,sport</related_words>
  • <word>race__sociology</word><related_words>discrimination,hatred,human</related_words>

5. Obtaining neighbouring words. The words in the same sentence are the first candidates. If any of them are related words, the case is solved. If not, the algorithm checks the words in the previous 10 sentences. Matches there are weighted: the nearer the sentence is to the current one, the higher the weight. If matches are found, the match with the highest weight is used.
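Steps 1 to 4 could be sketched in Python roughly as follows. This is only an illustration under assumptions, not Apertium code: the names `parse_entry` and `select_sense` are hypothetical, the entry syntax is the proposed one from step 4 with the `<word>` tag closed, and the sentence is assumed to be already lemmatised and lower-cased.

```python
import re

# Hypothetical entries in the format proposed in step 4 (with <word> closed),
# listed from most probable to least probable meaning, as step 1 requires.
ENTRIES = [
    "<word>race__match</word><related_words>match,bet,run,sport</related_words>",
    "<word>race__sociology</word><related_words>discrimination,hatred,human</related_words>",
]

ENTRY_RE = re.compile(
    r"<word>(?P<sense>[^<]+)</word>"
    r"<related_words>(?P<related>[^<]*)</related_words>"
)


def parse_entry(line):
    """Return (sense, set of related lower-case lemmas) for one entry."""
    match = ENTRY_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"malformed entry: {line!r}")
    related = {w.strip().lower() for w in match["related"].split(",") if w.strip()}
    return match["sense"], related


def select_sense(senses, sentence_lemmas):
    """Pick the first sense (in probability order) that has a related lemma
    in the sentence; fall back to the most probable sense otherwise."""
    lemmas = {lemma.lower() for lemma in sentence_lemmas}
    for sense, related in senses:
        if related & lemmas:
            return sense
    return senses[0][0]


SENSES = [parse_entry(line) for line in ENTRIES]

print(select_sense(SENSES, ["the", "problem", "of", "discrimination", "base", "on", "race"]))
print(select_sense(SENSES, ["the", "race", "be", "interesting"]))
```

The first call finds discrimination in the sentence and selects race__sociology. The second finds no related word and falls back to race__match, the first entry in the list.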

An example text: This sport car is black. Peter has experienced problems of discrimination. The car race is in Ashton.

In that example the algorithm, using the related words listed above, will decide for race_sociology, since discrimination is the nearest matching candidate. The example illustrates how important it is to select the right related words.
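The look-back from step 5 can be tried on this example with a small Python sketch. Several things here are assumptions. The proposal only says that nearer sentences weigh more, so the weight 1/distance is an arbitrary choice. The function names are hypothetical, and the example text is lemmatised by hand.

```python
# Related lemmas per sense, most probable sense first (taken from the proposal).
SENSES = [
    ("race_match", {"match", "bet", "run", "sport"}),
    ("race_sociology", {"discrimination", "hatred", "human"}),
]

MAX_LOOKBACK = 10  # number of previous sentences to inspect (step 5)


def senses_in(lemmas):
    """Senses (in probability order) that have a related lemma among `lemmas`."""
    found = set(lemmas)
    return [sense for sense, related in SENSES if related & found]


def disambiguate(sentences, index):
    """Choose a sense for the ambiguous word in sentences[index].

    `sentences` is a list of lists of lower-case lemmas.
    """
    # 1. Same sentence first: any hit settles the case.
    hits = senses_in(sentences[index])
    if hits:
        return hits[0]

    # 2. Otherwise weight hits in the previous sentences by closeness.
    #    Assumed weight: 1 / distance (the proposal does not fix a formula).
    best_sense, best_weight = None, 0.0
    for distance in range(1, MAX_LOOKBACK + 1):
        previous = index - distance
        if previous < 0:
            break
        weight = 1.0 / distance
        for sense in senses_in(sentences[previous]):
            if weight > best_weight:
                best_sense, best_weight = sense, weight

    # 3. No related lemma anywhere: default to the most probable sense.
    return best_sense if best_sense is not None else SENSES[0][0]


# The example text, lemmatised by hand for illustration.
TEXT = [
    ["this", "sport", "car", "be", "black"],
    ["peter", "have", "experience", "problem", "of", "discrimination"],
    ["the", "car", "race", "be", "in", "ashton"],
]

print(disambiguate(TEXT, 2))
```

With these related words the sketch prints race_sociology: discrimination is one sentence back (weight 1.0) and sport is two sentences back (weight 0.5). Adding car to the related words of race_match would settle the case inside the sentence itself.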

Test sentences

The race was interesting.
La carrera interesaba.
The career was of interest.

The problem of discrimination based on race continues in many countries.
El problema de la discriminación basada en la carrera continúa en muchos países.
The problem of the discrimination based on the career continues in many countries.

The coach was very engaged.
El entrenador era muy comprometido.
The trainer was very awkward.

They travelled on a nice coach to Italy.
Viajaron en un entrenador guapo a Italia.
They travelled in a good-looking trainer to Italy.

The spirit of the group was bad.
El espíritu del grupo era mal.
The spirit of the group was badly.

They drank one litre of spirit that evening.
Bebieron un litro de espíritus que anochecer.
They drank one litre of spirits [as in soul/ghosts] that to get dark.

The knife was dull.
I found his answers rather dull.
Eva's fall was, after the snake convinced her.
This fall was not very rainy.
Can you see that fall behind the house?

The ones with "fall" don't really make sense Francis Tyers 23:20, 18 June 2009 (UTC)