Difference between revisions of "Automated extraction of lexical resources"

From Apertium

Revision as of 23:41, 31 March 2009

(Thanks to spectie and jimregan for the input)

Some ideas for (semi-)automatically extracting lexical resources from corpora.

Things we want to extract:

  1. Morphological analysers
  2. Constraint rules (sensible ones)
  3. Bilingual dictionaries
  4. Transfer rules


== Morphological resource extraction ==

First, I should state that our main aim is to extract information about the open categories, not the closed ones. Learning everything from scratch would be interesting, but it would probably be counterproductive, if it is possible at all.

So we leave closed-class items such as prepositions and pronouns, along with irregular (and very frequent) verbs like "to be", to be constructed manually, which should be doable. Our focus is instead on the less frequent, but regular and far more numerous, verbs, nouns, adjectives, etc.
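As a rough illustration of this split, the sketch below counts surface forms in a toy corpus and sets aside anything found in a hand-maintained closed-class list, leaving the remaining forms as candidates for automatic extraction. The list contents, the function name `open_class_candidates`, and the sample sentence are all assumptions made for the example, not part of any existing Apertium tool.

```python
from collections import Counter

# Hypothetical hand-maintained list of closed-class and irregular forms.
# In practice this would come from the manually built part of the dictionary.
CLOSED_CLASS = {"the", "a", "and", "of", "in", "to", "it", "is", "was", "be", "are"}


def open_class_candidates(tokens, closed=CLOSED_CLASS):
    """Count alphabetic surface forms, skipping those in the closed list."""
    counts = Counter(t.lower() for t in tokens if t.isalpha())
    return {form: n for form, n in counts.items() if form not in closed}


# Toy corpus, whitespace-tokenised for simplicity.
corpus = "The dog walks in the park and the dogs walked to the house".split()
candidates = open_class_candidates(corpus)
```

Here `candidates` holds only forms such as "dog", "dogs", "walks" and "walked", which are the kind of regular open-class material the extraction would then try to group into paradigms.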