Automated extraction of lexical resources

(Thanks to spectie and jimregan for the input)

Some ideas for (semi-)automatically extracting lexical resources from corpora.

Things we want to extract:

Morphological analysers
Constraint rules (sensible ones)
Bilingual dictionaries
Transfer rules

== Morphological resource extraction

First, I should state that our main aim is to extract information about the open categories, not the closed ones. While it would be interesting to try to learn everything from scratch, it would probably be counter-productive, if it is possible at all.

So we leave things like prepositions, pronouns and irregular (very frequent) verbs such as "to be" to be constructed manually, which should be doable. Our focus is instead on the less frequent, but regular and much more numerous, verbs, nouns, adjectives, etc.
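
As a rough illustration of that split, here is a minimal sketch that counts the word forms in a piece of text while skipping anything on a hand-maintained closed-class list. The list contents, the function name `open_class_candidates` and the sample sentence are all made up for the example; a real list would be written per language, and this is not an existing tool.

```python
import re
from collections import Counter

# Hypothetical, hand-maintained closed-class list (illustrative only).
# In practice this would be compiled manually for each language.
CLOSED_CLASS = {
    "the", "a", "of", "in", "to", "he", "she", "it",
    "be", "is", "are", "was", "were",
}


def open_class_candidates(text, closed_class=CLOSED_CLASS):
    """Count word forms in `text`, skipping those on the closed-class list."""
    # [^\W\d_]+ matches runs of letters (including non-ASCII letters).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(t for t in tokens if t not in closed_class)


counts = open_class_candidates(
    "The dog walks. The dogs walked in the park. She is walking."
)
```

What remains in `counts` (here forms such as "dog", "dogs", "walks", "walked", "walking") is the kind of material the extraction would then work on; the closed-class words never reach it.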
