Difference between revisions of "Turkic lexicon"

From Apertium

Revision as of 03:27, 20 April 2012

Some notes on how to go about making a Turkic lexicon for use in Apertium.

Layout

General points:

  • The lexicon will be kept in a single file with the extension .lexc
  • The file will be laid out in the following order:
    1. The multicharacter symbols
    2. The Root lexicon, pointing to the stem lexicons
    3. The morphotactics (continuation lexica)
    4. The stem lexicons
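
A minimal sketch of a file following this order is given below. It is an illustration only: the tags, the lexicon names Nouns and N1, and the Turkish stem are assumptions chosen for the example, and the plural suffix is written out literally rather than with the archiphonemes and phonological rules a real Turkic lexicon would need for vowel harmony.

```
! 1. The multicharacter symbols
Multichar_Symbols
%<n%>      ! noun (hypothetical tag)
%<pl%>     ! plural (hypothetical tag)

! 2. The Root lexicon, pointing to the stem lexicons
LEXICON Root
Nouns ;

! 3. The morphotactics (continuation lexica)
LEXICON N1
%<n%>: # ;            ! bare noun
%<n%>%<pl%>:lar # ;   ! plural; harmony ignored in this sketch

! 4. The stem lexicons
LEXICON Nouns
kitap:kitap N1 ;      ! "book"
```

Each entry has the form upper:lower followed by the name of the continuation lexicon, and # marks the end of a word. Comments start with !.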

Multicharacter symbols

Morphotactics

Naming continuation lexica

  • Names of continuation lexica are written in upper case and may contain only letters, digits and the hyphen (-).
    • Examples: LEXICON N1, LEXICON DET-DEM, LEXICON ADV
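
As a rough illustration of the convention, the fragment below uses two of the names above as continuation classes. The tags, the stems and the mixed-case names of the stem lexicons (Determiners, Adverbs) are assumptions made for this sketch, not part of the convention described here.

```
Multichar_Symbols
%<det%>    ! determiner (hypothetical tag)
%<dem%>    ! demonstrative (hypothetical tag)
%<adv%>    ! adverb (hypothetical tag)

LEXICON Root
Determiners ;
Adverbs ;

! Continuation lexica: upper case, letters, digits and hyphen only
LEXICON DET-DEM
%<det%>%<dem%>: # ;

LEXICON ADV
%<adv%>: # ;

! Stem lexicons point to the continuation lexica by name
LEXICON Determiners
bu:bu DET-DEM ;       ! "this"

LEXICON Adverbs
şimdi:şimdi ADV ;     ! "now"
```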

Stem lexicons

Categorisation

Nominals

Finite verbs

Non-finite verbs

Language specific issues