Apertium-specific conventions for lexc
Preferred format for stem definitions
The preferred format for stem definitions is
underlying:surface CLASS ; ! "gloss", with optional following conditions, for example:
бул:бу DET-DEM ; ! "this" ! Dir/LR
We use %> as a morpheme boundary indicator in lexc.
There are a few special conditions we use:
! Dir/LR,
! Dir/RL, and
! Use/MT.
These allow us to grep out lines, so that we can build different right-to-left and left-to-right transducers, as well as separate MT-specific and vanilla transducers. Otherwise lexc simply interprets these as comments.
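The filtering step can be sketched in Python as a rough equivalent of grepping lines out. This is only an illustration, not the project's actual build rule: the function name drop_marked and the placeholder entries are hypothetical, and which markers get dropped for which transducer is an assumption that depends on the language pair's build setup.

```python
import re

# Matches a condition comment such as "! Dir/LR", "! Dir/RL" or "! Use/MT".
CONDITION = re.compile(r"!\s*(Dir/LR|Dir/RL|Use/MT)\b")


def drop_marked(lines, markers):
    """Keep only the lines that carry none of the given condition markers."""
    markers = set(markers)
    return [
        line for line in lines
        if markers.isdisjoint(CONDITION.findall(line))
    ]


# The first entry is the example from this page; the other three are
# hypothetical placeholders, not real stems.
entries = [
    'бул:бу DET-DEM ; ! "this" ! Dir/LR',
    'aaa:aaa N1 ; ! "placeholder" ! Dir/RL',
    'bbb:bbb N1 ; ! "placeholder" ! Use/MT',
    'ccc:ccc N1 ; ! "placeholder"',
]

# Assumption: one build drops the Dir/LR lines, another drops the Dir/RL
# lines, and the vanilla (non-MT) build drops the Use/MT lines.
without_lr = drop_marked(entries, {"Dir/LR"})
without_rl = drop_marked(entries, {"Dir/RL"})
vanilla = drop_marked(entries, {"Use/MT"})
```

Because the conditions are ordinary lexc comments, the filtered output is still valid lexc and can be passed to the compiler unchanged.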
We also use the comment
! TOCHECK to indicate that a stem needs to be verified for accuracy, spelling, classification, etc.
Bracketed multi-character symbols
We define certain types of multi-character symbols for various purposes.
Tags are defined with less-than and greater-than signs (< and >).
Archiphonemes are defined with curly braces ({ and }).
Features are defined with square brackets ([ and ]).
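As a rough illustration of the three bracket conventions, the following Python sketch classifies a symbol by its delimiters. The function name symbol_kind is hypothetical, and the symbols in the usage lines are made up for the example rather than taken from any real lexc file. It also assumes the symbols are given in plain form, without the % escapes that lexc requires for special characters.

```python
import re

# One pattern per convention: <...> for tags, {...} for archiphonemes,
# [...] for features. Symbols are assumed to be in plain, unescaped form.
KINDS = {
    "tag": re.compile(r"<[^<>]+>"),
    "archiphoneme": re.compile(r"\{[^{}]+\}"),
    "feature": re.compile(r"\[[^\[\]]+\]"),
}


def symbol_kind(symbol):
    """Return the kind of a bracketed symbol, or None if it fits no convention."""
    for kind, pattern in KINDS.items():
        if pattern.fullmatch(symbol):
            return kind
    return None


# Hypothetical symbols, for illustration only.
print(symbol_kind("<n>"))     # tag
print(symbol_kind("{A}"))     # archiphoneme
print(symbol_kind("[feat]"))  # feature
```

A check like this could be run over a list of multi-character symbols to spot any that follow none of the three conventions.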
Syntax highlighting and folding in vim
If you want lexc syntax highlighting and/or folding in vim, you can get the latest version of the lexc vim plugin at this GitHub address. Feel free to fork, add features, and submit pull requests :)
Some other options are listed on the vim page.