Difference between revisions of "Apertium-specific conventions for lexc"
Firespeaker moved page "Lexc for apertium" to "Apertium-specific conventions for lexc"
Revision as of 22:46, 28 September 2014
For Apertium, we use lexc for certain transducers. There are some Apertium-specific conventions we employ, outlined below.
Preferred format for stem definitions
The preferred format for stem definitions is underlying:surface CLASS ; ! "gloss", optionally followed by conditions, for example:
бул:бу DET-DEM ; ! "this" ! Dir/LR
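To make the parts of this format explicit, here is a minimal Python sketch that splits one stem line into its fields. The helper parse_stem and its regular expression are hypothetical illustrations, not part of lexc or Apertium, and they assume a simple line with no %-escaped colons or spaces in the stem.

```python
import re

# Hypothetical helper (not part of lexc or Apertium): splits one stem line
# of the form  underlying:surface CLASS ; ! "gloss" ! Condition
# Simplification: %-escaped characters inside the stem are not handled.
STEM_LINE = re.compile(
    r'^(?P<underlying>[^:\s]+):(?P<surface>\S+)\s+'  # underlying:surface
    r'(?P<cls>\S+)\s*;'                              # continuation class, then ';'
    r'(?:\s*!\s*"(?P<gloss>[^"]*)")?'                # optional ! "gloss" comment
    r'(?P<rest>.*)$'                                 # any trailing condition comments
)

def parse_stem(line):
    """Return the fields of a stem definition, or None if the line does not match."""
    m = STEM_LINE.match(line.strip())
    if m is None:
        return None
    # Conditions are trailing comments such as "! Dir/LR" or "! Use/MT".
    conditions = re.findall(r'!\s*((?:Dir|Use)/\w+)', m.group("rest"))
    return {
        "underlying": m.group("underlying"),
        "surface": m.group("surface"),
        "class": m.group("cls"),
        "gloss": m.group("gloss"),
        "conditions": conditions,
    }

entry = parse_stem('бул:бу DET-DEM ; ! "this" ! Dir/LR')
```

For the example line above, entry["class"] is DET-DEM and entry["conditions"] contains the single condition Dir/LR.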
Conditions
There are a few special conditions we use: ! Dir/LR, ! Dir/RL, and ! Use/MT. These allow us to grep out lines in order to build different right-to-left and left-to-right transducers, and also to build separate MT-specific and vanilla transducers. Otherwise lexc simply interprets these as comments.
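The filtering step can be sketched in Python as an equivalent of grep -v. The function drop_conditions and the placeholder entries aaa, bbb, and ccc are hypothetical; which markers get dropped for which transducer is also an assumption here, since the text only says that the lines are grepped out.

```python
def drop_conditions(lines, markers):
    """Return the lines that mention none of the given condition markers.

    Hypothetical helper mirroring `grep -v`: a line is dropped as soon as
    any marker (e.g. "Dir/RL") appears anywhere in it.
    """
    return [line for line in lines if not any(m in line for m in markers)]

# Placeholder lexicon lines; only the first comes from the text above.
SAMPLE = [
    'бул:бу DET-DEM ; ! "this" ! Dir/LR',
    'aaa:aaa N1 ; ! "placeholder"',
    'bbb:bbb N1 ; ! "placeholder" ! Dir/RL',
    'ccc:ccc N1 ; ! "placeholder" ! Use/MT',
]

# Assumption: one direction's transducer is built without the Dir/RL lines,
# the other without the Dir/LR lines, and a vanilla build also omits Use/MT.
without_rl = drop_conditions(SAMPLE, ["Dir/RL"])
without_lr = drop_conditions(SAMPLE, ["Dir/LR"])
vanilla_without_rl = drop_conditions(SAMPLE, ["Dir/RL", "Use/MT"])
```

Because the conditions are ordinary lexc comments, the filtered output is still a valid lexc file and can be compiled unchanged.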
Bracketed multi-character symbols
We define certain types of multi-character symbols for various purposes.
Tags
Tags are defined with less-than and greater-than signs, e.g. %<pl%>.
Archiphonemes
Archiphonemes are defined with curly braces, e.g. %{A%}.
Features
Features are defined with square brackets, e.g. %[%-coop%].
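The three bracket conventions can be summarised with a small Python sketch that classifies a multi-character symbol by its delimiters. The function classify_symbol is a hypothetical illustration and only inspects the escaped opening and closing brackets.

```python
# Hypothetical classifier: maps escaped delimiter pairs to the symbol types
# described above. It is not part of lexc, HFST, or Apertium.
BRACKETS = {
    ("%<", "%>"): "tag",
    ("%{", "%}"): "archiphoneme",
    ("%[", "%]"): "feature",
}

def classify_symbol(symbol):
    """Return 'tag', 'archiphoneme', 'feature', or None for anything else."""
    for (opening, closing), kind in BRACKETS.items():
        # Require some content between the delimiters.
        if (symbol.startswith(opening) and symbol.endswith(closing)
                and len(symbol) > len(opening) + len(closing)):
            return kind
    return None

kinds = [classify_symbol(s) for s in ["%<pl%>", "%{A%}", "%[%-coop%]"]]
```

Applied to the three examples from the sections above, this yields tag, archiphoneme, and feature, in that order.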