Lttoolbox and lexc
Latest revision as of 08:10, 30 December 2014
A `.lexc` file defines how the morphemes of a language are joined together, that is, its morphotactics.
This page describes how lttoolbox and HFST's `lexc` are similar, so that people more familiar with one can get to grips more easily with the other.
Terminology
lttoolbox | lexc | Notes |
---|---|---|
Paradigm | Continuation lexicon | Each time you see `LEXICON foo`, think `<pardef n="foo"/>` |
Section | Root lexicon | |
Left | Down (lower) | Both the left side and the lower side correspond to the surface form |
Right | Up (upper) | Both the right side and the upper side correspond to the lexical form |
Symbol | Multichar symbol | A sequence of one or more characters that is treated as a single symbol |
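The correspondence in the table can be sketched as a small conversion from a paradigm to a continuation lexicon. The snippet below is only an illustration, not a tool shipped with lttoolbox or HFST: the function names `side_to_lexc` and `pardef_to_lexc` are hypothetical, and it assumes a paradigm whose entries are all plain `<p>` pairs that end the word (so every line continues to `#`).

```python
import xml.etree.ElementTree as ET

# A paradigm in lttoolbox notation (same shape as the example on this page).
PARDEF = """\
<pardef n="RegNounInfl">
  <e><p><l/><r><s n="n"/><s n="sg"/></r></p></e>
  <e><p><l>s</l><r><s n="n"/><s n="pl"/></r></p></e>
</pardef>
"""

def side_to_lexc(elem):
    """Render an <l> or <r> element as lexc text, escaping tag brackets."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "s":
            # <s n="pl"/> becomes the multichar symbol %<pl%>
            parts.append("%<" + child.get("n") + "%>")
        parts.append(child.tail or "")
    return "".join(parts).strip()

def pardef_to_lexc(xml_text):
    """Turn one <pardef> into a LEXICON (assumption: entries are <p> pairs only)."""
    pardef = ET.fromstring(xml_text)
    lines = ["LEXICON " + pardef.get("n"), ""]
    for pair in pardef.iter("p"):
        surface = side_to_lexc(pair.find("l"))  # left  -> lower side
        lexical = side_to_lexc(pair.find("r"))  # right -> upper side
        lines.append(f"{lexical}:{surface} # ;")
    return "\n".join(lines)

print(pardef_to_lexc(PARDEF))
```

Note that the sides swap position: lexc writes the lexical (upper) side before the colon, while lttoolbox puts the lexical side in `<r>`.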
Example
lttoolbox
```xml
<dictionary>
  <sdefs>
    <sdef n="n"/>
    <sdef n="pl"/>
    <sdef n="sg"/>
  </sdefs>
  <pardefs>
    <pardef n="RegNounInfl">
      <e><p><l/><r><s n="n"/><s n="sg"/></r></p></e>
      <e><p><l>s</l><r><s n="n"/><s n="pl"/></r></p></e>
    </pardef>
  </pardefs>
  <section id="Root" type="standard">
    <e lm="cat"><i>cat</i><par n="RegNounInfl"/></e> <!-- A noun -->
  </section>
</dictionary>
```
And to compile and use this dictionary:
```
$ lt-comp lr test.dix test.bin
Root@standard 7 7

$ echo "cat" | lt-proc test.bin
^cat/cat<n><sg>$

$ echo "cats" | lt-proc test.bin
^cats/cat<n><pl>$
```
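To make the role of the paradigm concrete, here is a rough Python sketch that expands the dictionary above by hand and prints analyses in the same `^surface/lexical$` shape. It is not how `lt-comp` or `lt-proc` work internally (they build and run a finite-state transducer); the helper names `render` and `expand_dix` are hypothetical, and the code assumes every entry has exactly one `<i>` stem followed by one `<par>` reference.

```python
import xml.etree.ElementTree as ET

DIX = """\
<dictionary>
  <sdefs><sdef n="n"/><sdef n="pl"/><sdef n="sg"/></sdefs>
  <pardefs>
    <pardef n="RegNounInfl">
      <e><p><l/><r><s n="n"/><s n="sg"/></r></p></e>
      <e><p><l>s</l><r><s n="n"/><s n="pl"/></r></p></e>
    </pardef>
  </pardefs>
  <section id="Root" type="standard">
    <e lm="cat"><i>cat</i><par n="RegNounInfl"/></e>
  </section>
</dictionary>
"""

def render(elem):
    """Flatten an <i>, <l> or <r> element; <s n="x"/> is shown as <x>."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "s":
            parts.append("<" + child.get("n") + ">")
        parts.append(child.tail or "")
    return "".join(parts)

def expand_dix(dix_text):
    """Map each surface form to its lexical forms (assumption: <i> + one <par>)."""
    root = ET.fromstring(dix_text)
    pardefs = {p.get("n"): p for p in root.iter("pardef")}
    analyses = {}
    for section in root.iter("section"):
        for entry in section.findall("e"):
            stem = render(entry.find("i"))
            paradigm = pardefs[entry.find("par").get("n")]
            for pair in paradigm.iter("p"):
                surface = stem + render(pair.find("l"))
                lexical = stem + render(pair.find("r"))
                analyses.setdefault(surface, []).append(lexical)
    return analyses

for surface, lexicals in expand_dix(DIX).items():
    print("^" + surface + "/" + "/".join(lexicals) + "$")
```

Each entry in the section contributes its stem, and each line of the referenced paradigm contributes one ending, which is why the single `cat` entry yields both `cat` and `cats`.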
lexc
See also: Apertium-specific conventions for lexc
```
Multichar_Symbols

%<n%>
%<pl%>
%<sg%>

LEXICON Root

NounRoot ;

LEXICON NounRoot

cat RegNounInfl ; ! A noun

LEXICON RegNounInfl

%<n%>%<sg%>: # ;
%<n%>%<pl%>:s # ;
```
And to compile and use this dictionary:
```
$ hfst-lexc test.lexc -o test.gen.hfst
$ hfst-invert -i test.gen.hfst -o test.mor.hfst

$ echo "cat" | hfst-lookup test.mor.hfst
cat cat<n><sg>

$ echo "cats" | hfst-lookup test.mor.hfst
cats cat<n><pl>
```
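The way continuation lexicons chain together can be illustrated with a toy walker in Python. This is a sketch under strong assumptions, not a replacement for `hfst-lexc`: the functions `parse_lexc` and `expand_lexc` are hypothetical, and the parser only understands the small subset used on this page (one entry per line, `!` comments, `%` escapes, no regular expressions, flags or weights).

```python
import re

LEXC = """\
Multichar_Symbols

%<n%>
%<pl%>
%<sg%>

LEXICON Root

NounRoot ;

LEXICON NounRoot

cat RegNounInfl ; ! A noun

LEXICON RegNounInfl

%<n%>%<sg%>: # ;
%<n%>%<pl%>:s # ;
"""

def unescape(text):
    """Drop lexc % escapes, so %<n%> becomes <n>."""
    return re.sub(r"%(.)", r"\1", text)

def parse_lexc(text):
    """Read LEXICON blocks into {name: [(upper, lower, continuation), ...]}."""
    lexicons, current = {}, None
    for raw in text.splitlines():
        line = raw.split("!", 1)[0].strip()  # strip comments (toy: ignores %!)
        if line.startswith("LEXICON "):
            current = line.split()[1]
            lexicons[current] = []
        elif current is not None and line.endswith(";"):
            fields = line[:-1].split()
            form = fields[0] if len(fields) == 2 else ""
            upper, _, lower = form.partition(":")
            if ":" not in form:
                lower = upper  # no colon: same string on both sides
            lexicons[current].append((unescape(upper), unescape(lower), fields[-1]))
    return lexicons

def expand_lexc(lexicons, name="Root", upper="", lower=""):
    """Follow continuations from `name` until '#', yielding (lexical, surface)."""
    if name == "#":
        yield upper, lower
        return
    for up, low, cont in lexicons[name]:
        yield from expand_lexc(lexicons, cont, upper + up, lower + low)

for lexical, surface in expand_lexc(parse_lexc(LEXC)):
    print(surface + "\t" + lexical)
```

Starting at `Root`, the walker passes through `NounRoot` (which adds `cat` to both sides) and `RegNounInfl` (which adds tags on the upper side and an optional `s` on the lower side), mirroring how a section entry plus a paradigm builds the same pairs in lttoolbox.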