Replacement for flag diacritics
Revision as of 15:48, 14 June 2014 by Francis Tyers (talk | contribs)
People like to use flag diacritics for stuff. But they are bad because they are ugly and get in the way of stuff.
Alternative: Use symbols and finite-state operations!
We have <
and >
for morphological tags, and {
and }
for archiphonemes and morphological features. We add a new type of symbol with [
and ]
for modelling morphotactic restrictions.
Example
Multichar_Symbols %<v%> %<cop%> %<tv%> %<aor%> %<prog%> %<p1% %<p3%> %<sg%> %[%-aor%] %[%+aor%] %+ LEXICON Root Verbs ; LEXICON PERS %<p1%>%<sg%>:im # ; %<p3%>%<sg%>: # ; LEXICON COP %+i%<cop%>%<aor%>%[%+aor%]: PERS ; LEXICON V-TV %<v%>%<tv%>%<aor%>%[%+aor%]:ar PERS ; %<v%>%<tv%>%<aor%>%[%+aor%]:ar COP ; %<v%>%<tv%>%<prog%>%[%-aor%]:iyor COP ; LEXICON Verbs bil:bil V-TV ; ! ""
Alphabet b i l m i y o r u m %<v%> %<tv%> %<prog%> %<aor%> %<p1%> %<p2%> %<p3%> %<sg%> %<cop%> %[%+aor%]:0 %[%-aor%]:0 ; Sets Verb = %<v%> ; Rules "No consecutive +aor tags" %[%+aor%]:0 /<= %[%+aor%]:0 :* _ ;
$ hfst-lexc test.lexc | hfst-invert -o test.hfst $ hfst-twolc test-const.twol -o const.hfst $ hfst-compose-intersect -1 test.hfst -2 const.hfst | hfst-fst2strings biliyorim:bil<v><tv><prog>+i<cop><aor><p1><sg> biliyor:bil<v><tv><prog>+i<cop><aor><p3><sg> bilarim:bil<v><tv><aor><p1><sg> bilar:bil<v><tv><aor><p3><sg>