Replacement for flag diacritics
Revision as of 15:53, 14 June 2014
Flag diacritics are often used to model morphotactic restrictions, but they are ugly and tend to get in the way.
Alternative: Use symbols and finite-state operations!
We have < and > for morphological tags, and { and } for archiphonemes and morphological features. We add a new type of symbol, written with [ and ], for modelling morphotactic restrictions.
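As a rough illustration of the three bracket conventions, the following Python sketch splits a string into symbols and labels each one by its bracket type. It is not part of HFST; the function name `classify`, the labels `tag`, `archi` and `restriction`, and the sample archiphoneme `{A}` are assumptions made up for this example.

```python
import re

# One alternative per symbol type; anything else counts as a plain character.
# The group names are hypothetical labels chosen for this sketch.
SYMBOL = re.compile(
    r"(?P<tag><[^<>]+>)"                # <...> morphological tags
    r"|(?P<archi>\{[^{}]+\})"           # {...} archiphonemes / features
    r"|(?P<restriction>\[[^\[\]]+\])"   # [...] morphotactic restrictions
    r"|(?P<char>.)"                     # ordinary single characters
)


def classify(text):
    """Return (label, symbol) pairs for every symbol found in text."""
    return [(m.lastgroup, m.group()) for m in SYMBOL.finditer(text)]


if __name__ == "__main__":
    for label, symbol in classify("bil<v><tv><aor>[+aor]"):
        print(label, symbol)
```

Keeping the restriction symbols visually distinct from tags makes it easy to remove all of them in one step once the restrictions have been applied.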
Example
Multichar_Symbols
%<v%> %<cop%> %<tv%> %<aor%> %<prog%> %<p1%> %<p3%> %<sg%> %[%-aor%] %[%+aor%] %+

LEXICON Root
Verbs ;

LEXICON PERS
%<p1%>%<sg%>:im # ;
%<p3%>%<sg%>: # ;

LEXICON COP
%+i%<cop%>%<aor%>%[%+aor%]: PERS ;

LEXICON V-TV
%<v%>%<tv%>%<aor%>%[%+aor%]:ir PERS ;
%<v%>%<tv%>%<aor%>%[%+aor%]:ir COP ;
%<v%>%<tv%>%<prog%>%[%-aor%]:iyor COP ;

LEXICON Verbs
bil:bil V-TV ; ! ""
Alphabet
b i l m i y o r u m
%<v%> %<tv%> %<prog%> %<aor%> %<p1%> %<p2%> %<p3%> %<sg%> %<cop%>
%[%+aor%]:0 %[%-aor%]:0 ;

Sets
Verb = %<v%> ;

Rules

"No consecutive [+aor] tags"
%[%+aor%]:0 /<= %[%+aor%]:0 :* _ ;
$ hfst-lexc test.lexc | hfst-invert -o test.hfst
$ hfst-twolc test-const.twol -o const.hfst
$ hfst-compose-intersect -1 test.hfst -2 const.hfst | hfst-fst2strings
biliyorim:bil<v><tv><prog>+i<cop><aor><p1><sg>
biliyor:bil<v><tv><prog>+i<cop><aor><p3><sg>
bilirim:bil<v><tv><aor><p1><sg>
bilir:bil<v><tv><aor><p3><sg>
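To make the effect of the rule easier to follow, here is a minimal Python sketch that emulates it on the six paths of the example lexicon, enumerated by hand. It assumes the rule is read as "a [+aor] symbol may not occur anywhere after an earlier [+aor]", and that every restriction symbol is deleted afterwards (the `:0` part). This is only an emulation with plain strings, not how HFST computes the result, and the names `PATHS`, `allowed`, `strip_restrictions` and `apply_constraint` are hypothetical.

```python
import re

# The six surface/analysis pairs the example lexicon generates before the
# rule is applied. The analyses still carry the [+aor]/[-aor] symbols.
PATHS = [
    ("bilirim", "bil<v><tv><aor>[+aor]<p1><sg>"),
    ("bilir", "bil<v><tv><aor>[+aor]<p3><sg>"),
    ("bilirim", "bil<v><tv><aor>[+aor]+i<cop><aor>[+aor]<p1><sg>"),
    ("bilir", "bil<v><tv><aor>[+aor]+i<cop><aor>[+aor]<p3><sg>"),
    ("biliyorim", "bil<v><tv><prog>[-aor]+i<cop><aor>[+aor]<p1><sg>"),
    ("biliyor", "bil<v><tv><prog>[-aor]+i<cop><aor>[+aor]<p3><sg>"),
]

# Matches any restriction symbol such as [+aor] or [-aor].
RESTRICTION = re.compile(r"\[[+-][a-z0-9]+\]")


def allowed(analysis):
    """Reject analyses in which [+aor] occurs after an earlier [+aor]."""
    return analysis.count("[+aor]") < 2


def strip_restrictions(analysis):
    """Delete all restriction symbols, mirroring the :0 side of the rule."""
    return RESTRICTION.sub("", analysis)


def apply_constraint(paths):
    """Keep only the allowed paths and clean up their analyses."""
    return [(s, strip_restrictions(a)) for s, a in paths if allowed(a)]


if __name__ == "__main__":
    for surface, analysis in apply_constraint(PATHS):
        print(f"{surface}:{analysis}")
```

The two paths that go through both the aorist entry of V-TV and COP carry two [+aor] symbols and are dropped, which leaves the same four pairs as the hfst-fst2strings output above, possibly in a different order.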