Replacement for flag diacritics
		
		
		
		
		
		
People like to use flag diacritics to model morphotactic restrictions, but they are ugly and they get in the way of ordinary finite-state operations.
Alternative: use distinct symbols with well-defined behaviours, and handle them with ordinary finite-state operations!
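We already have < and > for morphological tags, and { and } for archiphonemes and morphological features. We add a new type of symbol, written with [ and ], for modelling morphotactic restrictions. The lexc file inserts these symbols on the analysis side, a twolc rule forbids the combinations that should not occur, and the symbols are realised as 0 so that they never show up in the output.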
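As a rough illustration of the three symbol classes, the Python sketch below splits an analysis string into symbols, classifies each one by its delimiters, and drops the [ ] symbols. It is not part of the HFST toolchain; the names `tokenize`, `kind` and `strip_constraints` are made up for this example, and the stripping step only imitates what the `:0` pairs in the twolc alphabets below do.

```python
import re

# One token is a <tag>, a {feature}, a [constraint], or a single character.
TOKEN = re.compile(r"<[^<>]+>|\{[^{}]+\}|\[[^\[\]]+\]|.")


def tokenize(analysis):
    """Split an analysis string into multicharacter symbols and letters."""
    return TOKEN.findall(analysis)


def kind(symbol):
    """Classify a symbol by its opening and closing delimiters."""
    if len(symbol) > 2:
        pair = symbol[0] + symbol[-1]
        if pair == "<>":
            return "tag"
        if pair == "{}":
            return "archiphoneme/feature"
        if pair == "[]":
            return "constraint"
    return "character"


def strip_constraints(analysis):
    """Remove every [ ] symbol, imitating the [x]:0 pairs in twolc."""
    return "".join(s for s in tokenize(analysis) if kind(s) != "constraint")


if __name__ == "__main__":
    for symbol in tokenize("bil<v><tv><aor>[+aor]"):
        print(symbol, kind(symbol))
    print(strip_constraints("bil<v><tv><aor>[+aor]"))
```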
Examples
Turkish
$ cat test.lexc
Multichar_Symbols
 %<v%> %<cop%> %<tv%> %<aor%> %<prog%> %<p1%> %<p3%> %<sg%>
 %[%-aor%] %[%+aor%] %+ ;

LEXICON Root
Verbs ;

LEXICON PERS
%<p1%>%<sg%>:im # ;
%<p3%>%<sg%>: # ;

LEXICON COP
%+i%<cop%>%<aor%>%[%+aor%]: PERS ;

LEXICON V-TV
%<v%>%<tv%>%<aor%>%[%+aor%]:ir PERS ;
%<v%>%<tv%>%<aor%>%[%+aor%]:ir COP ;
%<v%>%<tv%>%<prog%>%[%-aor%]:iyor COP ;

LEXICON Verbs
bil:bil V-TV ; ! ""
$ cat test-const.twol
Alphabet
 a b c d e f g h i j k l m n o p q r s t u v w x y z
 %<v%> %<tv%> %<prog%> %<aor%> %<p1%> %<p2%> %<p3%> %<sg%> %<cop%>
 %[%+aor%]:0 %[%-aor%]:0
;
Sets
Verb = %<v%> ;
Rules
"No consecutive [+aor] tags"
%[%+aor%]:0 /<= %[%+aor%]:0 :* _ ;
$ hfst-lexc test.lexc | hfst-invert -o test.hfst
$ hfst-twolc test-const.twol -o const.hfst
$ hfst-compose-intersect -1 test.hfst -2 const.hfst | hfst-fst2strings
biliyorim:bil<v><tv><prog>+i<cop><aor><p1><sg>
biliyor:bil<v><tv><prog>+i<cop><aor><p3><sg>
bilirim:bil<v><tv><aor><p1><sg>
bilir:bil<v><tv><aor><p3><sg>
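To see why only four of the six lexc paths survive, here is a small Python sketch that mimics the rule "No consecutive [+aor] tags" on plain strings. It does not call HFST. The six (analysis, surface) pairs in `PATHS` were written out by hand from the lexicon above, and the names `PATHS`, `allowed` and `surviving` are invented for the illustration.

```python
import re

# Matches any [ ] constraint symbol such as [+aor] or [-aor].
CONSTRAINT = re.compile(r"\[[^\[\]]+\]")

# (analysis, surface) pairs for every path through the Turkish lexicon,
# enumerated by hand (an assumption of this sketch, not HFST output).
PATHS = [
    ("bil<v><tv><aor>[+aor]<p1><sg>", "bilirim"),
    ("bil<v><tv><aor>[+aor]<p3><sg>", "bilir"),
    ("bil<v><tv><aor>[+aor]+i<cop><aor>[+aor]<p1><sg>", "bilirim"),
    ("bil<v><tv><aor>[+aor]+i<cop><aor>[+aor]<p3><sg>", "bilir"),
    ("bil<v><tv><prog>[-aor]+i<cop><aor>[+aor]<p1><sg>", "biliyorim"),
    ("bil<v><tv><prog>[-aor]+i<cop><aor>[+aor]<p3><sg>", "biliyor"),
]


def allowed(analysis):
    """Reject any analysis in which a [+aor] follows an earlier [+aor]."""
    return CONSTRAINT.findall(analysis).count("[+aor]") <= 1


def surviving(paths):
    """Filter the paths, then delete the [ ] symbols (the [x]:0 step)."""
    return [
        f"{surface}:{CONSTRAINT.sub('', analysis)}"
        for analysis, surface in paths
        if allowed(analysis)
    ]


if __name__ == "__main__":
    for line in surviving(PATHS):
        print(line)
```

The two rejected paths are the ones where the aorist stem is followed by the copula, since both contribute a [+aor].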
Persian
$ cat prefix.lexc
Multichar_Symbols
 %<v%> %<tv%> %<pri%> %<cni%> %<prs%> %<p1%> %<p3%> %<sg%>
 %[%-prs%] %[%+prs%] %[%-cni%] %[%+cni%] %+

LEXICON Root
Prefix ;

LEXICON Prefix
%[%+prs%]%[%-cni%]:be Verbs ;
%[%-prs%]%[%+cni%]:mi Verbs ;
%[%-prs%]%[%-cni%]: Verbs ;

LEXICON PERS
%<p1%>%<sg%>:im # ;
%<p3%>%<sg%>: # ;

LEXICON V-TV
%<v%>%<tv%>%<cni%>%[%+cni%]%[%-prs%]: PERS ;
%<v%>%<tv%>%<prs%>%[%-cni%]%[%+prs%]: PERS ;
%<v%>%<tv%>%<pri%>%[%-cni%]%[%-prs%]: PERS ;

LEXICON Verbs
kardan:kard V-TV ; ! ""
$ cat prefix-const.twol 
Alphabet
 a b c d e f g h i j k l m n o p q r s t u v w x y z 
 %<v%> %<tv%> %<p1%> %<p3%> %<sg%> %<pri%> %<prs%> %<cni%>
 %[%+prs%]:0  %[%-prs%]:0  %[%+cni%]:0  %[%-cni%]:0 
;
Sets 
Verb = %<v%> ;
Rules 
"Match prefixes"
Tx:0 /<= Ty:0 :* _ ; 
   where 
         Tx in ( %[%+cni%] %[%+prs%] %[%-cni%] %[%-prs%] )   
         Ty in ( %[%-cni%] %[%-prs%] %[%+cni%] %[%+prs%] )  matched ; 
$ hfst-lexc prefix.lexc | hfst-invert -o prefix.hfst
$ hfst-twolc prefix-const.twol -o prefix-const.hfst
$ hfst-compose-intersect -1 prefix.hfst -2 prefix-const.hfst | hfst-fst2strings
bekardim:kardan<v><tv><prs><p1><sg>
bekard:kardan<v><tv><prs><p3><sg>
kardim:kardan<v><tv><pri><p1><sg>
kard:kardan<v><tv><pri><p3><sg>
mikardim:kardan<v><tv><cni><p1><sg>
mikard:kardan<v><tv><cni><p3><sg>
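The "Match prefixes" rule can be read as: a [ ] symbol may not follow the same feature with the opposite sign. The Python sketch below imitates that reading on plain strings, without calling HFST. The three tables are copied by hand from the lexicon above, and the names `PREFIXES`, `TENSES`, `PERSONS`, `allowed` and `generate` are made up for this illustration.

```python
import re
from itertools import product

# Captures the sign and the feature name of a constraint such as [+prs].
CONSTRAINT = re.compile(r"\[([+-])(\w+)\]")

# (analysis side, surface side) for each continuation class in the lexicon.
PREFIXES = [("[+prs][-cni]", "be"), ("[-prs][+cni]", "mi"), ("[-prs][-cni]", "")]
TENSES = [
    ("<v><tv><cni>[+cni][-prs]", ""),
    ("<v><tv><prs>[-cni][+prs]", ""),
    ("<v><tv><pri>[-cni][-prs]", ""),
]
PERSONS = [("<p1><sg>", "im"), ("<p3><sg>", "")]


def allowed(analysis):
    """Accept only if every feature keeps the sign it was first seen with."""
    first_sign = {}
    for sign, name in CONSTRAINT.findall(analysis):
        if first_sign.setdefault(name, sign) != sign:
            return False
    return True


def generate():
    """Combine prefix, stem, tense and person; keep the allowed paths."""
    forms = []
    for (p_an, p_sf), (t_an, t_sf), (n_an, n_sf) in product(PREFIXES, TENSES, PERSONS):
        analysis = p_an + "kardan" + t_an + n_an
        if allowed(analysis):
            surface = p_sf + "kard" + t_sf + n_sf
            forms.append(f"{surface}:{CONSTRAINT.sub('', analysis)}")
    return forms


if __name__ == "__main__":
    for line in generate():
        print(line)
```

Of the 18 prefix, tense and person combinations, only the 6 where the prefix and the tense agree on both features pass the check.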

