Talk:Infrastructure discussion
Latest revision as of 11:42, 23 April 2008
This is how things happen at Tromsø
<pre>
-bash-3.00$ echo "Dan mun lean dahkan." | preprocess | lo
0%>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>100%
Dan Dan+N+Prop+Mal+Sg+Attr
Dan Dan+N+Prop+Mal+Sg+Acc
Dan Dan+N+Prop+Mal+Sg+Gen
Dan Dan+N+Prop+Mal+Sg+Nom
Dan dat+Pron+Dem+Sg+Acc
Dan dat+Pron+Dem+Sg+Gen
Dan dat+Pron+Pers+Sg3+Acc
Dan dat+Pron+Pers+Sg3+Gen
Dan D+N+ACR+Ess
mun mun+Pron+Pers+Sg1+Nom
lean leat+V+IV+Ind+Prs+Sg1
lean leat+V+IV+PrfPrc
dahkan dahkat+V+TV+Actio+Acc
dahkan dahkat+V+TV+Actio+Gen
dahkan dahkat+V+TV+Actio+Nom
dahkan dahkat+V+TV+PrfPrc
dahkan dahkat+V+TV+Der3+Der/n+N+Sg+Gen
dahkan dahkat+V+TV+Der3+Der/n+N+Sg+Nom
. .+CLB

(18:27:25) ttrosterud: -bash-3.00$ echo "Dan mun lean dahkan." | preprocess | lo | lookup2cg
"<Dan>"
    "dat" Pron Pers Sg3 Acc
    "dat" Pron Dem Sg Acc
    "Dan" N Prop Mal Sg Gen
    "Dan" N Prop Mal Sg Attr
    "dat" Pron Pers Sg3 Gen
    "D" N ACR Ess
    "Dan" N Prop Mal Sg Acc
    "dat" Pron Dem Sg Gen
    "Dan" N Prop Mal Sg Nom
"<mun>"
    "mun" Pron Pers Sg1 Nom
"<lean>"
    "leat" V IV Ind Prs Sg1
    "leat" V IV PrfPrc
"<dahkan>"
    "dahkat" V TV Actio Gen
    "dahkat" V TV PrfPrc
    "dahkat" V* TV Der3 Der/n N Sg Nom
    "dahkat" V TV Actio Acc
    "dahkat" V* TV Der3 Der/n N Sg Gen
    "dahkat" V TV Actio Nom
"<.>"
    "." CLB

-bash-3.00$ echo "Dan mun lean dahkan." | preprocess | lo | lookup2cg | vislcg3 -g gt/sme/src/sme-dis.rle
VISL CG-3 Disambiguator version 0.9.3.3362
Codepage: default UTF-8, input UTF-8, output UTF-8, grammar UTF-8
Parsing grammar took 0.657588 seconds.
Grammar has 27 sections, 3284 rules, 3658 sets, 8514 tags.
25 rules cannot be skipped by index.
"<Dan>"
    "dat" Pron Pers Sg3 Acc @OBJ
"<mun>"
    "mun" Pron Pers Sg1 Nom @SUBJ
"<lean>"
    "leat" V IV Ind Prs Sg1 @+FAUXV
"<dahkan>"
    "dahkat" V TV PrfPrc @-FMAINV
"<.>"
    "." CLB

-bash-3.00$ echo "Dan mun lean dahkan." | preprocess | lo | lookup2cg | vislcg3 -g gt/sme/src/sme-dis.rle | vislcg3 -g gt/sme/src/sme-dep.rle
VISL CG-3 Disambiguator version 0.9.3.3362
Codepage: default UTF-8, input UTF-8, output UTF-8, grammar UTF-8
VISL CG-3 Disambiguator version 0.9.3.3362
Codepage: default UTF-8, input UTF-8, output UTF-8, grammar UTF-8
Parsing grammar took 0.088736 seconds.
Grammar has 2 sections, 57 rules, 835 sets, 7955 tags.
Grammar has dependency rules.
Parsing grammar took 0.65936 seconds.
Grammar has 27 sections, 3284 rules, 3658 sets, 8514 tags.
25 rules cannot be skipped by index.
"<Dan>"
    "dat" Pron Pers Sg3 Acc @OBJ #1->4
"<mun>"
    "mun" Pron Pers Sg1 Nom @SUBJ #2->3
"<lean>"
    "leat" <aux> V IV Ind Prs Sg1 @FS-STA #3->0
"<dahkan>"
    "dahkat" <mv> V TV PrfPrc @ICL-AUX< #4->3
"<.>"
    "." CLB #5->0
</pre>
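In the final output each reading ends in `#n->m`: token number `n` has token number `m` as its head, and `0` marks a root. A minimal Python sketch of reading that format back into records might look like the following. It is not part of the Tromsø toolchain; the function name `parse_dependencies`, the `SAMPLE` constant and the record layout are assumptions made for illustration, and it expects exactly one reading per word form, as in the disambiguated output above.

```python
import re

# A word-form line such as:  "<Dan>"
WORDFORM = re.compile(r'^"<(?P<form>.*)>"$')
# A reading line such as:    "dat" Pron Pers Sg3 Acc @OBJ #1->4
READING = re.compile(
    r'^\s*"(?P<lemma>[^"]*)"\s*(?P<tags>.*?)\s*#(?P<id>\d+)->(?P<head>\d+)\s*$'
)

# The dependency output from the session above (hypothetical constant name).
SAMPLE = '''
"<Dan>"
    "dat" Pron Pers Sg3 Acc @OBJ #1->4
"<mun>"
    "mun" Pron Pers Sg1 Nom @SUBJ #2->3
"<lean>"
    "leat" <aux> V IV Ind Prs Sg1 @FS-STA #3->0
"<dahkan>"
    "dahkat" <mv> V TV PrfPrc @ICL-AUX< #4->3
"<.>"
    "." CLB #5->0
'''


def parse_dependencies(text):
    """Return one dict per token: form, lemma, tags, id and head (0 = root)."""
    tokens = []
    form = None
    for line in text.splitlines():
        wordform = WORDFORM.match(line.strip())
        if wordform:
            form = wordform.group("form")
            continue
        reading = READING.match(line)
        if reading and form is not None:
            tokens.append({
                "form": form,
                "lemma": reading.group("lemma"),
                "tags": reading.group("tags").split(),
                "id": int(reading.group("id")),
                "head": int(reading.group("head")),
            })
    return tokens


if __name__ == "__main__":
    tokens = parse_dependencies(SAMPLE)
    by_id = {t["id"]: t for t in tokens}
    for t in tokens:
        head = by_id[t["head"]]["form"] if t["head"] else "ROOT"
        print(f'{t["form"]} -> {head}')
```

Lines that are neither word forms nor readings with a dependency suffix (for example the `VISL CG-3` banner lines) are skipped, so the whole terminal output can be passed in unchanged.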
==Multiwords in xfst==
<pre>
(09:56:30) spectre: quick question... how do you deal with multiword units in xfst? (e.g. lemmas where there is a space in the middle "United Kingdom<PN>")
(09:58:04) ttrosterud: two ways
(09:58:15) ttrosterud: the preprocessor must know them
(09:58:17) ttrosterud: so:
(09:58:22) ttrosterud: New% York
(09:58:29) ttrosterud: where % literalizes the space
(09:58:39) ttrosterud: sorry that was in xfst
(09:58:47) spectre: ok
(09:58:49) ttrosterud: in the preprocessor it must be given as
(09:59:00) ttrosterud: I
live
in
New York
(09:59:09) ttrosterud: then in xfst (or rather in lexc)
(09:59:11) ttrosterud: I write
(09:59:30) ttrosterud: New% York namelex ;
London namelex ;
(09:59:31) ttrosterud: etc
</pre>
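The two halves of that answer can be sketched in Python: a tokenizer that is told which multiword units exist and emits each as a single token, and a helper that writes a token as a lexc entry with the space escaped by `%`. This is not the actual `preprocess` script. The names `MULTIWORDS`, `tokenize` and `lexc_entry` are hypothetical, punctuation is not handled, and only spaces are escaped (other characters that are special in lexc are left alone).

```python
# Hypothetical list of multiword units the tokenizer should keep together.
MULTIWORDS = [("New", "York"), ("United", "Kingdom")]


def tokenize(sentence, multiwords=MULTIWORDS):
    """Split on whitespace, but keep known multiword units as one token."""
    words = sentence.split()
    # Try longer units first so the longest match wins.
    units = sorted((tuple(u) for u in multiwords), key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(words):
        for unit in units:
            if tuple(words[i:i + len(unit)]) == unit:
                tokens.append(" ".join(unit))
                i += len(unit)
                break
        else:
            # No multiword unit starts here: emit the single word.
            tokens.append(words[i])
            i += 1
    return tokens


def lexc_entry(token, lexicon="namelex"):
    """Format a token as a lexc entry, escaping spaces with '%'."""
    return f'{token.replace(" ", "% ")} {lexicon} ;'


if __name__ == "__main__":
    # One token per line, as in the preprocessor output quoted above.
    for token in tokenize("I live in New York"):
        print(token)
    print(lexc_entry("New York"))
    print(lexc_entry("London"))
```

The continuation class `namelex` is taken from the chat; everything else about the surrounding lexicon is left out.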