Talk:Recursive transfer
testing transfer4
In https://svn.code.sf.net/p/apertium/svn/branches/transfer4 you can do:
 cd eng-kaz
 make   # might give some warnings
 # before and after transfer:
 cat input/input.02.txt
 cat input/input.02.txt | ./parser
 # /tmp/input.02.parse contains the parse, using parentheses
 cat input/input.02.txt | ./parser -p 2>/tmp/input.02.parse
 # visualise it with tex:
 cat /tmp/input.02.parse | python3 ../brackets-parse.py -x >/tmp/input.02.tex
 xelatex /tmp/input.02.tex
 # or with graphviz:
 cat /tmp/input.02.parse | python3 ../brackets-parse.py >/tmp/input.02.dot
 dot -Tpdf /tmp/input.02.dot -o/tmp/input.02.pdf
 <spectie> you can also generate a parser from an existing .t1x file
 <spectie> ../AST/create-parser.y
 <spectie> ../AST/create-parser.py
 <spectie> but it only does 1 level, and also, t1x rule patterns aren't necessarily ideal [18:59]
 <spectie> $ python3 ../AST/create-parser.py < ~/source/apertium/trunk/apertium-sme-nob/apertium-sme-nob.sme-nob.t1x
 <spectie> because they overlap in ambiguous ways [19:00]
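To make the "only does 1 level" remark concrete, here is a rough Python sketch of what reading rule patterns out of a .t1x file could look like. It is not the create-parser.py script, whose behaviour is not described here. It only relies on the standard t1x elements def-cat, cat-item, rule, pattern and pattern-item. The sample file is a trimmed, hypothetical fragment (no actions, so not a complete t1x file), and the names t1x_to_flat_grammar and RULE1, RULE2 are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down .t1x fragment: just categories and rule patterns.
SAMPLE_T1X = """<transfer>
  <section-def-cats>
    <def-cat n="det"><cat-item tags="det.*"/></def-cat>
    <def-cat n="nom"><cat-item tags="n.*"/></def-cat>
  </section-def-cats>
  <section-rules>
    <rule><pattern><pattern-item n="det"/><pattern-item n="nom"/></pattern></rule>
    <rule><pattern><pattern-item n="nom"/></pattern></rule>
  </section-rules>
</transfer>"""


def t1x_to_flat_grammar(xml_text):
    """Return CFG-like lines: one terminal rule per def-cat, one flat rule per pattern."""
    root = ET.fromstring(xml_text)
    lines = []
    for cat in root.iter("def-cat"):
        # cat-items without a tags attribute (lemma-only entries) are skipped in this sketch
        alts = ['"%s"' % item.get("tags") for item in cat.iter("cat-item") if item.get("tags")]
        if alts:
            lines.append("%s -> %s" % (cat.get("n"), " | ".join(alts)))
    for number, rule in enumerate(root.iter("rule"), start=1):
        names = [item.get("n") for item in rule.iter("pattern-item")]
        # Each pattern becomes a single flat production: there is no nesting to recover.
        lines.append("RULE%d -> %s" % (number, " ".join(names)))
    return lines


for line in t1x_to_flat_grammar(SAMPLE_T1X):
    print(line)
```

In this toy fragment both rules can start matching at a nom, which is a small instance of patterns overlapping; a grammar built this way has no way of saying that one pattern is a constituent of another.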
Old stuff
Deliverable 1
- A program which reads a grammar using bison, parses a sentence and outputs the syntax tree as text, or Graphviz or something.
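A minimal sketch of the "syntax tree as text or Graphviz" part, in Python. It assumes the tree arrives as a parenthesised string such as (S (SN prn) (SV v)); that exact bracket format is an assumption, not necessarily what the parser here prints, and parse_brackets and to_dot are hypothetical names. Labels containing quotes or parentheses are not handled.

```python
def parse_brackets(text):
    """Parse '(S (SN prn) (SV v))' into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1                      # skip "("
            label = tokens[pos]
            pos += 1
            children = []
            while tokens[pos] != ")":
                children.append(node())
            pos += 1                      # skip ")"
            return (label, children)
        leaf = tokens[pos]                # a bare word is a leaf
        pos += 1
        return (leaf, [])

    return node()


def to_dot(tree):
    """Emit Graphviz DOT text with one numbered node per tree node."""
    lines = ["digraph parse {"]
    counter = 0

    def walk(subtree, parent):
        nonlocal counter
        label, children = subtree
        me = counter
        counter += 1
        lines.append('  n%d [label="%s"];' % (me, label))
        if parent is not None:
            lines.append("  n%d -> n%d;" % (parent, me))
        for child in children:
            walk(child, me)

    walk(tree, None)
    lines.append("}")
    return "\n".join(lines)


print(to_dot(parse_brackets("(S (SN prn) (SV v))")))
```

The printed text can be saved to a .dot file and rendered with dot -Tpdf.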
Deliverable 2
- Program which takes output of lt-proc -b (biltrans) and applies a grammar, doing only reordering (and "insertion"/"deletion"), no tag changes
- The input would be ^sl/tl$ and the output would be ^tl$
- The grammar can be specified using a simple text-based CFG grammar formalism, converted into bison and compiled.
- Input
^Hau<prn><dem><sg>/This<prn><dem><sg>$ ^irabazle<n>/winner<n><ND>$ ^bat<num><sg>/a<det><ind><sg>$ ^en<post>/of<pr>$ ^historia<n>/story<n><ND>$ ^a<det><art><sg>/the<det><def><sg>$ ^izan<vbsint><pri><NR_HU>/be<vbser><pri><NR_HU>$ ^.<sent>/.<sent>$
- Output
^This<prn><dem><sg>$ ^be<vbser><pri><NR_HU>$ ^the<det><def><sg>$ ^story<n><ND>$ ^of<pr>$ ^a<det><ind><sg>$ ^winner<n><ND>$ ^.<sent>$
- Grammar
 S -> SN SV sent { $1 $2 $3 }
 SV -> SN v { $2 $1 }
 SN -> N3 art { $2 $1 } | N3 { $1 }
 N3 -> SNGen N2 { $2 $1 } | N2 { $1 }
 N2 -> nom { $1 } | prn { $1 }
 SNGen -> SN genpost { $2 $1 }
 sent -> "sent" { $1 }
 v -> "vbser.*" { $1 } | "vblex.*" { $1 }
 art -> "det.art.*" { $1 } | "num.sg" { $1 }
 nom -> "n" { $1 }
 prn -> "prn.*" { $1 }
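The reordering in this example can be checked by hand with a small Python sketch. This is not the bison-based program the deliverable asks for, just a memoised span parser over the same rules, and it rests on several assumptions: the quoted terminal patterns are treated as regular expressions over the dotted tag string; a token matches a terminal if either its source or its target tags match (in the example the verb only matches on the target side and the articles only on the source side); and genpost, which the grammar does not define, is assumed to be "post". All names (GRAMMAR, TERMINALS, reorder) are hypothetical.

```python
import re
from functools import lru_cache

# nonterminal -> list of (right-hand side, output order as 1-based positions)
GRAMMAR = {
    "S":     [(("SN", "SV", "sent"), (1, 2, 3))],
    "SV":    [(("SN", "v"), (2, 1))],
    "SN":    [(("N3", "art"), (2, 1)), (("N3",), (1,))],
    "N3":    [(("SNGen", "N2"), (2, 1)), (("N2",), (1,))],
    "N2":    [(("nom",), (1,)), (("prn",), (1,))],
    "SNGen": [(("SN", "genpost"), (2, 1))],
}
TERMINALS = {
    "sent": ["sent"], "v": ["vbser.*", "vblex.*"],
    "art": ["det.art.*", "num.sg"], "nom": ["n"], "prn": ["prn.*"],
    "genpost": ["post"],  # ASSUMPTION: missing from the grammar on this page
}

EXAMPLE_INPUT = (
    "^Hau<prn><dem><sg>/This<prn><dem><sg>$ ^irabazle<n>/winner<n><ND>$ "
    "^bat<num><sg>/a<det><ind><sg>$ ^en<post>/of<pr>$ ^historia<n>/story<n><ND>$ "
    "^a<det><art><sg>/the<det><def><sg>$ "
    "^izan<vbsint><pri><NR_HU>/be<vbser><pri><NR_HU>$ ^.<sent>/.<sent>$"
)


def reorder(stream, start="S"):
    """Parse a ^sl/tl$ stream with GRAMMAR and return the reordered ^tl$ stream, or None."""
    tokens = re.findall(r"\^(.*?)/(.*?)\$", stream)   # list of (sl, tl) pairs

    def tags(unit):
        return ".".join(re.findall(r"<([^>]+)>", unit))

    def is_terminal(sym, k):
        # ASSUMPTION: either side of the pair may match the pattern
        return any(re.fullmatch(pattern, tags(side))
                   for pattern in TERMINALS[sym] for side in tokens[k])

    @lru_cache(maxsize=None)
    def derive(sym, i, j):
        """Reordered target units if sym derives tokens[i:j], else None."""
        if sym in TERMINALS:
            return (tokens[i][1],) if j - i == 1 and is_terminal(sym, i) else None
        for rhs, order in GRAMMAR[sym]:
            for parts in split(rhs, i, j):            # take the first parse found
                return tuple(u for n in order for u in parts[n - 1])
        return None

    def split(rhs, i, j):
        """Yield child results for every way rhs can cover tokens[i:j]."""
        if len(rhs) == 1:
            result = derive(rhs[0], i, j)
            if result is not None:
                yield (result,)
            return
        # leave at least one token for each remaining symbol
        for k in range(i + 1, j - len(rhs) + 2):
            head = derive(rhs[0], i, k)
            if head is not None:
                for rest in split(rhs[1:], k, j):
                    yield (head,) + rest

    result = derive(start, 0, len(tokens))
    return None if result is None else " ".join("^%s$" % u for u in result)


print(reorder(EXAMPLE_INPUT))
```

Working over spans rather than top-down avoids looping on the left recursion SN -> N3 -> SNGen -> SN: every multi-symbol rule hands strictly shorter spans to its children, and the unary rules contain no cycle.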
Deliverable 3
- An XML format for the rules, based on the current format, taking into account transfer operations