Talk:Recursive transfer
Latest revision as of 16:33, 17 April 2014
testing transfer4
After checking out https://svn.code.sf.net/p/apertium/svn/branches/transfer4 you can run:
<pre>
cd eng-kaz
make  # might give some warnings

# before and after transfer:
cat input/input.02.txt
cat input/input.02.txt | ./parser

# /tmp/input.02.parse contains the parse, using parentheses
cat input/input.02.txt | ./parser -p 2>/tmp/input.02.parse

# visualise it with tex:
cat /tmp/input.02.parse | python3 ../brackets-parse.py -x >/tmp/input.02.tex
xelatex /tmp/input.02.tex

# or with graphviz:
cat /tmp/input.02.parse | python3 ../brackets-parse.py >/tmp/input.02.dot
dot -Tpdf /tmp/input.02.dot -o/tmp/input.02.pdf
</pre>
<pre>
<spectie> you can also generate a parser from an existing .t1x file
<spectie> ../AST/create-parser.y
<spectie> ../AST/create-parser.py
<spectie> but it only does 1 level, and also, t1x rule patterns aren't necessarily ideal [18:59]
<spectie> $ python3 ../AST/create-parser.py < ~/source/apertium/trunk/apertium-sme-nob/apertium-sme-nob.sme-nob.t1x
<spectie> because they overlap in ambiguous ways [19:00]
</pre>
Old stuff
Deliverable 1
- A program that reads a grammar using bison, parses a sentence, and outputs the syntax tree as text, Graphviz, or a similar format.
  - See: https://svn.code.sf.net/p/apertium/svn/branches/transfer4/format-parse.py
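As a rough illustration of the "syntax tree as text or Graphviz" idea, the sketch below turns a bracketed tree into a DOT graph. The bracket notation is an assumption: this page does not show what the parser actually prints, so a Lisp-style `(LABEL child ...)` form is used here, and the function names `parse_brackets` and `to_dot` are hypothetical, not taken from the branch.

```python
def parse_brackets(text):
    """Parse '(S (SN This) (SV be))' into nested (label, children) tuples.

    ASSUMPTION: Lisp-style brackets; the real parser output may differ.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] != "(":
            leaf = tokens[pos]
            pos += 1
            return (leaf, [])
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            children.append(node())
        pos += 1                      # skip ")"
        return (label, children)

    return node()


def to_dot(tree):
    """Render a (label, children) tree as Graphviz DOT text."""
    lines = ["digraph parse {"]
    counter = 0

    def walk(current):
        nonlocal counter
        my_id = counter
        counter += 1
        label, children = current
        escaped = label.replace('"', '\\"')
        lines.append(f'  n{my_id} [label="{escaped}"];')
        for child in children:
            child_id = walk(child)
            lines.append(f"  n{my_id} -> n{child_id};")
        return my_id

    walk(tree)
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot(parse_brackets("(S (SN This) (SV be))")))
```

The printed text can be saved to a file and rendered with `dot -Tpdf`, in the same way as the `.dot` file produced in the transfer4 commands.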
Deliverable 2
- A program that takes the output of lt-proc -b (biltrans) and applies a grammar, doing only reordering (and "insertion"/"deletion"), with no tag changes.
  - The input would be ^sl/tl$ and the output would be ^tl$.
  - The grammar can be specified in a simple text-based CFG formalism, which is converted into bison and compiled.
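Before a text grammar can be converted into bison, each rule line has to be read into some structure. The sketch below is one possible way to do that for lines written in the notation of the grammar on this page, such as `SN -> N3 art { $2 $1 } | N3 { $1 }`. The function name `parse_rule` and the returned shape are assumptions for illustration, and the code assumes that `|` never occurs inside a quoted tag pattern.

```python
import re

RULE = re.compile(r"^\s*(\S+)\s*->\s*(.+)$")
ALTERNATIVE = re.compile(r"\s*(.*?)\s*\{(.*?)\}\s*")


def parse_rule(line):
    """Split one rule line into (lhs, [(rhs_symbols, output_order), ...]).

    output_order lists the 1-based positions named by the $N references,
    so '{ $2 $1 }' becomes [2, 1].
    """
    rule = RULE.match(line)
    if rule is None:
        raise ValueError(f"not a rule: {line!r}")
    lhs, body = rule.groups()
    alternatives = []
    # ASSUMPTION: '|' only separates alternatives, never appears in a pattern.
    for chunk in body.split("|"):
        alt = ALTERNATIVE.fullmatch(chunk)
        if alt is None:
            raise ValueError(f"alternative without an action: {chunk!r}")
        rhs = alt.group(1).split()
        order = [int(ref.lstrip("$")) for ref in alt.group(2).split()]
        alternatives.append((rhs, order))
    return lhs, alternatives


if __name__ == "__main__":
    print(parse_rule("SN -> N3 art { $2 $1 } | N3 { $1 }"))
```

Quoted terminals such as `"vbser.*"` are kept verbatim, quotes included, so a later step can tell tag patterns apart from nonterminal names.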
Input:
<pre>
^Hau<prn><dem><sg>/This<prn><dem><sg>$
^irabazle<n>/winner<n><ND>$
^bat<num><sg>/a<det><ind><sg>$
^en<post>/of<pr>$
^historia<n>/story<n><ND>$
^a<det><art><sg>/the<det><def><sg>$
^izan<vbsint><pri><NR_HU>/be<vbser><pri><NR_HU>$
^.<sent>/.<sent>$
</pre>
Output:
<pre>
^This<prn><dem><sg>$
^be<vbser><pri><NR_HU>$
^the<det><def><sg>$
^story<n><ND>$
^of<pr>$
^a<det><ind><sg>$
^winner<n><ND>$
^.<sent>$
</pre>
Grammar:
<pre>
S -> SN SV sent { $1 $2 $3 }
SV -> SN v { $2 $1 }
SN -> N3 art { $2 $1 } | N3 { $1 }
N3 -> SNGen N2 { $2 $1 } | N2 { $1 }
N2 -> nom { $1 } | prn { $1 }
SNGen -> SN genpost { $2 $1 }
sent -> "sent" { $1 }
v -> "vbser.*" { $1 } | "vblex.*" { $1 }
art -> "det.art.*" { $1 } | "num.sg" { $1 }
nom -> "n" { $1 }
prn -> "prn.*" { $1 }
</pre>
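To check that the grammar really maps the example input to the example output, here is a small Python sketch that applies the rules with a chart (CYK-style) parser instead of bison. It is not the transfer4 implementation, and it rests on three assumptions that the page does not state: `genpost` has no terminal rule above, so `"post"` is supplied here; a pattern counts as matching when either the source-side or the target-side tags fit it (`"n"` fits only the source side, `"vbser.*"` only the target side); and a trailing `*` stands for any remaining tags while a pattern without it must match the tags exactly.

```python
import re
from itertools import combinations

# Rules from the grammar: lhs -> [(rhs symbols, 1-based output order)].
RULES = {
    "S":     [(["SN", "SV", "sent"], [1, 2, 3])],
    "SV":    [(["SN", "v"], [2, 1])],
    "SN":    [(["N3", "art"], [2, 1]), (["N3"], [1])],
    "N3":    [(["SNGen", "N2"], [2, 1]), (["N2"], [1])],
    "N2":    [(["nom"], [1]), (["prn"], [1])],
    "SNGen": [(["SN", "genpost"], [2, 1])],
}
TERMINALS = {
    "sent": ["sent"],
    "v": ["vbser.*", "vblex.*"],
    "art": ["det.art.*", "num.sg"],
    "nom": ["n"],
    "prn": ["prn.*"],
    "genpost": ["post"],  # ASSUMPTION: not defined in the grammar on the page
}
UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")  # one ^sl/tl$ lexical unit


def tags(side):
    return re.findall(r"<([^>]+)>", side)


def matches(pattern, tag_list):
    # ASSUMPTION: trailing '*' means "any remaining tags", otherwise exact.
    parts = pattern.split(".")
    if parts[-1] == "*":
        return tag_list[:len(parts) - 1] == parts[:-1]
    return tag_list == parts


def derive(chart, rhs, order, start, end):
    """Try every split of [start, end) into len(rhs) non-empty pieces."""
    for cuts in combinations(range(start + 1, end), len(rhs) - 1):
        bounds = (start, *cuts, end)
        parts = [chart.get((sym, bounds[k], bounds[k + 1]))
                 for k, sym in enumerate(rhs)]
        if all(part is not None for part in parts):
            return [tl for pos in order for tl in parts[pos - 1]]
    return None


def reorder(stream):
    units = UNIT.findall(stream)
    n = len(units)
    chart = {}  # (symbol, start, end) -> reordered target-side units
    for i, (sl, tl) in enumerate(units):
        for name, patterns in TERMINALS.items():
            # ASSUMPTION: a pattern may match either side of the unit.
            if any(matches(p, tags(sl)) or matches(p, tags(tl)) for p in patterns):
                chart[name, i, i + 1] = [tl]
    for length in range(1, n + 1):
        for start in range(n - length + 1):
            end = start + length
            changed = True
            while changed:  # repeat so unit chains like SN -> N3 -> N2 resolve
                changed = False
                for lhs, alternatives in RULES.items():
                    if (lhs, start, end) in chart:
                        continue
                    for rhs, order in alternatives:
                        out = derive(chart, rhs, order, start, end)
                        if out is not None:
                            chart[lhs, start, end] = out
                            changed = True
                            break
    result = chart.get(("S", 0, n))
    return None if result is None else ["^" + tl + "$" for tl in result]


STREAM = """^Hau<prn><dem><sg>/This<prn><dem><sg>$ ^irabazle<n>/winner<n><ND>$
^bat<num><sg>/a<det><ind><sg>$ ^en<post>/of<pr>$ ^historia<n>/story<n><ND>$
^a<det><art><sg>/the<det><def><sg>$ ^izan<vbsint><pri><NR_HU>/be<vbser><pri><NR_HU>$
^.<sent>/.<sent>$"""

if __name__ == "__main__":
    for unit in reorder(STREAM):
        print(unit)
```

A chart parser is used because the grammar is left-recursive (SN -> N3 -> SNGen -> SN), which a naive recursive-descent parser cannot handle; bison's LR parsing does not have that problem. The sketch keeps only the first derivation it finds for each span, which is enough for this unambiguous example.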
Deliverable 3
- An XML format for the rules, based on the current format, taking into account transfer operations