Difference between revisions of "Talk:Recursive transfer"

From Apertium
(testing transfer4)
 
 
==Old stuff==

===Deliverable 1===

* A program which reads a grammar using bison, parses a sentence and outputs the syntax tree as text, Graphviz or similar.
** See: https://svn.code.sf.net/p/apertium/svn/branches/transfer4/format-parse.py

===Deliverable 2===

* Program which takes output of lt-proc -b (biltrans) and applies a grammar, doing only reordering (and "insertion"/"deletion"), no tag changes
** The input would be ^sl/tl$ and the output would be ^tl$
** The grammar can be specified using a simple text-based CFG grammar formalism, converted into bison and compiled.

;Input:
<pre>
^Hau<prn><dem><sg>/This<prn><dem><sg>$
^irabazle<n>/winner<n><ND>$
^bat<num><sg>/a<det><ind><sg>$
^en<post>/of<pr>$
^historia<n>/story<n><ND>$
^a<det><art><sg>/the<det><def><sg>$
^izan<vbsint><pri><NR_HU>/be<vbser><pri><NR_HU>$
^.<sent>/.<sent>$
</pre>

;Output:
<pre>
^This<prn><dem><sg>$
^be<vbser><pri><NR_HU>$
^the<det><def><sg>$
^story<n><ND>$
^of<pr>$
^a<det><ind><sg>$
^winner<n><ND>$
^.<sent>$
</pre>
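
The ^sl/tl$ to ^tl$ part of the format can be sketched on its own, without any reordering. This is only an illustration in Python, not code from the transfer4 branch. The name <code>target_only</code> is made up here, and the sketch assumes one translation per unit and no escaped "/", "^" or "$" inside a unit.

```python
import re

# One lexical unit of biltrans output: ^source/target$
# Assumption: a single translation per unit and no escaped "/", "^" or "$".
UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def target_only(stream):
    """Rewrite every ^sl/tl$ unit as ^tl$, leaving any other text untouched."""
    return UNIT.sub(lambda m: "^" + m.group(2) + "$", stream)


print(target_only("^Hau<prn><dem><sg>/This<prn><dem><sg>$"))
# prints: ^This<prn><dem><sg>$
```

The units come out in their original order; the reordering is what the grammar is for.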

;Grammar

<pre>
S -> SN SV sent { $1 $2 $3 }
SV -> SN v { $2 $1 }
SN -> N3 art { $2 $1 } | N3 { $1 }
N3 -> SNGen N2 { $2 $1 } | N2 { $1 }
N2 -> nom { $1 } | prn { $1 }
SNGen -> SN genpost { $2 $1 }
sent -> "sent" { $1 }
v -> "vbser.*" { $1 } | "vblex.*" { $1 }
art -> "det.art.*" { $1 } | "num.sg" { $1 }
nom -> "n" { $1 }
prn -> "prn.*" { $1 }
</pre>
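
To check that the grammar really gives the output above, here is a rough Python sketch that applies it to the example input. It is not the bison-based implementation: it uses a small memoised span search instead of a generated parser. Several things are assumptions: the quoted patterns are treated as regexes matched against the source-side tags joined with ".", <code>genpost</code> (not defined in the grammar) is taken to be <code>"post"</code>, and <code>"vbsint.*"</code> is added to <code>v</code> so that izan matches. All function names are made up.

```python
import re
from functools import lru_cache

# Rules from the grammar above: (right-hand side, output order).
# Output order is 0-based, so { $2 $1 } becomes (1, 0).
GRAMMAR = {
    "S": [(("SN", "SV", "sent"), (0, 1, 2))],
    "SV": [(("SN", "v"), (1, 0))],
    "SN": [(("N3", "art"), (1, 0)), (("N3",), (0,))],
    "N3": [(("SNGen", "N2"), (1, 0)), (("N2",), (0,))],
    "N2": [(("nom",), (0,)), (("prn",), (0,))],
    "SNGen": [(("SN", "genpost"), (1, 0))],
}

# Terminal categories, matched against source-side tags joined with ".".
TERMINALS = {
    "sent": ["sent"],
    "v": ["vbser.*", "vblex.*", "vbsint.*"],  # "vbsint.*" is an assumption
    "art": ["det.art.*", "num.sg"],
    "nom": ["n"],
    "prn": ["prn.*"],
    "genpost": ["post"],  # assumption: genpost is not defined in the grammar
}

UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")
TAG = re.compile(r"<([^>]*)>")


def read_units(text):
    """Return (source tags joined with ".", target lexical unit) pairs."""
    return [(".".join(TAG.findall(sl)), tl) for sl, tl in UNIT.findall(text)]


def splits(i, j, n):
    """Yield every way to cut the span [i, j) into n non-empty parts."""
    if n == 1:
        if i < j:
            yield ((i, j),)
        return
    for k in range(i + 1, j - n + 2):
        for rest in splits(k, j, n - 1):
            yield ((i, k),) + rest


def transfer(text):
    """Parse the whole input as S and return the reordered ^tl$ units."""
    units = read_units(text)

    @lru_cache(maxsize=None)
    def parse(sym, i, j):
        """Reordered target units for sym over [i, j), or None if no parse."""
        if sym in TERMINALS:
            tags = units[i][0]
            if j - i == 1 and any(re.fullmatch(p, tags) for p in TERMINALS[sym]):
                return (units[i][1],)
            return None
        for rhs, order in GRAMMAR[sym]:
            for parts in splits(i, j, len(rhs)):
                kids = [parse(s, a, b) for s, (a, b) in zip(rhs, parts)]
                if all(kid is not None for kid in kids):
                    # Emit the children in the order given by the rule action.
                    return tuple(u for idx in order for u in kids[idx])
        return None

    result = parse("S", 0, len(units))
    if result is None:
        raise ValueError("input does not parse as S")
    return "\n".join("^" + u + "$" for u in result)
```

The sketch returns the first parse it finds, so it says nothing about how ambiguity should be resolved. A bison-generated parser would handle the left recursion (SN, N3, SNGen) directly; here it is only safe because every symbol must cover at least one unit.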

===Deliverable 3===

* An XML format for the rules, based on the current format, taking into account transfer operations

Latest revision as of 16:33, 17 April 2014

==testing transfer4==

In https://svn.code.sf.net/p/apertium/svn/branches/transfer4 you can do:

<pre>
cd eng-kaz
make
# might give some warnings

# before and after transfer:
cat input/input.02.txt
cat input/input.02.txt | ./parser

# /tmp/input.02.parse contains the parse, using parentheses
cat input/input.02.txt | ./parser -p 2>/tmp/input.02.parse

# visualise it with tex:
cat /tmp/input.02.parse | python3 ../brackets-parse.py -x >/tmp/input.02.tex
xelatex /tmp/input.02.tex

# or with graphviz:
cat /tmp/input.02.parse | python3 ../brackets-parse.py >/tmp/input.02.dot
dot -Tpdf /tmp/input.02.dot -o/tmp/input.02.pdf

<spectie> you can also generate a parser from an existing .t1x file
<spectie> ../AST/create-parser.y
<spectie> ../AST/create-parser.py
<spectie> but it only does 1 level, and also, t1x rule patterns aren't necessarily ideal  [18:59]
<spectie> $ python3 ../AST/create-parser.py < ~/source/apertium/trunk/apertium-sme-nob/apertium-sme-nob.sme-nob.t1x
<spectie> because they overlap in ambiguous ways  [19:00]
</pre>
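
For the graphviz step, here is a rough Python sketch of what a brackets-to-dot conversion could look like. It is not the actual brackets-parse.py. It assumes the parse is written as nested "(label child child ...)" groups with whitespace-separated leaves, which this page does not confirm. The names <code>read_tree</code> and <code>to_dot</code> are made up.

```python
import re


def read_tree(text):
    """Parse "(label child ...)" into (label, children) tuples; leaves are strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] != "(":
            leaf = tokens[pos]
            pos += 1
            return leaf
        pos += 1  # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            children.append(node())
        pos += 1  # skip ")"
        return (label, children)

    return node()


def to_dot(tree):
    """Return Graphviz DOT text with one numbered node per tree node."""
    lines = ["digraph parse {"]
    counter = 0

    def walk(t):
        nonlocal counter
        my_id = counter
        counter += 1
        label = t if isinstance(t, str) else t[0]
        lines.append('  n%d [label="%s"];' % (my_id, label.replace('"', '\\"')))
        if not isinstance(t, str):
            for child in t[1]:
                child_id = walk(child)
                lines.append("  n%d -> n%d;" % (my_id, child_id))
        return my_id

    walk(tree)
    lines.append("}")
    return "\n".join(lines)
```

Something like <code>to_dot(read_tree("(S (SN This) (v be))"))</code> gives text that <code>dot -Tpdf</code> accepts.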
