Machine translation with Constraint Grammar

From Apertium
Latest revision as of 11:35, 26 August 2011

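Constraint Grammar is flexible enough to let you shoot yourself in the foot. This page uses it for every stage of a toy machine translation pipeline, translating one Faroese sentence into English.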

Input

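The input is a stream in standard CG format with dependency relations, written as #n->m tags where n is the word's position and m is the position of its head (0 marks the root). The same approach also works with cg-proc and the Apertium stream format.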

"<Í>"
        "í" Pr @ADVL→ #1->3
"<upphavi>"
        "upphav" N Neu Sg Dat Indef @P← #2->1
"<skapti>"
        "skapa" V Ind Prt Sg @VMAIN #3->0
"<Gud>"
        "gudur" N Msc Sg Nom Indef @←SUBJ #4->3
"<himmal>"
        "himmal" N Msc Sg Acc Indef @←OBJ #5->3
"<og>"
        "og" CC @CC #6->5
"<jørð>"
        "jørð" N Fem Sg Acc Indef @←OBJ #7->5
"<.>"
        "." CLB #8->0
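
Before running any grammar it can help to check that the dependency arcs in the input are what you expect. The following is a minimal sketch, not part of the original page: it saves the example sentence to a temporary file and uses awk to print each wordform next to its #n->m arc. The function name list_deps and the temporary file are assumptions made for illustration.

```shell
# Save the example sentence to a temporary file (the page itself uses /tmp/in).
in_file=$(mktemp)
cat > "$in_file" <<'EOF'
"<Í>"
        "í" Pr @ADVL→ #1->3
"<upphavi>"
        "upphav" N Neu Sg Dat Indef @P← #2->1
"<skapti>"
        "skapa" V Ind Prt Sg @VMAIN #3->0
"<Gud>"
        "gudur" N Msc Sg Nom Indef @←SUBJ #4->3
"<himmal>"
        "himmal" N Msc Sg Acc Indef @←OBJ #5->3
"<og>"
        "og" CC @CC #6->5
"<jørð>"
        "jørð" N Fem Sg Acc Indef @←OBJ #7->5
"<.>"
        "." CLB #8->0
EOF

# list_deps (hypothetical helper): print "wordform self->head" for every reading.
list_deps() {
  awk '
    /^"</ { form = $0; next }      # cohort line: remember the wordform
    /^[ \t]+"/ {                   # reading line: look for the dependency tag
      for (i = 1; i <= NF; i++)
        if ($i ~ /^#[0-9]+->[0-9]+$/) print form, substr($i, 2)
    }
  ' "$1"
}

list_deps "$in_file"
```

This prints one line per cohort, for example `"<skapti>" 3->0` for the main verb, which makes it easy to see that the subject and first object both attach to position 3.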

Grammars

Lexical

You can use some other system for lexical transfer (e.g. an Apertium bilingual dictionary), or you can do it directly in CG.

$ cat /tmp/lexical_transfer.cg 
SECTION
SUBSTITUTE ("í") ("in") ("í");
SUBSTITUTE ("upphav") ("beginning") ("upphav");
SUBSTITUTE ("himmal") ("heaven") ("himmal");
SUBSTITUTE ("og") ("and") ("og");
SUBSTITUTE ("jørð") ("earth") ("jørð");
SUBSTITUTE ("skapa") ("create") ("skapa");
SUBSTITUTE ("gudur") ("god") ("gudur");
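
Since every rule above has the same shape, the grammar can be generated from a word list instead of being written by hand. The following is a rough sketch under that assumption: gen_lexical_rules is a hypothetical helper, and the input format (one "source target" pair per line, single-word lemmas only) is an illustration rather than an Apertium dictionary format.

```shell
# gen_lexical_rules (hypothetical helper): read "source target" lemma pairs
# from stdin and print one SUBSTITUTE rule per pair, in the style used above.
gen_lexical_rules() {
  echo 'SECTION'
  while read -r src tgt; do
    # Skip blank or incomplete lines.
    [ -n "$src" ] && [ -n "$tgt" ] || continue
    printf 'SUBSTITUTE ("%s") ("%s") ("%s");\n' "$src" "$tgt" "$src"
  done
}

# Example: three of the lemma pairs from this page.
printf '%s\n' 'í in' 'upphav beginning' 'skapa create' | gen_lexical_rules
```

Redirecting the output to a file such as /tmp/lexical_transfer.cg would give a grammar equivalent to the hand-written one for the listed pairs.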

Movement

Here we move a subject that stands to the right of its main verb to the position directly before the verb (V2 → SVO), then relabel it from @←SUBJ to @SUBJ→.

$ cat /tmp/movement.cg 

SECTION
MOVE WITHCHILD (*) (@←SUBJ) BEFORE (-1* (@VMAIN)) ;
SUBSTITUTE (@←SUBJ) (@SUBJ→) (@←SUBJ) (1 (@VMAIN)) ;

Generation

In this step we mark "beginning" as definite and then insert the definite article before every definite noun.

$ cat /tmp/generate.cg

SECTION
SUBSTITUTE (Indef) (Def) ("beginning") ;
ADDCOHORT ("<the>" "the" Det Def Sg) BEFORE (N Def) ;

Morphological transfer

We remove the tags that the English side does not need: gender, case, and the Indef marker.

$ cat /tmp/morphtrans.cg 

SECTION
SUBSTITUTE (Neu) (*) (Neu);
SUBSTITUTE (Fem) (*) (Fem);
SUBSTITUTE (Msc) (*) (Msc);
SUBSTITUTE (Nom) (*) (Nom);
SUBSTITUTE (Dat) (*) (Dat);
SUBSTITUTE (Acc) (*) (Acc);
SUBSTITUTE (Indef) (*) (Indef);

Or, more compactly, with a single list:

$ cat /tmp/morphtrans.cg 

SECTION
LIST ToKill = Neu Fem Msc Nom Dat Acc Indef ;
SUBSTITUTE ToKill (*) $$ToKill ;
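
Whichever variant is used, it is easy to check afterwards that none of the removed tags survive in a stream. The following is a small sketch rather than anything from the original page: check_no_tags is a hypothetical helper that treats a tag as a whitespace-delimited token, and the demo file holds one reading copied from the final output.

```shell
# check_no_tags (hypothetical helper): report tags that are still present.
# Usage: check_no_tags FILE TAG...
check_no_tags() {
  file="$1"; shift
  rc=0
  for tag in "$@"; do
    # Match the tag only as a whole token, so "Def" does not match "Indef".
    if grep -Eq "[[:space:]]${tag}([[:space:]]|\$)" "$file"; then
      echo "leftover tag: $tag"
      rc=1
    fi
  done
  return "$rc"
}

# Tiny demo on one cohort taken from the final output of this page.
demo=$(mktemp)
printf '%s\n' '"<upphavi>"' '        "beginning" N Sg Def @P← #3->1' > "$demo"
check_no_tags "$demo" Neu Fem Msc Nom Dat Acc Indef && echo "clean"
```

The function returns a non-zero status when any listed tag is found, so it can be used as a guard at the end of a pipeline.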

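Output

Finally, run the whole pipeline, feeding the output of each grammar into the next. The order matters: generation relies on the lemma "beginning" produced by lexical transfer, and morphological transfer must run last because it deletes the Indef tag that generation rewrites.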

$ cat /tmp/in | vislcg3 --grammar /tmp/movement.cg | vislcg3 --grammar /tmp/lexical_transfer.cg | vislcg3 --grammar /tmp/generate.cg | vislcg3 --grammar /tmp/morphtrans.cg 
"<Í>"
	"in" Pr @ADVL→ #1->5
"<the>"
	"the" Det Def Sg #2->2
"<upphavi>"
	"beginning" N Sg Def @P← #3->1
"<Gud>"
	"god" N Sg @SUBJ→ #4->5
"<skapti>"
	"create" V Ind Prt Sg @VMAIN #5->0
"<himmal>"
	"heaven" N Sg @←OBJ #6->5
"<og>"
	"and" CC @CC #7->6
"<jørð>"
	"earth" N Sg @←OBJ #8->6
"<.>"
	"." CLB #9->0
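
A long chain of pipes gets hard to read and to debug as grammars are added. The following is a minimal sketch of a wrapper that applies the grammars one at a time and stops at the first failure. run_pipeline is a hypothetical helper; the only vislcg3 option it relies on is --grammar, as used in the command above.

```shell
# run_pipeline (hypothetical helper): apply CG-3 grammars in the order given.
# Usage: run_pipeline INPUT GRAMMAR...
run_pipeline() {
  cur=$(mktemp) || return 1
  nxt=$(mktemp) || return 1
  cp "$1" "$cur" || return 1
  shift
  for grammar in "$@"; do
    # Same invocation as on this page: vislcg3 --grammar FILE
    if ! vislcg3 --grammar "$grammar" < "$cur" > "$nxt"; then
      echo "vislcg3 failed on $grammar" >&2
      rm -f "$cur" "$nxt"
      return 1
    fi
    # The output of this stage becomes the input of the next one.
    mv "$nxt" "$cur"
  done
  cat "$cur"
  rm -f "$cur" "$nxt"
}
```

With the files from this page, the call would be `run_pipeline /tmp/in /tmp/movement.cg /tmp/lexical_transfer.cg /tmp/generate.cg /tmp/morphtrans.cg`. Keeping each stage's output in a file also makes it simple to inspect an intermediate result when a rule does not fire as expected.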