Difference between revisions of "Beginner's Constraint Grammar HOWTO"
Revision as of 22:19, 1 December 2010
Download
- Apertium
How to download Apertium on Ubuntu.
First we have to install the prerequisites.
- Open a terminal and copy/paste this code:
sudo apt-get install subversion build-essential g++ pkg-config libxml2 \
libxml2-dev libxml2-utils xsltproc flex automake autoconf libtool libpcre3-dev
- The terminal will then ask for your password, like this: [sudo] password for user. Type it and press Enter.
If you already have the prerequisites, it will show you X upgraded, X newly installed, X to remove and X not upgraded. If you don't have them, wait until the terminal shows user@ubuntu:~$ again. This means the process (downloading and installing the prerequisites) has finished and the terminal is waiting for your next step, which is to copy/paste this code:
svn co http://apertium.svn.sourceforge.net/svnroot/apertium/trunk apertium
This will download Apertium from SVN. The process will take a few minutes. When the download finishes, we are ready to install Apertium.
- Constraint grammar
To use CG we must have lttoolbox (we have it), apertium (we have it too) and ICU (we have to install it now).
How to install ICU on Ubuntu. Open a terminal and copy/paste this code:
sudo apt-get install libicu-dev
Now we can install Apertium and CG.
Install
- Apertium
Before installing Apertium we have to install lttoolbox (which was downloaded together with Apertium). To do that, copy/paste this code:
cd apertium
cd lttoolbox/
PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ./autogen.sh
make
sudo make install
sudo ldconfig
The terminal will ask for your password again: [sudo] password for user. Type it and press Enter.
Wait until the terminal shows user@ubuntu:~/apertium/lttoolbox$ again, then copy/paste this code:
cd ..
cd apertium/
PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ./autogen.sh
make
sudo make install
sudo ldconfig
This will start installing Apertium. You have to wait a few minutes. When the terminal shows
vasil@ubuntu:~/apertium/apertium$ sudo ldconfig
vasil@ubuntu:~/apertium/apertium$
the process is finished.
- Constraint grammar
How to install CG. Open a terminal and copy/paste this code:
$ svn co --username anonymous --password anonymous http://beta.visl.sdu.dk/svn/visl/tools/vislcg3/trunk vislcg3
$ cd vislcg3
$ sh autogen.sh --prefix=<prefix>
$ make
$ make install
If you run the last command with sudo (needed when <prefix> is a system directory), it will ask for your password: [sudo] password for user. Type it and press Enter.
We are ready.
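Before moving on, it can help to confirm that the tools used in the rest of this HOWTO can actually be found. The following is only a minimal sketch in Python: the helper name missing_tools is made up for illustration, and it assumes the binaries were installed into a directory on your PATH (if you chose a custom <prefix>, they may live elsewhere).

```python
import shutil

def missing_tools(names=("lt-proc", "cg-comp", "cg-proc")):
    """Return the tool names that cannot be found on PATH.

    missing_tools is a hypothetical helper, not part of Apertium or CG.
    """
    return [name for name in names if shutil.which(name) is None]

# An empty list means every tool was found.
print(missing_tools())
```

If the list is not empty, the named tools are either not installed or not on PATH; in that case you can still run them with an explicit path.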
Usage
Let's try out what we installed. First copy/paste this code:
$ echo "vino a la playa" | lt-proc es-ca.automorf.bin
^vino/vino<n><m><sg>/venir<vblex><ifi><p3><sg>$ ^a/a<pr>$ ^la/el<det><def><f><sg>/lo<prn><pro><p3><f><sg>$ ^playa/playa<n><f><sg>$
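To make the format of this output concrete: each lexical unit sits between ^ and $, and its surface form and readings are separated by /. The Python sketch below splits the output accordingly and lists the units with more than one reading. It is only an illustration; the function name parse_stream is an assumption, and the sketch ignores escaped characters that can occur in real streams.

```python
import re

def parse_stream(text):
    """Split stream output into (surface form, [readings]) pairs.

    Hypothetical helper: assumes no escaped '/', '^' or '$' in the input.
    """
    units = []
    for body in re.findall(r"\^(.*?)\$", text):
        surface, *readings = body.split("/")
        units.append((surface, readings))
    return units

output = ("^vino/vino<n><m><sg>/venir<vblex><ifi><p3><sg>$ ^a/a<pr>$ "
          "^la/el<det><def><f><sg>/lo<prn><pro><p3><f><sg>$ "
          "^playa/playa<n><f><sg>$")

units = parse_stream(output)
# Units with more than one reading are the ambiguous ones.
ambiguous = [surface for surface, readings in units if len(readings) > 1]
print(ambiguous)  # ['vino', 'la']
```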
Here we have two ambiguities: one between a noun and a verb, and another between a determiner and a pronoun. We can write rules that choose between the competing readings. First we define our categories; these can be tags, wordforms or lemmas. It might help to think of them as "coarse tags", which may cover a set of fine tags or lemmas. So, create a file grammar.txt and add the following text:
DELIMITERS = "<$.>" ;
LIST NOUN = n;
LIST VERB = vblex;
LIST DET = det;
LIST PRN = prn;
LIST PREP = pr;
SECTION
The first rule states: "When the current lexical unit is ambiguous between a pronoun and a determiner, and it is followed on the right by a lexical unit which could be a noun, choose the determiner". We have to add this rule to the file and compile it using cg-comp:
The rule:
# 1
SELECT DET IF
(0 DET)
(0 PRN)
(1 NOUN) ;
Compiling:
$ ./cg-comp grammar.txt grammar.bin
Sections: 1, Rules: 1, Sets: 6, Tags: 7
To test what we have done, copy/paste this code:
$ echo "vino a la playa" | lt-proc es-ca.automorf.bin | cg-proc grammar.bin
^vino/vino<n><m><sg>/venir<vblex><ifi><p3><sg>$ ^a/a<pr>$ ^la/el<det><def><f><sg>$ ^playa/playa<n><f><sg>$
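To see why only "la" changed, the logic of the first rule can be imitated in a few lines of Python. This is a rough sketch under simplifying assumptions, not how cg-proc is implemented: the helper names (parse_stream, has_tag, select_det, render) are invented here, and a tag is matched by simple substring search.

```python
import re

def parse_stream(text):
    """Split stream output into [surface form, [readings]] pairs."""
    units = []
    for body in re.findall(r"\^(.*?)\$", text):
        surface, *readings = body.split("/")
        units.append([surface, readings])
    return units

def has_tag(readings, tag):
    """True if any reading carries the tag, e.g. 'det' matches '<det>'."""
    return any(f"<{tag}>" in r for r in readings)

def select_det(units):
    """Imitate: SELECT DET IF (0 DET) (0 PRN) (1 NOUN)."""
    for i in range(len(units) - 1):
        here, nxt = units[i][1], units[i + 1][1]
        if has_tag(here, "det") and has_tag(here, "prn") and has_tag(nxt, "n"):
            # Keep only the determiner readings of the current unit.
            units[i][1] = [r for r in here if "<det>" in r]
    return units

def render(units):
    """Turn the units back into stream format."""
    return " ".join("^" + "/".join([s] + rs) + "$" for s, rs in units)

before = ("^vino/vino<n><m><sg>/venir<vblex><ifi><p3><sg>$ ^a/a<pr>$ "
          "^la/el<det><def><f><sg>/lo<prn><pro><p3><f><sg>$ "
          "^playa/playa<n><f><sg>$")
after = render(select_det(parse_stream(before)))
print(after)
```

The printed line keeps both readings of "vino", because that unit has no determiner or pronoun reading, and reduces "la" to its determiner reading.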
The second rule states: "When the current lexical unit is ambiguous between a noun and a verb, and the next two units to the right are a preposition and a determiner, remove the noun reading." Now we have to add this rule:
The rule:
# 2
REMOVE NOUN IF
(0 NOUN)
(0 VERB)
(1 PREP)
(2 DET) ;
Re-compile the grammar and test:
$ echo "vino a la playa" | lt-proc es-ca.automorf.bin | cg-proc grammar.bin
^vino/venir<vblex><ifi><p3><sg>$ ^a/a<pr>$ ^la/el<det><def><f><sg>$ ^playa/playa<n><f><sg>$
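The effect of the second rule can be imitated in the same spirit. The Python sketch below starts from the output of the first rule and applies a simplified version of the REMOVE logic. The helper names are assumptions made for illustration, and the real CG engine handles many cases this sketch does not.

```python
import re

def parse_stream(text):
    """Split stream output into [surface form, [readings]] pairs."""
    units = []
    for body in re.findall(r"\^(.*?)\$", text):
        surface, *readings = body.split("/")
        units.append([surface, readings])
    return units

def has_tag(readings, tag):
    """True if any reading carries the tag, e.g. 'pr' matches '<pr>'."""
    return any(f"<{tag}>" in r for r in readings)

def remove_noun(units):
    """Imitate: REMOVE NOUN IF (0 NOUN) (0 VERB) (1 PREP) (2 DET)."""
    for i in range(len(units) - 2):
        here = units[i][1]
        if (has_tag(here, "n") and has_tag(here, "vblex")
                and has_tag(units[i + 1][1], "pr")
                and has_tag(units[i + 2][1], "det")):
            # Drop noun readings; the verb reading always remains here.
            units[i][1] = [r for r in here if "<n>" not in r]
    return units

def render(units):
    """Turn the units back into stream format."""
    return " ".join("^" + "/".join([s] + rs) + "$" for s, rs in units)

# Output of the first rule, used as input for the second.
before = ("^vino/vino<n><m><sg>/venir<vblex><ifi><p3><sg>$ ^a/a<pr>$ "
          "^la/el<det><def><f><sg>$ ^playa/playa<n><f><sg>$")
after = render(remove_noun(parse_stream(before)))
print(after)
```

Only "vino" satisfies all four conditions, so its noun reading is dropped and every unit is left with a single reading.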