Apertium has moved from SourceForge to GitHub.
If you have any questions, please come and talk to us on #apertium on irc.freenode.net or contact the GitHub migration team.

Become a language pair developer for Apertium

From Apertium
(Difference between revisions)
Line 1: Line 1:
{{TOCD}}This is a 3-part, step-by-step guide on how to use a development version of Apertium to make a change in a language pair. These instructions assume that you are using Ubuntu or Debian; if you are not, please see the [[Installation]] page for instructions for other operating systems such as Mac OS X or Windows.
+
=== Step 1: Adding to the First Dictionary ===
   
== Intro ==
+
When adding an entry, you have to enter three things: the lemma (the word as you would find it in a dictionary); the prefix of the word that is common to all inflected forms, which goes between <nowiki><i></nowiki> and <nowiki></i></nowiki>; and the inflection paradigm of the word, which goes in the <par> element. All entries have the following basic structure:
 
There are two ways to get Apertium. You can use the Terminal to get the most up-to-date versions, or the Synaptic Package Manager to get development versions that are not quite as up to date. Each has pros and cons: the Terminal method is better suited to those who intend to submit their work, while the package manager is normally easier and gives you a graphical interface instead of a command line.
 
 
'''YOU DON'T NEED TO KNOW A PROGRAMMING LANGUAGE TO DEVELOP FOR APERTIUM. All development for adding new words is done with a text editor.'''
 
 
== Getting Ready ==
 
 
=== Method 1: TERMINAL ===
 
 
==== Step 1: Get the Prerequisites ====
 
 
A development version of Apertium and the language pair you want to change have to be installed on your computer before you can change anything in the language pair.
 
 
Start by opening a new Terminal.
 
 
Then, use this command to install the prerequisites:
 
 
<pre>
sudo apt-get install subversion build-essential g++ pkg-config gawk libxml2 \
  libxml2-dev libxml2-utils xsltproc flex automake autoconf libtool libpcre3-dev
</pre>
+
<pre>
<e lm="(lemma)">
  <i>(prefix)</i>
  <par n="(paradigm)"/>
</e>
</pre>
   
The Terminal will then ask you for your password to begin.
 
   
''Note: Type your password carefully. For security reasons, the Terminal does not display the characters you enter.''
+
Start by opening your first language's dictionary file. For example: apertium-en-es.es.dix (an XML file).
   
After you have entered your password, press the "Enter" key and wait for your computer to download and install the packages.
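Once the installation has finished, you may want to confirm that the tools are actually available. The following is a minimal sketch, not part of the original instructions: the helper name <code>missing_tools</code> is hypothetical, and the command names are assumed to be the ones the packages above provide (for example, <code>svn</code> from subversion and <code>xmllint</code> from libxml2-utils).

```shell
# Print the name of every command in the argument list that is not on PATH.
# (missing_tools is a hypothetical helper written for this sketch.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Command names assumed to be provided by the packages installed above.
missing_tools svn g++ make pkg-config gawk xmllint xsltproc flex automake autoconf
```

If the last line prints nothing, every listed command was found; any name it prints is still missing.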
+
Then, create a new entry with the basic structure.
 
==== Step 2: Get Apertium, lttoolbox, & Your Language Pair(s) ====
 
 
Using the same Terminal, you can download the entire language pairs tree from SVN using the command:
 
<pre>
 
svn co https://apertium.svn.sourceforge.net/svnroot/apertium
 
</pre>
 
Keep in mind that the full tree is over 4 GB. If you have a slow connection, limited disk space, or a data transfer cap, downloading the whole tree is not recommended. If you only want to work on a small number of language pairs, you can download them individually with a command such as:
 
<pre>
 
svn co https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/<modulename>
 
</pre>
 
''Note: This command only downloads one module at a time. For a more complete set of instructions on effectively using SVN, see the [[Using SVN]] page.''
 
 
 
Replace <modulename> with the name of the module that you want to use, such as apertium-en-es for the Spanish/English module.
 
 
These next commands download Apertium, lttoolbox, and the language pair that you want to use.
 
<pre>
 
svn co http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/lttoolbox
 
svn co http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium
 
svn co https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/<modulename>
 
</pre>
 
 
 
For example, if you wanted to get Apertium, lttoolbox, and the Spanish/English module you could enter:
 
<pre>
 
svn co http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/lttoolbox
 
svn co http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium
 
svn co https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-en-es
 
</pre>
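The three checkouts above differ only in the module name, so they can be generated in a loop. This is a sketch only: the helper name <code>checkout_commands</code> is hypothetical, it uses <code>https</code> for all three modules (an assumption; the commands above mix <code>http</code> and <code>https</code>), and it prints the commands instead of running them.

```shell
# Sketch: print (not run) the checkout commands for the two core modules
# plus one language pair. The base URL is the SourceForge trunk quoted above.
base="https://apertium.svn.sourceforge.net/svnroot/apertium/trunk"

checkout_commands() {
  # $1: language-pair module name, e.g. apertium-en-es
  for module in lttoolbox apertium "$1"; do
    echo "svn co $base/$module"
  done
}

checkout_commands apertium-en-es
```

After reviewing the printed commands, you can copy them into the Terminal one at a time.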
 
 
''Note: You can find a full list of modules at [https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/ https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/].''
 
 
==== Final Step: Compilation & Installation ====
 
 
First, you need to compile and install lttoolbox, Apertium, and your language pair, in that order. Run these commands from the directory that contains your checkouts. For lttoolbox, use:

<pre>
cd lttoolbox/
PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ./autogen.sh
make
sudo make install
sudo ldconfig
</pre>

Then, for Apertium:

<pre>
cd ..
cd apertium/
PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ./autogen.sh
make
sudo make install
sudo ldconfig
</pre>

And finally, for your language pair, replacing <modulename> with the name of your module:

<pre>
cd ..
cd <modulename>/
PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ./autogen.sh
make
sudo make install
</pre>
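Since the three builds follow the same pattern, they can be wrapped in one function. The sketch below is an illustration under stated assumptions, not the project's official build script: <code>build_module</code> is a hypothetical name, and it runs <code>sudo ldconfig</code> after every module, whereas the steps above only do so for lttoolbox and Apertium.

```shell
# Sketch: configure, compile, and install one checkout.
# The subshell keeps the caller's working directory unchanged, and the
# && chain stops at the first step that fails.
build_module() {
  # $1: directory of the checkout, e.g. lttoolbox
  (
    cd "$1" &&
    PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ./autogen.sh &&
    make &&
    sudo make install &&
    sudo ldconfig
  )
}

# Usage, from the directory that contains the checkouts:
#   for m in lttoolbox apertium apertium-en-es; do build_module "$m" || break; done
```

Building in this order matters because Apertium depends on lttoolbox, and the language pair depends on both.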
 
 
For further instructions, see [[Apertium on Ubuntu]].
 
 
=== Method 2: PACKAGE MANAGER ===
 
 
Using the Synaptic Package Manager to download and install Apertium, lttoolbox, and your language pair is considerably easier than the Terminal method. However, your choice of language pairs is limited, you may be unable to commit changes, and there may be other flaws or minor bugs.
 
 
==== Step 1: Find your Packages ====
 
 
To begin, find and open the Synaptic Package Manager.

Then type "apertium" into the search box.
 
 
Synaptic should bring up a list of everything related to Apertium. This list should include language pairs, the development versions of lttoolbox and libapertium, as well as the Apertium base package and various others.
 
 
==== Final Step: Compilation & Installation ====
 
 
Luckily, Synaptic takes care of getting the prerequisites, dependencies, and other required packages. All you have to do is select which packages you need and have Synaptic download and install them.
 
 
Start by selecting the "apertium" checkbox and choosing "Mark for Installation" from the drop-down menu.
 
 
Synaptic will inform you of Apertium's dependencies and will ask if you want to mark them as well. Click "Mark" in the lower-right of the pop-up box.
 
 
The required packages (lttoolbox, libapertium, and liblttoolbox) will now be marked as well.
 
 
Now you can select your language pair. ''Note: Some language pairs aren't available through this method. Those that are available include: en-es fr-es es-pt es-ca es-gl pt-gl eo-ca eo-es en-ca oc-es fr-ca es-ro eu-es oc-ca.''
 
 
Download and install the selected packages.
 
 
Synaptic will inform you when it is done.
 
 
Now you can install the development packages (libapertium3-3.1-0-dev and liblttoolbox3-3.1-0-dev) using the same procedure.
 
 
'''IMPORTANT: AVAILABLE VERSIONS OF PACKAGES MAY BE LIMITED BY WHAT VERSION OF YOUR OS YOU ARE RUNNING.'''
 
 
== Changing Things Around ==
 
 
When you want to make a change in Apertium, you more than likely want to add a word to an existing language pair.
 
 
'''IMPORTANT: Adding a word won't do you any good if you don't recompile the module after the change is made. Use the Terminal as before: change into the module's directory, enter''' ''make'' '''and press the "Enter" key. Your computer will then create the new files.'''
 
 
 
There are 3 major steps in adding a new word to a language pair:
 
 
'''1.''' Add an entry to the dictionary for the first language that will be used.
 
 
'''2.''' Add an entry to the bilingual dictionary for the pair.
 
 
'''3.''' Add an entry to the dictionary for the second language that will be used.
 
 
You will need to find the module you want to work with on your computer and open its three dictionaries; for example: apertium-es-ca.es.dix, apertium-es-ca.es-ca.dix, and apertium-es-ca.ca.dix. ''Note: Each dictionary has the suffix ".dix".'' Open these files in a text editor or a specialized XML editor.
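The three file names follow from the two language codes of the pair. A minimal sketch of that naming pattern, inferred from the apertium-es-ca example above (the helper name <code>dix_files</code> is hypothetical, and other modules may not follow the pattern exactly):

```shell
# Sketch: list the three dictionary file names for a language pair,
# following the pattern of the apertium-es-ca example.
dix_files() {
  # $1, $2: the language codes of the pair, e.g. es ca
  pair="apertium-$1-$2"
  echo "$pair.$1.dix"      # dictionary for the first language
  echo "$pair.$1-$2.dix"   # bilingual dictionary
  echo "$pair.$2.dix"      # dictionary for the second language
}

dix_files es ca
```

The order of the output matches the three steps listed above: first language, bilingual dictionary, second language.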
 
 
See also: [[Contributing to an existing pair]]
 
 
=== Step 1: Adding to the First Dictionary ===
 
   
When adding entries, you have to enter the lemma (the word as you would read it), the part between <nowiki><i></nowiki> and <nowiki></i></nowiki> that contains the prefix of the word that is common to all inflected forms, and the <par> element, which refers to the inflection paradigm of this word.
+
Now replace (lemma) between the quotation marks with your word. ''Note: Do not include the parentheses in your entry, but keep the quotation marks around it.''
   
Firstly, open your first language's dictionary file.
+
Next, replace (prefix) in the space between <nowiki><i></nowiki> and <nowiki></i></nowiki> with the prefix of your word.
   
Then,
+
Finally, enter the paradigm name between the quotation marks in <par>. The paradigm is named after another word that is already in the dictionary and has the same inflection as yours. For example, for an adjective whose forms have the morphological analyses adj m sg, adj f sg, adj m pl, and adj f pl, respectively: <par n="absolut/o__adj"/>
   
== Show it to the World ==
+
Now save your altered dictionary. '''DO NOT''' change the file name, directory, or file type.
   
'''IN PROGRESS'''
+
To finish, use the Terminal to change into your module's directory (for example, ''apertium-en-es'') and enter ''make''. Now press the "Enter" key and allow your computer to recompile the module with the changes you just made.

Revision as of 22:06, 18 December 2011

Step 1: Adding to the First Dictionary

When adding an entry, you have to enter three things: the lemma (the word as you would find it in a dictionary); the prefix of the word that is common to all inflected forms, which goes between <i> and </i>; and the inflection paradigm of the word, which goes in the <par> element. All entries have the following basic structure:

      <e lm="(lemma)">
        <i>(prefix)</i>
        <par n="(paradigm)"/>
      </e>


Start by opening your first language's dictionary file. For example: apertium-en-es.es.dix (an XML file).

Then, create a new entry with the basic structure.

Now replace (lemma) between the quotation marks with your word. Note: Do not include the parentheses in your entry, but keep the quotation marks around it.

Next, replace (prefix) in the space between <i> and </i> with the prefix of your word.

Finally, enter the paradigm name between the quotation marks in <par>. The paradigm is named after another word that is already in the dictionary and has the same inflection as yours. For example, for an adjective whose forms have the morphological analyses adj m sg, adj f sg, adj m pl, and adj f pl, respectively: <par n="absolut/o__adj"/>
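To show how the three parts fit together, here is a small sketch that fills in the basic structure. The word is a hypothetical example: it assumes the Spanish adjective "claro" inflects like "absoluto" (claro, clara, claros, claras), so its common prefix is "clar" and it can reuse the absolut/o__adj paradigm. The helper name <code>make_entry</code> is also hypothetical.

```shell
# Sketch: build one dictionary entry from a lemma, the prefix shared by
# all of its inflected forms, and the name of an existing paradigm.
make_entry() {
  # $1: lemma, $2: common prefix, $3: paradigm name
  printf '<e lm="%s"><i>%s</i><par n="%s"/></e>\n' "$1" "$2" "$3"
}

# Hypothetical word, assumed to inflect like "absoluto".
make_entry "claro" "clar" "absolut/o__adj"
# prints: <e lm="claro"><i>clar</i><par n="absolut/o__adj"/></e>
```

The printed line is what you would paste into the dictionary file, alongside the existing entries.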

Now save your altered dictionary. DO NOT change the file name, directory, or file type.

To finish, use the Terminal to change into your module's directory (for example, apertium-en-es) and enter make. Now press the "Enter" key and allow your computer to recompile the module with the changes you just made.
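A malformed entry can make the recompile fail, so it may help to check that the dictionary is still well-formed XML first. The following is a sketch under assumptions: <code>check_and_rebuild</code> is a hypothetical helper, <code>xmllint</code> is assumed to be installed (it is part of the libxml2-utils package), and make is assumed to be run from inside the module directory.

```shell
# Sketch: verify that the edited dictionary is well-formed XML, then
# recompile the module. Stops at the first step that fails.
check_and_rebuild() {
  # $1: module directory, $2: dictionary file inside that directory
  (
    cd "$1" &&
    xmllint --noout "$2" &&
    make
  )
}

# Usage (hypothetical paths):
#   check_and_rebuild apertium-en-es apertium-en-es.es.dix
```

If xmllint reports an error, fix the entry it points to and run the function again.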
