Crossdics Example

Revision as of 16:33, 11 March 2008

Obtaining apertium-crossdics

Download

$ svn co https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-crossdics

You will need to install Ant and the Java Development Kit 6 (JDK6):

$ sudo apt-get install ant sun-java6-jdk

Compiling and installing

$ cd apertium-crossdics
$ ant jar
$ sudo ant install

What you need

Dictionaries

To get an English-Galician (A-C) language pair from English-Spanish (A-B) and Spanish-Galician (B-C), you need at least the following linguistic data:

  • Morphological A: apertium-en-es.en.dix
  • Morphological C: apertium-es-gl.gl.dix
  • Bilingual AB: apertium-en-es.en-es.dix
  • Bilingual BC: apertium-es-gl.es-gl.dix

Cross Model Document

You will also need a:

  • Cross Model ABC: cross-model-en-es-gl.xml

A cross model document must contain at least a default pattern-action. Here is a very simple example of a cross-model-en-es-gl.xml document:

<?xml version="1.0" encoding="UTF-8"?>
<cross-model>
  <cross-action id="default" a="ebenimeli">
    <description>Default pattern</description>
    <pattern>
      <e>
        <p>
          <l>$lemmaA<v n="cat"/><t n="tailA"/></l>
          <r>$lemmaB<v n="cat"/><t/></r>
        </p>
      </e>
      <e>
        <p>
          <l>$lemmaB<v n="cat"/><t/></l>
          <r>$lemmaC<v n="cat"/><t n="tailC"/></r>
        </p>
      </e>
    </pattern>
    <action-set>
      <action>
        <e>
          <p>
            <l>$lemmaA<v n="cat"/><t n="tailA"/></l>
            <r>$lemmaC<v n="cat"/><t n="tailC"/></r>
          </p>
        </e>
      </action>
    </action-set>
  </cross-action>
</cross-model>

Read more in How to create a Cross Model Document.
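
The default pattern above can be read as a join on the pivot language: an A-B entry and a B-C entry that share the same B lemma and the same category produce one A-C entry. The following Python sketch illustrates that idea only. It is not the apertium-crossdics implementation; the `Entry` type, the `cross_default` function and the sample words are assumptions made for illustration, and the tails (`tailA`, `tailC`) are ignored for brevity.

```python
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    """Hypothetical, simplified bilingual entry (not an Apertium type)."""
    left: str   # source-side lemma
    right: str  # target-side lemma
    cat: str    # category shared by both sides, like the "cat" variable


def cross_default(ab, bc):
    """Join A-B and B-C entries on (lemmaB, cat) and return A-C entries."""
    # Index the B-C entries by their pivot lemma and category.
    by_pivot = defaultdict(list)
    for entry in bc:
        by_pivot[(entry.left, entry.cat)].append(entry)

    crossed = []
    for e_ab in ab:
        # Every B-C entry with the same pivot lemma and category matches.
        for e_bc in by_pivot.get((e_ab.right, e_ab.cat), []):
            crossed.append(Entry(e_ab.left, e_bc.right, e_ab.cat))
    return crossed


# Invented sample data, for illustration only.
en_es = [Entry("house", "casa", "n"), Entry("dog", "perro", "n")]
es_gl = [Entry("casa", "casa", "n"), Entry("perro", "can", "n")]

en_gl = cross_default(en_es, es_gl)
for entry in en_gl:
    print(entry.left, "->", entry.right, f"({entry.cat})")
```

A real cross model can define further pattern-actions beyond the default one; this sketch covers only the default case shown above.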

Linguistic Resources Document

Finally, you need to define a document that references the required linguistic resources.

Example of an en-es-gl-ling-resources.xml document:

<?xml version="1.0" encoding="UTF-8"?>

<!-- Linguistic resources-->
<ling-resources>
   <name>My linguistic resources</name>
   <description>My linguistic resources: morphological and bilingual dictionaries, cross models, corpora, etc.</description>
   
   <resource-set>

      <name>My linguistic resources to get English-Galician language pair.</name>
      <description>Morphological and bilingual dictionaries.</description>

      <!-- cross model en-es-gl -->
      <resource>
         <property name="name" value="cross-model-en-es-gl"/>
         <property name="type" value="cross-model"/>
         <property name="sl" value="en"/>
         <property name="tl" value="gl"/>      
         <property name="for-crossing" value="yes"/>
         <property name="src" value="cross-model-en-es-gl.xml"/>
         <property name="version" value="stable"/>
      </resource>
      
      <!-- cross model gl-es-en -->
      <resource>
         <property name="name" value="cross-model-gl-es-en"/>
         <property name="type" value="cross-model"/>
         <property name="sl" value="gl"/>
         <property name="tl" value="en"/>      
         <property name="for-crossing" value="yes"/>
         <!-- note that we use the same cross model file -->
         <property name="src" value="cross-model-en-es-gl.xml"/>
         <property name="version" value="stable"/>
      </resource>
      
      <!-- 'en' morphological dictionary -->
      <resource>
         <property name="name" value="apertium-en"/>
         <property name="type" value="mon"/>
         <property name="sl" value="en"/>
         <property name="for-crossing" value="yes"/>
         <property name="src" value="apertium-en-es.en.dix"/>
         <property name="version" value="stable"/>
      </resource>

      <!-- 'gl' morphological dictionary -->
      <resource>
         <property name="name" value="apertium-gl"/>
         <property name="type" value="mon"/>
         <property name="sl" value="gl"/>
         <property name="for-crossing" value="yes"/>
         <property name="src" value="apertium-es-gl.gl.dix"/>
         <property name="version" value="stable"/>
      </resource>

      <!-- 'en-es' bilingual dictionary -->
      <resource>
         <property name="name" value="apertium-en-es"/>
         <property name="type" value="bil"/>
         <property name="sl" value="en"/>
         <property name="tl" value="es"/>
         <property name="for-crossing" value="yes"/>
         <property name="src" value="apertium-en-es.en-es.dix"/>
         <property name="version" value="stable"/>
      </resource>
      
      <!-- 'es-gl' bilingual dictionary -->   
      <resource>
         <property name="name" value="apertium-es-gl"/>
         <property name="type" value="bil"/>
         <property name="sl" value="es"/>
         <property name="tl" value="gl"/>
         <property name="for-crossing" value="yes"/>
         <property name="src" value="apertium-es-gl.es-gl.dix"/>
         <property name="version" value="stable"/>
      </resource>
      
   </resource-set>
   
</ling-resources>

Read more about How to create a Linguistic Resources Document.
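
Before running the tool, it can help to check that every `src` in the resources document points to a file that exists. The sketch below is one way to do that with Python's standard library. It is a hypothetical helper and not part of apertium-crossdics: the function names `read_resources` and `missing_sources` are assumptions, and the embedded sample is a shortened copy of the document above.

```python
import os
import xml.etree.ElementTree as ET

# Shortened sample of the resources document shown above.
SAMPLE = """<ling-resources>
  <resource-set>
    <resource>
      <property name="name" value="apertium-es-gl"/>
      <property name="type" value="bil"/>
      <property name="sl" value="es"/>
      <property name="tl" value="gl"/>
      <property name="for-crossing" value="yes"/>
      <property name="src" value="apertium-es-gl.es-gl.dix"/>
      <property name="version" value="stable"/>
    </resource>
  </resource-set>
</ling-resources>"""


def read_resources(xml_text):
    """Return one dict of property name -> value per <resource> element."""
    root = ET.fromstring(xml_text)
    resources = []
    for res in root.iter("resource"):
        props = {p.get("name"): p.get("value") for p in res.findall("property")}
        resources.append(props)
    return resources


def missing_sources(resources, base_dir="."):
    """List the src values that do not exist relative to base_dir."""
    return [
        r["src"]
        for r in resources
        if "src" in r and not os.path.exists(os.path.join(base_dir, r["src"]))
    ]


resources = read_resources(SAMPLE)
for r in resources:
    print(r["type"], r["sl"], r.get("tl", "-"), r["src"])
print("missing:", missing_sources(resources))
```

To check a real file, read its contents and pass them to `read_resources`, with `base_dir` set to the folder that holds the dictionaries.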

Crossing dictionaries

To use apertium-crossdics:

$ apertium-crossdics -f en-es-gl-ling-resources.xml en-gl

The crossed dictionaries are written to the dix folder:

dix/apertium-en-gl.en-gl-crossed.dix
dix/apertium-en-gl.en-crossed.dix
dix/apertium-en-gl.gl-crossed.dix
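
The three file names above appear to follow one pattern built from the language pair given on the command line. Assuming that pattern holds for other pairs (an inference from this single example, not documented behaviour), a small Python helper can list the files to expect. The function name `expected_outputs` is hypothetical.

```python
def expected_outputs(pair, folder="dix"):
    """Return the output paths expected for a pair such as "en-gl".

    The naming pattern is inferred from the example listing only.
    """
    sl, tl = pair.split("-")
    prefix = f"{folder}/apertium-{sl}-{tl}"
    return [
        f"{prefix}.{sl}-{tl}-crossed.dix",  # crossed bilingual dictionary
        f"{prefix}.{sl}-crossed.dix",       # crossed source-language dictionary
        f"{prefix}.{tl}-crossed.dix",       # crossed target-language dictionary
    ]


for path in expected_outputs("en-gl"):
    print(path)
```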