Difference between revisions of "Crossdics"

Revision as of 03:54, 10 April 2009

Main article: Building dictionaries

Crossdics (part of apertium-dixtools) is a program that can be used to "cross" language pairs. That is, given language pairs aa-bb and bb-cc it will create a new language pair for aa-cc.
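
As a toy illustration of the idea only (this is not how crossdics works internally, and real .dix files are XML rather than tab-separated text), crossing can be pictured as a join on the shared language. The file names and word pairs below are made up for the example:

```shell
# Toy sketch: "cross" two tab-separated word lists on their shared first column.
# Assumption: one translation pair per line, common language (es) on the left.
tmp=$(mktemp -d)
tab=$(printf '\t')

# es-ca pairs and es-pt pairs, both sorted on the es column as join requires
printf 'libro\tllibre\nmesa\ttaula\n' > "$tmp/es-ca.tsv"
printf 'libro\tlivro\nmesa\tmesa\n'   > "$tmp/es-pt.tsv"

# Join on the es column and keep only the ca and pt columns
crossed=$(LC_ALL=C join -t "$tab" -o 1.2,2.2 "$tmp/es-ca.tsv" "$tmp/es-pt.tsv")
echo "$crossed"

rm -r "$tmp"
```

Each output line pairs a ca word with a pt word that share the same es entry. The real tool additionally has to deal with morphological information and cross models, which this sketch ignores.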

Installing

See apertium-dixtools.

Using apertium-crossdics

$ apertium-dixtools cross

Crossing dictionaries

Using a Linguistic Resources Document

You can define a Linguistic Resources Document (LRD) and use it to indicate which dictionaries will be used for crossing:

$ apertium-dixtools cross -f my-linguistic-resources.xml sl-tl

Only two parameters are needed:

  • my-linguistic-resources.xml: a document specifying a set of linguistic resources (dictionaries, cross models, corpora, other LRD files, etc).
  • sl-tl: source language (sl) and target language (tl).


Without a Linguistic Resources Document

First, copy the linguistic data into a folder named "dics":

  • Bilingual dictionary B-A: apertium-bb-aa.bb-aa.dix
  • Bilingual dictionary B-C: apertium-bb-cc.bb-cc.dix
  • Morphological dictionary A: apertium-bb-aa.aa.dix
  • Morphological dictionary C: apertium-bb-cc.cc.dix


Please note that:

  • all dictionaries must be in the form:
    • apertium-xx-yy.xx-yy.dix (bilingual dictionaries)
    • apertium-xx-yy.xx.dix (morphological dictionaries)
  • the common language (B) must be on the left side, that is, the bilingual dictionaries must be in the form B-A and B-C
  • in the command below, use "-r" instead of "-n" before a bilingual dictionary that has to be reversed (for example, apertium-aa-bb.aa-bb.dix when apertium-bb-aa.bb-aa.dix is needed)
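
The naming convention above can be checked with a small shell function before running the tool. This is only a sketch: dix_kind is a hypothetical helper, not part of apertium-dixtools, and its patterns are deliberately loose (they do not verify that the language codes inside the name agree with each other).

```shell
# Hypothetical helper: classify a dictionary file by its name alone.
dix_kind() {
    name=$(basename "$1")
    case "$name" in
        # apertium-xx-yy.xx-yy.dix
        apertium-*-*.*-*.dix) echo bilingual ;;
        # apertium-xx-yy.xx.dix
        apertium-*-*.*.dix)   echo morphological ;;
        *)                    echo unknown ;;
    esac
}

dix_kind dics/apertium-es-ca.es-ca.dix
dix_kind dics/apertium-es-ca.ca.dix
dix_kind dics/notes.txt
```

The bilingual pattern is listed first because the morphological pattern would also match bilingual file names.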


Use the apertium-dixtools script to cross the dictionaries:

$ apertium-dixtools cross-param monA.dix -n bilAB.dix -n bilBC.dix monC.dix

For example, to cross es-ca and es-pt and obtain the ca-pt pair:

$ apertium-dixtools cross-param dics/apertium-es-ca.ca.dix -n dics/apertium-es-ca.es-ca.dix -n dics/apertium-es-pt.es-pt.dix dics/apertium-es-pt.pt.dix
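
Since the four file names follow directly from the three language codes, the command line can be assembled by a small wrapper. The sketch below assumes the files live in "dics", that none of them needs reversing (so "-n" is used for both), and that the common language is passed as the second argument; cross_cmd is a hypothetical helper that only prints the command so it can be inspected before being run.

```shell
# Hypothetical helper: print the cross-param command for crossing A and C
# through the common language B. Usage: cross_cmd A B C
cross_cmd() {
    a="$1"; b="$2"; c="$3"
    # Order of arguments: monolingual A, bilingual B-A, bilingual B-C, monolingual C
    printf 'apertium-dixtools cross-param dics/apertium-%s-%s.%s.dix -n dics/apertium-%s-%s.%s-%s.dix -n dics/apertium-%s-%s.%s-%s.dix dics/apertium-%s-%s.%s.dix\n' \
        "$b" "$a" "$a" \
        "$b" "$a" "$b" "$a" \
        "$b" "$c" "$b" "$c" \
        "$b" "$c" "$c"
}

cross_cmd ca es pt
```

Called with ca, es and pt, this prints the same command as the es-ca / es-pt example above.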

Customizing cross actions

By default, the crossdics tool uses a simple cross model whose rules cover only the most basic cases of crossing two sets of dictionaries. Some language pairs need more specific cross actions to be crossed correctly. For those pairs, define a new cross schema with concrete pattern-action elements.

See also