Difference between revisions of "Crossdics"
Revision as of 03:54, 10 April 2009
- Main article: Building dictionaries
Crossdics (part of apertium-dixtools) is a program that can be used to "cross" language pairs. That is, given language pairs aa-bb and bb-cc, it will create a new language pair for aa-cc.
Installing
See apertium-dixtools.
Using apertium-crossdics
$ apertium-dixtools cross
Crossing dictionaries
Using a Linguistic Resources Document
You can define a Linguistic Resources Document (LRD) and use it to indicate which dictionaries will be used for crossing:
$ apertium-dixtools cross -f my-linguistic-resources.xml sl-tl
Only two parameters are needed:
- my-linguistic-resources.xml: a document specifying a set of linguistic resources (dictionaries, cross models, corpora, other LRD files, etc).
- sl-tl: source language (sl) and target language (tl).
Without a Linguistic Resources Document
First, copy the linguistic data into a folder named "dics":
- Bilingual dictionary A-B: apertium-bb-aa.bb-aa.dix
- Bilingual dictionary B-C: apertium-bb-cc.bb-cc.dix
- Morphological dictionary A: apertium-bb-aa.aa.dix
- Morphological dictionary C: apertium-bb-cc.cc.dix
Please note that:
- all dictionaries must be named in the form apertium-xx-yy.xx-yy.dix (bilingual dictionaries) or apertium-xx-yy.xx.dix (morphological dictionaries)
- the common language (B) must be on the left side, that is, the dictionaries must be in the form B-A and B-C
- use "-r" instead of "-n" before a bilingual dictionary that has to be reversed (for example, from apertium-aa-bb.aa-bb.dix to apertium-bb-aa.bb-aa.dix)
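The naming rules above can be checked before crossing. The following is a minimal sketch, not part of apertium-dixtools: the helper names `expected_dics` and `check_dics` are hypothetical, and it assumes the pivot language code is passed first, followed by the two other language codes.

```shell
# Hypothetical helper (not part of apertium-dixtools): print the four file
# names that the naming convention implies for pivot B and languages A and C.
expected_dics() {
    pivot="$1"; a="$2"; c="$3"
    printf '%s\n' \
        "dics/apertium-$pivot-$a.$a.dix" \
        "dics/apertium-$pivot-$a.$pivot-$a.dix" \
        "dics/apertium-$pivot-$c.$pivot-$c.dix" \
        "dics/apertium-$pivot-$c.$c.dix"
}

# Hypothetical helper: report every expected file that is missing from "dics".
# Returns 0 when all four files exist and 1 otherwise.
check_dics() {
    missing=0
    for f in $(expected_dics "$@"); do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For instance, `check_dics es ca pt` would look for the Spanish-Catalan and Spanish-Portuguese files with Spanish as the common language.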
Use the apertium-dixtools script to cross the dictionaries:
$ apertium-dixtools cross-param monA.dix -n bilAB.dix -n bilBC.dix monC.dix
For example, to cross es-ca and es-pt to get the ca-pt pair:
$ apertium-dixtools cross-param dics/apertium-es-ca.ca.dix -n dics/apertium-es-ca.es-ca.dix -n dics/apertium-es-pt.es-pt.dix dics/apertium-es-pt.pt.dix
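Because the file names follow a fixed pattern, the command line can be assembled from the three language codes. This is only a sketch under that assumption: `cross_command` is a hypothetical helper, it always uses "-n" (no reversal), and it only prints the command so that it can be inspected before being run.

```shell
# Hypothetical helper (not part of apertium-dixtools): print the cross-param
# command for pivot language $1 and the two other languages $2 and $3,
# in the order monA, bilAB, bilBC, monC used in the example above.
cross_command() {
    pivot="$1"; a="$2"; c="$3"
    echo "apertium-dixtools cross-param" \
        "dics/apertium-$pivot-$a.$a.dix" \
        "-n dics/apertium-$pivot-$a.$pivot-$a.dix" \
        "-n dics/apertium-$pivot-$c.$pivot-$c.dix" \
        "dics/apertium-$pivot-$c.$c.dix"
}

# Print the command for crossing es-ca and es-pt.
cross_command es ca pt
```

Called with `es ca pt`, the helper prints the same command as the es-ca and es-pt example. If a dictionary has to be reversed, the corresponding "-n" would need to be changed to "-r" by hand.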
Customizing cross actions
By default, the crossdics tool uses a simple cross model defining very simple rules for crossing two sets of dictionaries. However, more specific cross actions might be needed in order to cross certain language pairs correctly. Defining a new cross schema with concrete pattern-action elements solves this problem.