Apertium wiki: Crossdics (revision of 2008-11-28T08:23:39Z)<br />
<hr />
<div>{{TOCD}}<br />
{{main|Building dictionaries}}<br />
<br />
'''Crossdics''' (part of [[apertium-dixtools]]) is a program that can be used to "cross" language pairs. That is, given language pairs <code>aa-bb</code> and <code>bb-cc</code> it will create a new language pair for <code>aa-cc</code>.<br />
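<br />
The idea of crossing can be illustrated with a toy sketch. It is only an analogy: crossdics itself works on XML <code>.dix</code> dictionaries, whereas this sketch assumes two hypothetical tab-separated word lists keyed on the common language <code>bb</code>. The function name <code>cross_pairs</code> and the file layout are assumptions made for illustration.<br />

```shell
#!/bin/sh
# Toy illustration only: crossdics works on XML .dix files, not on
# tab-separated lists. Each input line is "<bb word><TAB><translation>".

# cross_pairs BB_AA_FILE BB_CC_FILE
# Prints "<aa word><TAB><cc word>" for every bb word present in both files.
cross_pairs() {
    tab=$(printf '\t')
    tmp_a=$(mktemp) && tmp_c=$(mktemp) || return 1
    # join requires both inputs sorted on the join field (the bb word).
    LC_ALL=C sort -t "$tab" -k1,1 "$1" > "$tmp_a"
    LC_ALL=C sort -t "$tab" -k1,1 "$2" > "$tmp_c"
    # Join on the bb word, then keep only the aa and cc columns.
    LC_ALL=C join -t "$tab" -o 1.2,2.2 "$tmp_a" "$tmp_c"
    rm -f "$tmp_a" "$tmp_c"
}
```

Calling <code>cross_pairs bb-aa.tsv bb-cc.tsv</code> prints one <code>aa</code>/<code>cc</code> pair per shared <code>bb</code> word; words that appear in only one list are dropped.<br />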
<br />
== Installing ==<br />
See [[apertium-dixtools]].<br />
<br />
== Using apertium-crossdics ==<br />
<br />
$ apertium-crossdics<br />
<br />
== Crossing dictionaries ==<br />
<br />
=== Using a Linguistic Resources Document ===<br />
<br />
You can define a [[Linguistic Resources Document]] (LRD) and use it to indicate which dictionaries will be used for crossing:<br />
<br />
$ apertium-crossdics -f my-linguistic-resources.xml sl-tl<br />
<br />
This form takes only two parameters:<br />
* '''my-linguistic-resources.xml''': a document specifying a set of linguistic resources (dictionaries, cross models, corpora, other LRD files, etc.).<br />
* '''sl-tl''': the source language (sl) and the target language (tl).<br />
<br />
Note that this form uses the <code>apertium-dixtools</code> script; the <code>apertium-crossdics</code> script still uses the "old" form.<br />
<br />
=== Without a Linguistic Resources Document ===<br />
<br />
First, copy the linguistic data into the <code>dics</code> folder:<br />
<br />
* Bilingual dictionary A-B: <code>apertium-bb-aa.bb-aa.dix</code><br />
* Bilingual dictionary B-C: <code>apertium-bb-cc.bb-cc.dix</code><br />
* Morphological dictionary A: <code>apertium-bb-aa.aa.dix</code><br />
* Morphological dictionary C: <code>apertium-bb-cc.cc.dix</code><br />
<br />
<br />
Please note that:<br />
* all dictionaries must be named in the form:<br />
** <code>apertium-xx-yy.xx-yy.dix</code> (bilingual dictionaries)<br />
** <code>apertium-xx-yy.xx.dix</code> (morphological dictionaries)<br />
* the common language (B) must be on the left side, that is, the bilingual dictionaries must be in the form B-A and B-C<br />
* use <code>-r</code> instead of <code>-n</code> if a dictionary has to be [[Reverse a dictionary|reversed]] (for example, from <code>apertium-aa-bb.aa-bb.dix</code> to <code>apertium-bb-aa.bb-aa.dix</code>)<br />
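<br />
The naming rules above can be checked before crossing with a small helper. This is a minimal sketch, not part of apertium-dixtools: the function name <code>dix_kind</code> is hypothetical, and the glob patterns are deliberately loose (they do not validate the language codes themselves).<br />

```shell
#!/bin/sh
# dix_kind FILE
# Classifies a file name against the two naming patterns listed above.
# Prints "bilingual", "morphological" or "unknown".
dix_kind() {
    name=$(basename "$1")
    case "$name" in
        # apertium-xx-yy.xx-yy.dix: the second part contains a hyphen
        apertium-*-*.*-*.dix) echo bilingual ;;
        # apertium-xx-yy.xx.dix: the second part is a single language code
        apertium-*-*.*.dix)   echo morphological ;;
        *)                    echo unknown ;;
    esac
}
```

For instance, <code>dix_kind dics/apertium-es-ca.es-ca.dix</code> reports a bilingual dictionary, while a file called <code>monA.dix</code> is reported as unknown and would need renaming.<br />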
<br />
<br />
Use the '''apertium-dixtools''' script to cross the dictionaries:<br />
<br />
 $ apertium-dixtools cross-param '''monA.dix''' -n '''bilAB.dix''' -n '''bilBC.dix''' '''monC.dix'''<br />
<br />
For example, to cross '''es-ca''' and '''es-pt''' to get the '''ca-pt''' pair:<br />
<br />
<pre><br />
$ apertium-dixtools cross-param dics/apertium-es-ca.ca.dix -n dics/apertium-es-ca.es-ca.dix -n dics/apertium-es-pt.es-pt.dix dics/apertium-es-pt.pt.dix<br />
</pre><br />
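<br />
Because the file names follow a fixed pattern, the command line can be assembled from the three language codes. The following is a sketch under stated assumptions: the helper <code>cross_cmd</code> is hypothetical, it only prints the command instead of running it, and it assumes both bilingual dictionaries already have the common language on the left (so <code>-n</code> is used for both).<br />

```shell
#!/bin/sh
# cross_cmd B A C [DIR]
# Prints (does not run) the cross-param command for crossing B-A and B-C.
# Assumes both bilingual dictionaries already have the common language B
# on the left, so "-n" (no reversal) is used for both.
cross_cmd() {
    b=$1 a=$2 c=$3 dir=${4:-dics}
    printf '%s\n' "apertium-dixtools cross-param $dir/apertium-$b-$a.$a.dix -n $dir/apertium-$b-$a.$b-$a.dix -n $dir/apertium-$b-$c.$b-$c.dix $dir/apertium-$b-$c.$c.dix"
}
```

Running <code>cross_cmd es ca pt</code> prints the same command as the '''es-ca''' / '''es-pt''' example; review the output before executing it.<br />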
<br />
== Customizing cross actions ==<br />
<br />
By default, the crossdics tool uses a basic cross model that defines simple rules for crossing two sets of dictionaries. However, more specific cross actions might be needed in order to cross certain [[List of language pairs|language pairs]] correctly. [[Cross Model|Defining a new cross schema]] with concrete pattern-action elements solves this problem.<br />
<br />
== See also ==<br />
* [[Crossdics Example|Crossing language pairs: a full example]]<br />
* [[Linguistic Resources Document|How to create a Linguistic Resources Document]]<br />
* [[Cross Model|How to define a new cross schema]]<br />
* [[List of language pairs|List of available language pairs]]<br />
* [[Sort a dictionary|How to sort a dictionary]]<br />
* [[Merge dictionaries|How to merge dictionaries]]<br />
<!--<br />
* [[Reverse a dictionary|How to reverse a bilingual dictionary]]<br />
--><br />
<br />
[[Category:Documentation]]<br />
[[Category:Tools]]<br />
[[Category:Development]]</div>
Apertium wiki: Apertium-dixtools (revision of 2008-11-28T08:23:35Z)<br />
<hr />
<div>{{TOCD}}<br />
{{see|Crossdics}}<br />
<br />
<br />
== Download ==<br />
<br />
<pre><br />
$ svn co https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-dixtools<br />
</pre><br />
<br />
== Software prerequisites ==<br />
<br />
You will need to install [http://ant.apache.org/ Ant] and the [http://java.sun.com/javase/downloads/index.jsp Java Development Kit 6 (JDK6)]:<br />
<br />
$ sudo apt-get install ant sun-java6-jdk<br />
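<br />
Before compiling, it can help to confirm that the required tools are on the <code>PATH</code>. This is a minimal sketch: the helper <code>missing_tools</code> is hypothetical, and the tool names passed to it (<code>svn</code> for the download step, <code>ant</code>, and the JDK's <code>javac</code>) are the usual executable names, assumed here rather than taken from this page.<br />

```shell
#!/bin/sh
# missing_tools TOOL...
# Prints the name of each listed tool that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

For example, <code>missing_tools svn ant javac</code> prints nothing when everything is installed, and otherwise lists what still needs installing.<br />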
<br />
== Compiling ==<br />
<br />
<pre><br />
$ cd apertium-dixtools<br />
$ ant jar<br />
</pre><br />
<br />
== Installing ==<br />
$ sudo ant install<br />
<br />
<br />
<br />
= Notes for developers =<br />
== Wishlist and notes for Apertium-dixtools ==<br />
<br />
* There is an awful lot of code, much more than needed. Another way of handling XML, where you don't have to write classes (and formatting code!) for each tag, would help.<br />
<br />
There should be many more options, and all sub-commands should take a <code>-fmt</code> parameter through which all of them could be specified:<br />
* one-line or multi-line entries<br />
* indenting<br />
* one-line entries in pardefs as well<br />
* multiwords: one line or many lines<br />
* multiwords: whether they should be separated from the other entries. With complex multiwords you sometimes want them laid out differently and kept apart: for example, a section for verbs that lists the "simple" verbs first, one per line, and then the multiword verbs, each spread over several lines.</div>