Dixtools: Enhance


Latest revision as of 13:52, 6 October 2019

Dixtools has an enhance option to add new words to a dictionary.

How to use it:

  • java -jar path/to/apertium-dixtools.jar enhance existing_dict.dix name_new_dict.dix

There are 3 parameters:

   enhance 
       the name of the tool
   existing_dict.dix
       the name of the existing dictionary (e.g. apertium-es-ca.ca.dix)
   name_new_dict.dix
       the name under which the new (enhanced) dictionary will be saved
       ALERT: the tool simply overwrites any existing file with that name
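
Since the output file is replaced without warning, it can help to check the paths before starting the tool. The following is a minimal Python sketch, not part of dixtools: the helper names build_enhance_command and check_output_path are hypothetical, and only the shape of the command line is taken from the usage above.

```python
from pathlib import Path


def build_enhance_command(jar, existing_dict, new_dict):
    """Return the argument list for the 'enhance' invocation described above."""
    return ["java", "-jar", str(jar), "enhance", str(existing_dict), str(new_dict)]


def check_output_path(existing_dict, new_dict):
    """Raise an error rather than let the tool silently replace a file."""
    existing, new = Path(existing_dict), Path(new_dict)
    # Writing the result over the input dictionary is almost never intended.
    if existing.resolve() == new.resolve():
        raise ValueError("input and output dictionaries must be different files")
    # Hypothetical safety rule: never replace a file that is already there.
    if new.exists():
        raise FileExistsError(f"{new} already exists and would be overwritten")
```

One possible use is to call check_output_path first and, if it does not raise, pass the list from build_enhance_command to subprocess.run.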

Then you'll be dropped into an interactive session where you can type a word you want to add, followed by a comma and then a word that already exists in the dictionary and has the paradigm you want to use. Example session:

$ java -jar dist/apertium-dixtools.jar enhance apertium-nno.nno.dix new.dix
Reading file 'apertium-nno.nno.dix'
'enhance' method
Enter the word you want to add to the dictionaries,
followed by a word already in the dictionaries separated by ','

(enter --exit to finish)
ostekaffi
Wrong format
Enter the word you want to add to the dictionaries,
followed by a word already in the dictionaries separated by ','

(enter --exit to finish)
ostekaffi,kaffi
<e>[IVXLCDM\-]+<l></l><r><det><qnt><un><pl></r></e>
<e><i>AP_PAIR_VERSION</i><l></l><r><adv></r></e>
<e r="LR"><i>AP_LANG_VERSION</i><l></l><r>@APERTIUM_AUTO_VERSION@<adv></r></e>
Result  <e><i>ostekaffi</i><par n="ep__n"/></e>
Element added: (<e><i>ostekaffi</i><par n="ep__n"/></e>)

--------------------------------------------------------------------
Enter the word you want to add to the dictionaries,
followed by a word already in the dictionaries separated by ','

(enter --exit to finish)
pianomaskin,maskin
<e>[IVXLCDM\-]+<l></l><r><det><qnt><un><pl></r></e>
<e><i>AP_PAIR_VERSION</i><l></l><r><adv></r></e>
<e r="LR"><i>AP_LANG_VERSION</i><l></l><r>@APERTIUM_AUTO_VERSION@<adv></r></e>
Result  <e><i>pianomaskin</i><par n="så__n"/></e>
Element added: (<e><i>pianomaskin</i><par n="så__n"/></e>)

--------------------------------------------------------------------
Enter the word you want to add to the dictionaries,
followed by a word already in the dictionaries separated by ','

(enter --exit to finish)
--exit
Writing file new.dix
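
For longer word lists, the text typed in such a session could be prepared in advance, and the confirmed entries picked out of the surrounding output. The sketch below is in Python and is not part of dixtools; the function names are hypothetical, and it assumes (unverified) that the tool reads its prompts' answers from standard input, so that the prepared text can be piped in.

```python
import re


def build_session_input(pairs):
    """Turn (new_word, model_word) pairs into the lines typed in the session."""
    lines = []
    for new_word, model_word in pairs:
        # The session expects exactly "new_word,model_word"; a line without
        # a comma is answered with "Wrong format" in the example above.
        if not new_word or not model_word or "," in new_word + model_word:
            raise ValueError(f"bad pair: {new_word!r}, {model_word!r}")
        lines.append(f"{new_word},{model_word}")
    lines.append("--exit")  # ends the session, after which the file is written
    return "\n".join(lines) + "\n"


ADDED = re.compile(r"^Element added: \((.*)\)$")


def added_elements(transcript):
    """Return the entries reported on 'Element added:' lines of a transcript."""
    found = []
    for line in transcript.splitlines():
        match = ADDED.match(line.strip())
        if match:
            found.append(match.group(1))
    return found
```

If the standard-input assumption holds, the result of build_session_input could be passed as the input argument of subprocess.run (with text=True and capture_output=True), and the captured output given to added_elements.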


See discussion at http://thread.gmane.org/gmane.comp.nlp.apertium/2946/focus=2987 or https://sourceforge.net/p/apertium/mailman/message/30540141/

Limitations

  • Like other dixtools methods, it parses and rewrites the whole dictionary, which leads to quite large diffs if the file does not already conform to the dixtools format.
  • dixtools strips all c (comment) attributes, since it doesn't know about them.
  • It probably doesn't work on metadix.
  • It confusingly prints some irrelevant lines from the .dix before the pardef it found (see the example session above).
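
Because c attributes are lost, it may be worth listing them before running the tool so they can be restored by hand afterwards. A minimal Python sketch using only the standard library follows; elements_with_c is a hypothetical helper, not a dixtools feature.

```python
import xml.etree.ElementTree as ET


def elements_with_c(source):
    """Return (tag, c value) for every element that carries a c attribute.

    source is a file name or a binary file object holding a .dix dictionary.
    """
    root = ET.parse(source).getroot()
    return [(el.tag, el.get("c")) for el in root.iter() if el.get("c") is not None]
```

Calling elements_with_c("apertium-nno.nno.dix") before the enhance run would give a list of the comments that the rewritten dictionary will no longer contain.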