Dixtools: Enhance
Dixtools has an enhance option for adding new words to a dictionary.
How to use it:
- java -jar path/to/apertium-dixtools.jar enhance existing_dict.dix name_new_dict.dix
There are 3 parameters:
- enhance: the name of the tool
- existing_dict.dix: the name of the existing dictionary (e.g. apertium-es-ca.ca.dix)
- name_new_dict.dix: the name under which the new (enhanced) dictionary will be saved
ALERT: the tool simply overwrites any file that already exists with that name.
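Because the output file is overwritten without warning, it can help to wrap the call in a small guard. The following Python sketch is one way to do that. The function names build_enhance_command and run_enhance are made up for this example and are not part of dixtools; the sketch only assumes the command-line shape shown above and that java is on the PATH.

```python
import subprocess
from pathlib import Path


def build_enhance_command(jar, existing_dict, new_dict):
    """Return the argv list for the enhance call, refusing to clobber a file.

    Hypothetical helper: it only mirrors the command line documented above.
    """
    target = Path(new_dict)
    if target.exists():
        # enhance would silently overwrite this file, so stop early.
        raise FileExistsError(f"refusing to overwrite {target}")
    return ["java", "-jar", str(jar), "enhance", str(existing_dict), str(target)]


def run_enhance(jar, existing_dict, new_dict):
    """Start the interactive session; stdin and stdout are inherited."""
    return subprocess.call(build_enhance_command(jar, existing_dict, new_dict))
```

Calling run_enhance("path/to/apertium-dixtools.jar", "existing_dict.dix", "name_new_dict.dix") would then start the usual session, or raise FileExistsError if name_new_dict.dix is already there.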
You'll then be dropped into an interactive session where you can type the word you want to add, followed by a comma and then a word that already exists in the dictionary and has the paradigm you want to use. Example session:
$ java -jar dist/apertium-dixtools.jar enhance apertium-nno.nno.dix new.dix
Reading file 'apertium-nno.nno.dix'
'enhance' method
Enter the word you want to add to the dictionaries, followed by a word already in the dictionaries separated by ',' (enter --exit to finish)
ostekaffi
Wrong format
Enter the word you want to add to the dictionaries, followed by a word already in the dictionaries separated by ',' (enter --exit to finish)
ostekaffi,kaffi
<e>[IVXLCDM\-]+<l></l><r><det><qnt><un><pl></r></e>
<e><i>AP_PAIR_VERSION</i><l></l><r><adv></r></e>
<e r="LR"><i>AP_LANG_VERSION</i><l></l><r>@APERTIUM_AUTO_VERSION@<adv></r></e>
Result
<e><i>ostekaffi</i><par n="ep__n"/></e>
Element added: (<e><i>ostekaffi</i><par n="ep__n"/></e>)
--------------------------------------------------------------------
Enter the word you want to add to the dictionaries, followed by a word already in the dictionaries separated by ',' (enter --exit to finish)
pianomaskin,maskin
<e>[IVXLCDM\-]+<l></l><r><det><qnt><un><pl></r></e>
<e><i>AP_PAIR_VERSION</i><l></l><r><adv></r></e>
<e r="LR"><i>AP_LANG_VERSION</i><l></l><r>@APERTIUM_AUTO_VERSION@<adv></r></e>
Result
<e><i>pianomaskin</i><par n="så__n"/></e>
Element added: (<e><i>pianomaskin</i><par n="så__n"/></e>)
--------------------------------------------------------------------
Enter the word you want to add to the dictionaries, followed by a word already in the dictionaries separated by ',' (enter --exit to finish)
--exit
Writing file new.dix
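To make the input format concrete, here is a rough Python sketch of what the session above appears to do: split the typed line at the comma, look up the paradigm of the existing word, and build a new entry that reuses it. This is not the dixtools implementation. It assumes the existing word is matched against the `<i>` text of an entry, it ignores entries whose lemma is split between a stem and a paradigm ending, and all function names are invented for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def parse_request(line):
    """Split 'new_word,existing_word'; return None for malformed input."""
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != 2 or not all(parts):
        return None  # the session above reports "Wrong format" here
    return parts[0], parts[1]


def find_paradigm(root, existing_word):
    """Return the paradigm name of the first <e> whose <i> text matches.

    Assumption: a plain match on the <i> text is enough for this sketch.
    """
    for entry in root.iter("e"):
        stem, par = entry.find("i"), entry.find("par")
        if stem is not None and par is not None and stem.text == existing_word:
            return par.get("n")
    return None


def make_entry(new_word, paradigm):
    """Build an entry in the same shape as the 'Result' lines above."""
    return f"<e><i>{escape(new_word)}</i><par n={quoteattr(paradigm)}/></e>"
```

With a dictionary that contains kaffi under the paradigm ep__n, parse_request("ostekaffi,kaffi") followed by find_paradigm and make_entry produces the same entry text as the first result in the session, while a line without a comma is rejected.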
See discussion at http://thread.gmane.org/gmane.comp.nlp.apertium/2946/focus=2987
or https://sourceforge.net/p/apertium/mailman/message/30540141/
Limitations
- Like other dixtools methods, it parses and rewrites the whole dictionary, which leads to quite large diffs if the file does not already conform to the dixtools format.
- Dixtools strips all c (comment) attributes, since it doesn't know about them.
- It probably doesn't work on metadix.
- Before the pardef it found, it prints some irrelevant lines from the .dix, which is confusing (see the example session above).
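Since c attributes are lost on rewrite, it may be worth listing them before running enhance so they can be restored by hand afterwards. The following Python sketch shows one way to do that using only the standard library; the function name comment_attributes is hypothetical and the script is not part of dixtools.

```python
import xml.etree.ElementTree as ET


def comment_attributes(dix_source):
    """Return (tag, comment) pairs for every element carrying a c attribute.

    dix_source may be a file name or an open file object.
    """
    root = ET.parse(dix_source).getroot()
    return [(el.tag, el.get("c")) for el in root.iter() if el.get("c") is not None]
```

Running comment_attributes("apertium-nno.nno.dix") before the enhance session gives a list of the comments that the rewritten dictionary will no longer contain.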