Difference between revisions of "Paradigm chopper"
Revision as of 15:30, 28 September 2007
Paradigm chopper is a Python script which removes redundant paradigm definitions from dictionaries and fixes the references to them. For example, if you have a dictionary containing:
<pardef n="car__n">
  <e>
    <p>
      <l/>
      <r><s n="n"/><s n="sg"/></r>
    </p>
  </e>
  <e>
    <p>
      <l>s</l>
      <r><s n="n"/><s n="pl"/></r>
    </p>
  </e>
</pardef>
<pardef n="tree__n">
  <e>
    <p>
      <l/>
      <r><s n="n"/><s n="sg"/></r>
    </p>
  </e>
  <e>
    <p>
      <l>s</l>
      <r><s n="n"/><s n="pl"/></r>
    </p>
  </e>
</pardef>
it would remove the tree__n paradigm and make all main section elements that point to it point to car__n instead. Currently, if two paradigms are the same, it keeps the one with the shortest name.
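The behaviour described above can be sketched roughly as follows. This is not the actual paradigm-chopper.py source, only an illustration of the idea. The names canonical, chop_paradigms and SAMPLE are hypothetical, and the sketch assumes that two paradigms count as "the same" when their entries are structurally identical once whitespace-only text is ignored. It also rewrites every par reference in the document, not only those in the main section.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input: the two paradigms from the example above,
# plus one main section entry that refers to the redundant paradigm.
SAMPLE = """<dictionary>
  <pardefs>
    <pardef n="car__n">
      <e><p><l/><r><s n="n"/><s n="sg"/></r></p></e>
      <e><p><l>s</l><r><s n="n"/><s n="pl"/></r></p></e>
    </pardef>
    <pardef n="tree__n">
      <e><p><l/><r><s n="n"/><s n="sg"/></r></p></e>
      <e><p><l>s</l><r><s n="n"/><s n="pl"/></r></p></e>
    </pardef>
  </pardefs>
  <section id="main" type="standard">
    <e lm="tree"><i>tree</i><par n="tree__n"/></e>
  </section>
</dictionary>"""


def canonical(elem):
    """Return a hashable description of an element and its children.

    Assumption: surrounding whitespace in text is not significant.
    """
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        (elem.text or "").strip(),
        (elem.tail or "").strip(),
        tuple(canonical(child) for child in elem),
    )


def chop_paradigms(root):
    """Remove duplicate pardefs in place; return {removed name: kept name}."""
    pardefs = root.find("pardefs")

    # Group paradigms by the structure of their entries, ignoring the name.
    groups = {}
    for pardef in pardefs.findall("pardef"):
        key = tuple(canonical(entry) for entry in pardef)
        groups.setdefault(key, []).append(pardef)

    # Within each group keep the shortest name and drop the rest.
    renames = {}
    for group in groups.values():
        keeper = min(group, key=lambda p: len(p.get("n")))
        for pardef in group:
            if pardef is not keeper:
                renames[pardef.get("n")] = keeper.get("n")
                pardefs.remove(pardef)

    # Redirect references to removed paradigms (all <par> elements here).
    for par in root.iter("par"):
        if par.get("n") in renames:
            par.set("n", renames[par.get("n")])
    return renames


if __name__ == "__main__":
    tree_root = ET.fromstring(SAMPLE)
    print(chop_paradigms(tree_root))
    print(ET.tostring(tree_root, encoding="unicode"))
```

Grouping by a structural key rather than comparing every pair of paradigms keeps the work roughly linear in the number of paradigms, which matters for large dictionaries.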
Example
To use it, do:
$ python paradigm-chopper.py <dix> > newdix
This will print some output on stderr about what it's doing (with large dictionaries it may take some time), and put the dictionary output in newdix. You will need to copy in the header (including sdef entries). After you've done this, use lt-expand to do a sanity check:
$ lt-expand <olddix> > olddix.exp
$ lt-expand <newdix> > newdix.exp
$ diff -Naur olddix.exp newdix.exp
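If the two expansions differ only in the order of their lines, diff will still report changes. As a complementary check, the expansions can be compared as sets. The sketch below is one way to do that. The function name compare_expansions is hypothetical, and the sample lines assume lt-expand's usual surface:analysis output format. Unlike diff, this comparison ignores both ordering and duplicate lines.

```python
def compare_expansions(old_lines, new_lines):
    """Return (missing, extra): lines only in the old or only in the new expansion."""
    old_set = {line.strip() for line in old_lines if line.strip()}
    new_set = {line.strip() for line in new_lines if line.strip()}
    return sorted(old_set - new_set), sorted(new_set - old_set)


if __name__ == "__main__":
    # olddix.exp and newdix.exp are the files produced by the commands above.
    with open("olddix.exp", encoding="utf-8") as old_file:
        old = old_file.readlines()
    with open("newdix.exp", encoding="utf-8") as new_file:
        new = new_file.readlines()
    missing, extra = compare_expansions(old, new)
    print("only in old:", missing)
    print("only in new:", extra)
```

Two empty lists mean that both dictionaries expand to the same set of forms.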
If there are any differences, please send the dictionary files to Fran.