Paradigm chopper
Revision as of 11:41, 26 March 2011
Paradigm chopper is a Python script that removes redundant paradigm definitions from dictionaries and fixes the references to them. For example, given a dictionary like this:
<pardef n="car__n">
  <e><p><l/><r><s n="n"/><s n="sg"/></r></p></e>
  <e><p><l>s</l><r><s n="n"/><s n="pl"/></r></p></e>
</pardef>
<pardef n="tree__n">
  <e><p><l/><r><s n="n"/><s n="sg"/></r></p></e>
  <e><p><l>s</l><r><s n="n"/><s n="pl"/></r></p></e>
</pardef>
it would remove the tree__n paradigm and make every entry in the main section that refers to it point to car__n instead. When two paradigms are identical, it currently keeps the one with the shortest name.
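The behaviour described above can be sketched in a few lines of Python. This is only an illustration, not the code of paradigm-chopper.py itself: the function names canonical and chop_paradigms are hypothetical, the sketch assumes the dictionary is well-formed XML with its pardef elements inside a pardefs element, and it treats leading and trailing whitespace in text as insignificant when comparing paradigms.

```python
import xml.etree.ElementTree as ET


def canonical(elem):
    """Return a hashable form of an element that ignores layout whitespace.

    Assumption: leading/trailing whitespace in text is not significant.
    """
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        (elem.text or "").strip(),
        tuple(canonical(child) for child in elem),
        (elem.tail or "").strip(),
    )


def chop_paradigms(root):
    """Remove duplicate <pardef>s in place; return {removed name: kept name}."""
    # Group paradigm names by the canonical form of their contents.
    groups = {}
    for pardef in root.iter("pardef"):
        key = tuple(canonical(entry) for entry in pardef)
        groups.setdefault(key, []).append(pardef.get("n"))

    # Within each group, keep the shortest name and map the others to it.
    replacement = {}
    for names in groups.values():
        keep = min(names, key=len)
        for name in names:
            if name != keep:
                replacement[name] = keep

    # Drop the redundant definitions.
    for pardefs in root.iter("pardefs"):
        for pardef in list(pardefs):
            if pardef.get("n") in replacement:
                pardefs.remove(pardef)

    # Redirect every reference to a removed paradigm.
    for par in root.iter("par"):
        if par.get("n") in replacement:
            par.set("n", replacement[par.get("n")])

    return replacement


DIX = """<dictionary><pardefs>
<pardef n="car__n">
  <e><p><l/><r><s n="n"/><s n="sg"/></r></p></e>
  <e><p><l>s</l><r><s n="n"/><s n="pl"/></r></p></e>
</pardef>
<pardef n="tree__n">
  <e><p><l/><r><s n="n"/><s n="sg"/></r></p></e>
  <e><p><l>s</l><r><s n="n"/><s n="pl"/></r></p></e>
</pardef>
</pardefs>
<section id="main" type="standard">
  <e lm="tree"><i>tree</i><par n="tree__n"/></e>
</section></dictionary>"""

root = ET.fromstring(DIX)
print(chop_paradigms(root))  # {'tree__n': 'car__n'}
```

The sketch makes a single pass. Paradigms that only become identical after the references inside other paradigms have been rewritten would need the pass to be repeated until nothing changes.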
Example
To use it, do:
$ python paradigm-chopper.py <dix> > newdix
This prints some output on stderr about what it's doing (with large dictionaries it may take some time) and puts the dictionary output in newdix. You will need to copy in the header (including the sdef entries). Once you have done this, use lt-expand to do a sanity check:
$ lt-expand <olddix> > olddix.exp
$ lt-expand <newdix> > newdix.exp
$ diff -Naur olddix.exp newdix.exp
If there are any differences, please send the dictionary files to Fran.
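The comparison step can also be done from Python, which may be convenient when the check is part of a larger script. The following is a minimal sketch, assuming the two expansions have already been read into lists of lines; the function name expansion_diff is hypothetical, and the sample lines are made-up placeholders rather than real lt-expand output.

```python
import difflib


def expansion_diff(old_lines, new_lines):
    """Return unified-diff lines between two expansions (empty if identical)."""
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="olddix.exp",
            tofile="newdix.exp",
            lineterm="",
        )
    )


# Placeholder lines standing in for the contents of the two .exp files.
old = ["entry one", "entry two"]
new = ["entry one", "entry two"]

if expansion_diff(old, new):
    print("expansions differ")
else:
    print("expansions are identical")
```

An empty result corresponds to diff printing nothing, which is the outcome expected when only redundant paradigms were removed.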