Format dictionaries

From Apertium
Revision as of 19:22, 22 December 2011 by Objectivesea (talk | contribs) (Corrected misspelling of element and clarified a potential ambiguity)

You can use the Apertium-dixtools package to format the dictionary so that each <e> element occupies a single line.

$ apertium-dixtools format-1line <dic> <dic.out>

(Note that the first character in the 1line parameter is the digit 1 (one), not the lowercase "L".)

For example, these lines:

...
<e>
  <p>
    <l>estilo<s n="n"/></l>
    <r>estil<s n="n"/></r>
  </p>
</e>
...

will be written on a single line, instead of being indented to various levels across six lines:

...
<e><p><l>estilo<s n="n"/></l><r>estil<s n="n"/></r></p></e>
...

This single-line format can be useful if you use grep or any similar tool to process dictionaries.
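
As a rough illustration of why one entry per line helps, the sketch below filters entries line by line, much as grep would. The function name entries_with_tag and the sample list are hypothetical and only for illustration; the entries themselves are taken from this page.

```python
import re

# Sample lines in the single-line format (entries copied from this page).
lines = [
    '<e><p><l>estilo<s n="n"/></l><r>estil<s n="n"/></r></p></e>',
    '<e><p><l>dum<s n="cnjadv"/></l><r>whereas<s n="cnjadv"/></r></p></e>',
    '<e r="RL"><p><l>dum<s n="cnjadv"/></l><r>while<s n="cnjadv"/></r></p></e>',
]


def entries_with_tag(lines, tag):
    """Return the lines that contain <s n="tag"/> (hypothetical helper)."""
    pattern = re.compile(r'<s n="%s"/>' % re.escape(tag))
    # Each line holds one complete entry, so a per-line match selects whole entries.
    return [line for line in lines if pattern.search(line)]


for line in entries_with_tag(lines, "cnjadv"):
    print(line)
```

Because every line is a complete <e> element, the matching lines can be written straight back out as valid entries, which would not hold for the indented six-line layout.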

Aligned formatting

You can also add two parameters, namely the position of the <p> element and the position of the <r> element. Here alignP = 10 and alignR = 50:

    <!-- Conjunctions - Conjunctive adverb  -->

<e>       <p><l>antaŭ<b/>ol<s n="cnjadv"/></l>    <r>before<s n="cnjadv"/></r></p></e>
<e>       <p><l>tiel<b/>ke<s n="cnjadv"/></l>     <r>so<b/>that<s n="cnjadv"/></r></p></e>
<e>       <p><l>krom<b/>se<s n="cnjadv"/></l>     <r>unless<s n="cnjadv"/></r></p></e>
<e>       <p><l>dum<s n="cnjadv"/></l>            <r>whereas<s n="cnjadv"/></r></p></e>
<e>       <p><l>ĉar<s n="cnjadv"/></l>            <r>because<s n="cnjadv"/></r></p></e>
<e r="RL"><p><l>dum<s n="cnjadv"/></l>            <r>while<s n="cnjadv"/></r></p></e>
<e>       <p><l>ĝis<s n="cnjadv"/></l>            <r>until<s n="cnjadv"/></r></p></e>
<e>       <p><l>kiam<s n="cnjadv"/></l>           <r>when<s n="cnjadv"/></r></p></e>
<e i="yes"><p><l>kiam<s n="cnjadv"/></l>          <r>as<s n="cnjadv"/></r></p></e>
<e>       <p><l>kiel<s n="cnjadv"/></l>           <r>as<s n="cnjadv"/></r></p></e>
<e r="LR"><p><l>pro<b/>tio<b/>ke<s n="cnjadv"/></l><r>since<s n="cnjadv"/></r></p></e>

If either value is zero or negative, no alignment will be done for the corresponding element.
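
The padding behaviour seen in the example above can be sketched as follows. This is only an illustration that is consistent with the sample output on this page, not the actual Apertium-dixtools code; the function align_entry and its parameters e_open, p_part and r_part are hypothetical names, and columns are assumed to be counted from 0.

```python
def align_entry(e_open, p_part, r_part, align_p=0, align_r=0):
    """Pad a one-line entry so <p> starts at align_p and <r> at align_r.

    Hypothetical sketch: a value of 0 or less disables that alignment, and
    text that is already past the target column is left unpadded.
    """
    line = e_open
    if align_p > 0:
        line = line.ljust(align_p)  # pad with spaces up to the <p> column
    line += p_part
    if align_r > 0:
        line = line.ljust(align_r)  # pad with spaces up to the <r> column
    return line + r_part


entry = align_entry(
    "<e>",
    '<p><l>dum<s n="cnjadv"/></l>',
    '<r>whereas<s n="cnjadv"/></r></p></e>',
    align_p=10,
    align_r=50,
)
print(entry)
print(entry.index("<p>"), entry.index("<r>"))  # 10 50
```

Under this reading, an opening tag longer than the <p> column, such as <e i="yes">, simply pushes <p> to the right, which matches the kiam entry in the example above.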

Usage

Usage: dictools format-1line [alignP alignR] <input-dic> <output-dic>
       where alignP / alignR: column to align <p> and <r> entries. 0 = no indent.

Example: ' format-1line old.dix new.dix '   will give indent à la
<e><p><l>dum<s n="cnjadv"/></l><r>whereas<s n="cnjadv"/></r></p></e>

Example: ' format-1line 10 50 old.dix new.dix '   will give indent à la
<e>       <p><l>dum<s n="cnjadv"/></l>            <r>whereas<s n="cnjadv"/></r></p></e>

Example: ' format-1line 0 50 old.dix new.dix '   will give indent à la
<e><p><l>dum<s n="cnjadv"/></l>                   <r>whereas<s n="cnjadv"/></r></p></e>

Example: ' format-1line 10 0 old.dix new.dix '   will give indent à la
<e>       <p><l>dum<s n="cnjadv"/></l><r>whereas<s n="cnjadv"/></r></p></e>