Difference between revisions of "Talk:Metadix"

Latest revision as of 08:20, 25 April 2016

Proposal for several prm's

At the Apertium EGM at FreeRBMT, Sergio proposed to integrate an enhanced version of metadix into the compiler; as I understood it, the new syntax will look similar to this:

  <pardef n="v/[a][ter]__n">
    <e>
      <p>
        <l><prm n="1"/><prm n="3"/></l>
        <r><prm n="1"/><prm n="3"/><s n="n"/><s n="sg"/><s n="nom"/></r>
      </p>
    </e>
    <e>
      <p>
        <l><prm n="2"/><prm n="3"/></l>
        <r><prm n="1"/><prm n="3"/><s n="n"/><s n="pl"/><s n="nom"/></r>
      </p>
    </e>
    <e>
      <p>
        <l><prm n="1"/><prm n="3"/>s</l>
        <r><prm n="1"/><prm n="3"/><s n="n"/><s n="sg"/><s n="gen"/></r>
      </p>
    </e>
    <e>
      <p>
        <l><prm n="2"/><prm n="3"/>n</l>
        <r><prm n="1"/><prm n="3"/><s n="n"/><s n="sg"/><s n="dat"/></r>
      </p>
    </e>
  </pardef>

  <e lm="vater"><i>v</i><par n="v/[a][ter]__n" prm="a" prm="ä" prm="ter"/></e>

That is, <prm> elements are numbered (so they can be placed freely anywhere within the pardef), while prm attributes are matched to those numbers by the order in which they are written; the above paradigm with the specified parameters will expand to this:

  <pardef n="v/[a][ter]__n">
    <e>
      <p>
        <l>ater</l>
        <r>ater<s n="n"/><s n="sg"/><s n="nom"/></r>
      </p>
    </e>
    <e>
      <p>
        <l>äter</l>
        <r>ater<s n="n"/><s n="pl"/><s n="nom"/></r>
      </p>
    </e>
    <e>
      <p>
        <l>aters</l>
        <r>ater<s n="n"/><s n="sg"/><s n="gen"/></r>
      </p>
    </e>
    <e>
      <p>
        <l>ätern</l>
        <r>ater<s n="n"/><s n="sg"/><s n="dat"/></r>
      </p>
    </e>
  </pardef>
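
The expansion above can be sketched in a few lines of Python. This is only an illustration of the substitution as I read the proposal, not how the compiler does or would do it; the function names `render` and `expand` are made up here, symbols are printed as `<n>` purely for readability, and the parameters are passed as a Python list because the repeated prm attributes cannot be read by an XML parser.

```python
import xml.etree.ElementTree as ET

# The pardef from above, one entry per line.
PARDEF = """
<pardef n="v/[a][ter]__n">
  <e><p><l><prm n="1"/><prm n="3"/></l><r><prm n="1"/><prm n="3"/><s n="n"/><s n="sg"/><s n="nom"/></r></p></e>
  <e><p><l><prm n="2"/><prm n="3"/></l><r><prm n="1"/><prm n="3"/><s n="n"/><s n="pl"/><s n="nom"/></r></p></e>
  <e><p><l><prm n="1"/><prm n="3"/>s</l><r><prm n="1"/><prm n="3"/><s n="n"/><s n="sg"/><s n="gen"/></r></p></e>
  <e><p><l><prm n="2"/><prm n="3"/>n</l><r><prm n="1"/><prm n="3"/><s n="n"/><s n="sg"/><s n="dat"/></r></p></e>
</pardef>
"""


def render(side, params):
    """Flatten an <l> or <r> element, filling each <prm n="k"/> with params[k-1]."""
    out = [side.text or ""]
    for child in side:
        if child.tag == "prm":
            out.append(params[int(child.get("n")) - 1])
        elif child.tag == "s":
            out.append("<" + child.get("n") + ">")  # display form only
        out.append(child.tail or "")  # literal text following the child
    return "".join(out)


def expand(pardef_xml, params):
    """Return (left, right) string pairs for every <e> in the pardef."""
    root = ET.fromstring(pardef_xml)
    pairs = []
    for entry in root.iter("e"):
        pair = entry.find("p")
        pairs.append((render(pair.find("l"), params),
                      render(pair.find("r"), params)))
    return pairs


# Parameters in the order they appear on the <par> call for "vater".
for left, right in expand(PARDEF, ["a", "ä", "ter"]):
    print(left, right)
```

The `v` from `<i>v</i>` is not part of the paradigm, so a caller would still prepend it to both sides to get the full forms.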


Kurdish verb example

This'd be great to have for Kurdish, where we have

    <!-- gotin; got; bej -->
    <e lm="gotin"><p><l>got</l><r>gotin</r></p><par n="kir/__vblex_tv"/></e>
    <e lm="gotin"><p><l>dibej</l><r>gotin</r></p><par n="dik/e__vblex_tv"/></e>
    <e lm="gotin"><p><l>bibej</l><r>gotin</r></p><par n="bik/e__vblex_tv"/></e>
    <e lm="gotin"><p><l>negot</l><r>gotin</r></p><par n="nekir/__vblex_tv"/></e>
    <e lm="gotin"><p><l>nabej</l><r>gotin</r></p><par n="nak/e__vblex_tv"/></e>
    <e lm="gotin"><p><l>nebej</l><r>gotin</r></p><par n="nek/e__vblex_tv"/></e>

    <!-- parastin; parast; parêz -->
    <e lm="parastin"><p><l>parast</l><r>parastin</r></p><par n="kir/__vblex_tv"/></e>
    <e lm="parastin"><p><l>diparêz</l><r>parastin</r></p><par n="dik/e__vblex_tv"/></e>
    <e lm="parastin"><p><l>biparêz</l><r>parastin</r></p><par n="bik/e__vblex_tv"/></e>
    <e lm="parastin"><p><l>neparast</l><r>parastin</r></p><par n="nekir/__vblex_tv"/></e>
    <e lm="parastin"><p><l>naparêz</l><r>parastin</r></p><par n="nak/e__vblex_tv"/></e>
    <e lm="parastin"><p><l>neparêz</l><r>parastin</r></p><par n="nek/e__vblex_tv"/></e>

The above could instead be as simple as

    <e lm="gotin"><par n="[got]in_di[bej]__vblex_tv" prm="got" prm="bej"/></e>
    <e lm="parastin"><par n="[got]in_di[bej]__vblex_tv" prm="parast" prm="parêz"/></e>

where the pardef has

  <pardef n="[got]in_di[bej]__vblex_tv" nprm="2">
    <e><p><l><prm n="1"/></l><r><prm n="1"/>in</r></p><par n="kir/__vblex_tv"/></e>
    <e><p><l>di<prm n="2"/></l><r><prm n="1"/>in</r></p><par n="dik/e__vblex_tv"/></e>
    <e><p><l>bi<prm n="2"/></l><r><prm n="1"/>in</r></p><par n="bik/e__vblex_tv"/></e>
    <e><p><l>ne<prm n="1"/></l><r><prm n="1"/>in</r></p><par n="nekir/__vblex_tv"/></e>
    <e><p><l>na<prm n="2"/></l><r><prm n="1"/>in</r></p><par n="nak/e__vblex_tv"/></e>
    <e><p><l>ne<prm n="2"/></l><r><prm n="1"/>in</r></p><par n="nek/e__vblex_tv"/></e>
  </pardef>
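
As a sanity check that this pardef really regenerates the hand-written entries, here is a rough Python sketch that does the substitution as plain text. `expand_entries` is a hypothetical helper, not anything in lttoolbox, and copying the lemma into `lm` is my assumption about what the expanded entries should look like.

```python
import re

NPRM = 2  # mirrors nprm="2" on the pardef

# Body of the pardef, one entry per line.
PARDEF_BODY = """\
<e><p><l><prm n="1"/></l><r><prm n="1"/>in</r></p><par n="kir/__vblex_tv"/></e>
<e><p><l>di<prm n="2"/></l><r><prm n="1"/>in</r></p><par n="dik/e__vblex_tv"/></e>
<e><p><l>bi<prm n="2"/></l><r><prm n="1"/>in</r></p><par n="bik/e__vblex_tv"/></e>
<e><p><l>ne<prm n="1"/></l><r><prm n="1"/>in</r></p><par n="nekir/__vblex_tv"/></e>
<e><p><l>na<prm n="2"/></l><r><prm n="1"/>in</r></p><par n="nak/e__vblex_tv"/></e>
<e><p><l>ne<prm n="2"/></l><r><prm n="1"/>in</r></p><par n="nek/e__vblex_tv"/></e>
"""

PRM = re.compile(r'<prm n="(\d+)"/>')


def expand_entries(lemma, params):
    """Fill in the numbered <prm/> slots and stamp each entry with the lemma."""
    if len(params) != NPRM:
        raise ValueError("expected %d parameters, got %d" % (NPRM, len(params)))
    entries = []
    for line in PARDEF_BODY.splitlines():
        filled = PRM.sub(lambda m: params[int(m.group(1)) - 1], line)
        entries.append(filled.replace("<e>", '<e lm="%s">' % lemma, 1))
    return entries


for entry in expand_entries("parastin", ["parast", "parêz"]):
    print(entry)
```

Calling it with `"gotin", ["got", "bej"]` gives back the six gotin entries listed earlier, so the two-line version loses nothing.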

Identical attributes invalid?

To be well-formed XML (an element can't carry the same attribute name twice), wouldn't it have to be prm1="foo", prm2="bar" etc? (We could simply put prm's up to some high-enough-but-finite number in the DTD.)
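
A quick way to see the problem is to feed both spellings to an XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; `is_well_formed` is just a throwaway helper, and the prm1/prm2 names are the ones floated above, not anything that exists in the DTD today.

```python
import xml.etree.ElementTree as ET


def is_well_formed(snippet):
    """Return True if the snippet parses as XML, False otherwise."""
    try:
        ET.fromstring(snippet)
    except ET.ParseError:
        return False
    return True


# Same attribute name twice, as in the proposed syntax.
DUPLICATED = '<par n="[got]in_di[bej]__vblex_tv" prm="got" prm="bej"/>'
# Numbered attribute names (hypothetical prm1, prm2, ...).
NUMBERED = '<par n="[got]in_di[bej]__vblex_tv" prm1="got" prm2="bej"/>'

print(is_well_formed(DUPLICATED))
print(is_well_formed(NUMBERED))
```

The first call is rejected by the parser before any DTD comes into play, which suggests an unmodified XML parser could not read the repeated-attribute form at all.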


The alternative, more XML-like, would be to allow child elements inside the par, i.e.

    <e lm="gotin"><par n="[got]in_di[bej]__vblex_tv"><prm>got</prm><prm>bej</prm></par></e>
    <e lm="parastin"><par n="[got]in_di[bej]__vblex_tv"><prm>parast</prm><prm>parêz</prm></par></e>

That might actually be simpler all round?
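
With child elements the parameter order comes for free from document order. Here is a minimal Python sketch, assuming each value is carried as the text of a `<prm>` child (a hypothetical spelling; nothing in the proposal fixes it), and with `call_parameters` being a name invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical child-element spelling of a parametrised <par> call.
ENTRY = ('<e lm="parastin"><par n="[got]in_di[bej]__vblex_tv">'
         '<prm>parast</prm><prm>parêz</prm></par></e>')


def call_parameters(entry_xml):
    """Return (paradigm name, parameters in document order) for an <e>."""
    par = ET.fromstring(entry_xml).find("par")
    return par.get("n"), [prm.text or "" for prm in par.findall("prm")]


name, params = call_parameters(ENTRY)
print(name, params)
```

No cap on the number of parameters is needed in this form, since the children are simply counted as they are read.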