Talk:Metadix

Proposal for several prm's

At the Apertium EGM at FreeRBMT, Sergio proposed to integrate an enhanced version of metadix into the compiler; as I understood it, the new syntax will look similar to this:

  <pardef n="v/[a][ter]__n">
    <e>
      <p>
        <l><prm n="1"/><prm n="3"/></l>
        <r><prm n="1"/><prm n="3"/><s n="n"/><s n="sg"/><s n="nom"/></r>
      </p>
    </e>
    <e>
      <p>
        <l><prm n="2"/><prm n="3"/></l>
        <r><prm n="1"/><prm n="3"/><s n="n"/><s n="pl"/><s n="nom"/></r>
      </p>
    </e>
    <e>
      <p>
        <l><prm n="1"/><prm n="3"/>s</l>
        <r><prm n="1"/><prm n="3"/><s n="n"/><s n="sg"/><s n="gen"/></r>
      </p>
    </e>
    <e>
      <p>
        <l><prm n="2"/><prm n="3"/>n</l>
        <r><prm n="1"/><prm n="3"/><s n="n"/><s n="pl"/><s n="dat"/></r>
      </p>
    </e>
  </pardef>

  <e lm="vater"><i>v</i><par n="v/[a][ter]__n" prm="a" prm="ä" prm="ter"/></e>

That is, <prm> elements will be numbered (so they may be placed anywhere within the pardef), while the prm attributes on <par> are matched to them by position: the first prm attribute fills <prm n="1"/>, the second fills <prm n="2"/>, and so on. With the parameters given above, the paradigm will expand to this:

  <pardef n="v/[a][ter]__n">
    <e>
      <p>
        <l>ater</l>
        <r>ater<s n="n"/><s n="sg"/><s n="nom"/></r>
      </p>
    </e>
    <e>
      <p>
        <l>äter</l>
        <r>ater<s n="n"/><s n="pl"/><s n="nom"/></r>
      </p>
    </e>
    <e>
      <p>
        <l>aters</l>
        <r>ater<s n="n"/><s n="sg"/><s n="gen"/></r>
      </p>
    </e>
    <e>
      <p>
        <l>ätern</l>
        <r>ater<s n="n"/><s n="pl"/><s n="dat"/></r>
      </p>
    </e>
  </pardef>
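
As a rough illustration of the expansion described above, here is a minimal Python sketch. It is not the lttoolbox implementation, and the name `expand_pardef` is made up for this example. It replaces each numbered <prm> element with the value at that position in a list, and only covers the first three entries of the paradigm.

```python
import copy
import xml.etree.ElementTree as ET

# First three entries of the pardef above, written compactly.
PARDEF = """<pardef n="v/[a][ter]__n">
  <e><p><l><prm n="1"/><prm n="3"/></l><r><prm n="1"/><prm n="3"/><s n="n"/><s n="sg"/><s n="nom"/></r></p></e>
  <e><p><l><prm n="2"/><prm n="3"/></l><r><prm n="1"/><prm n="3"/><s n="n"/><s n="pl"/><s n="nom"/></r></p></e>
  <e><p><l><prm n="1"/><prm n="3"/>s</l><r><prm n="1"/><prm n="3"/><s n="n"/><s n="sg"/><s n="gen"/></r></p></e>
</pardef>"""


def expand_pardef(pardef, values):
    """Return a copy of pardef with every <prm n="k"/> replaced by values[k-1].

    Hypothetical helper: `values` stands in for the positional prm
    attributes of the calling <par>.
    """
    result = copy.deepcopy(pardef)
    for parent in list(result.iter()):
        previous = None  # last surviving sibling before the current child
        for child in list(parent):
            if child.tag != "prm":
                previous = child
                continue
            index = int(child.get("n"))
            if not 1 <= index <= len(values):
                raise ValueError(f'no value for <prm n="{index}"/>')
            # The replacement text absorbs whatever followed the <prm/>.
            text = values[index - 1] + (child.tail or "")
            if previous is None:
                parent.text = (parent.text or "") + text
            else:
                previous.tail = (previous.tail or "") + text
            parent.remove(child)
    return result


expanded = expand_pardef(ET.fromstring(PARDEF), ["a", "ä", "ter"])
surface_forms = [l.text for l in expanded.iter("l")]  # left sides
lemma_sides = [r.text for r in expanded.iter("r")]    # text before the first <s>
```

With the parameters a, ä, ter the left sides come out as ater, äter, aters, and every right side starts with ater, as in the expansion above.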


Kurdish verb example

This'd be great to have for Kurdish, where we have

    <!-- gotin; got; bej -->
    <e lm="gotin"><p><l>got</l><r>gotin</r></p><par n="kir/__vblex_tv"/></e>
    <e lm="gotin"><p><l>dibej</l><r>gotin</r></p><par n="dik/e__vblex_tv"/></e>
    <e lm="gotin"><p><l>bibej</l><r>gotin</r></p><par n="bik/e__vblex_tv"/></e>
    <e lm="gotin"><p><l>negot</l><r>gotin</r></p><par n="nekir/__vblex_tv"/></e>
    <e lm="gotin"><p><l>nabej</l><r>gotin</r></p><par n="nak/e__vblex_tv"/></e>
    <e lm="gotin"><p><l>nebej</l><r>gotin</r></p><par n="nek/e__vblex_tv"/></e>

    <!-- parastin; parast; parêz -->
    <e lm="parastin"><p><l>parast</l><r>parastin</r></p><par n="kir/__vblex_tv"/></e>
    <e lm="parastin"><p><l>diparêz</l><r>parastin</r></p><par n="dik/e__vblex_tv"/></e>
    <e lm="parastin"><p><l>biparêz</l><r>parastin</r></p><par n="bik/e__vblex_tv"/></e>
    <e lm="parastin"><p><l>neparast</l><r>parastin</r></p><par n="nekir/__vblex_tv"/></e>
    <e lm="parastin"><p><l>naparêz</l><r>parastin</r></p><par n="nak/e__vblex_tv"/></e>
    <e lm="parastin"><p><l>neparêz</l><r>parastin</r></p><par n="nek/e__vblex_tv"/></e>

The above could instead be as simple as

    <e lm="gotin"><par n="[got]in_di[bej]__vblex_tv" prm="got" prm="bej"/></e>
    <e lm="parastin"><par n="[got]in_di[bej]__vblex_tv" prm="parast" prm="parêz"/></e>

where the pardef has

  <pardef n="[got]in_di[bej]__vblex_tv" nprm="2">
    <e><p><l><prm n="1"/></l><r><prm n="1"/>in</r></p><par n="kir/__vblex_tv"/></e>
    <e><p><l>di<prm n="2"/></l><r><prm n="1"/>in</r></p><par n="dik/e__vblex_tv"/></e>
    <e><p><l>bi<prm n="2"/></l><r><prm n="1"/>in</r></p><par n="bik/e__vblex_tv"/></e>
    <e><p><l>ne<prm n="1"/></l><r><prm n="1"/>in</r></p><par n="nekir/__vblex_tv"/></e>
    <e><p><l>na<prm n="2"/></l><r><prm n="1"/>in</r></p><par n="nak/e__vblex_tv"/></e>
    <e><p><l>ne<prm n="2"/></l><r><prm n="1"/>in</r></p><par n="nek/e__vblex_tv"/></e>
  </pardef>

Identical attributes invalid?

To be valid XML, wouldn't it have to be prm1="foo", prm2="bar" etc? (We could simply put prm's up to some high-enough-but-finite number in the DTD.)
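
A quick way to check this is with Python's standard `xml.etree.ElementTree` parser. The sketch below is only an illustration: the prm1/prm2/prm3 attribute names are the suggestion from this comment, not an existing dix feature, and `numbered_params` is a made-up helper.

```python
import xml.etree.ElementTree as ET

REPEATED = '<par n="v/[a][ter]__n" prm="a" prm="ä" prm="ter"/>'
NUMBERED = '<par n="v/[a][ter]__n" prm1="a" prm2="ä" prm3="ter"/>'


def is_well_formed(xml_text):
    """True if the parser accepts xml_text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def numbered_params(par):
    """Collect prm1, prm2, ... in order, stopping at the first missing one."""
    values = []
    while True:
        value = par.get(f"prm{len(values) + 1}")
        if value is None:
            return values
        values.append(value)


repeated_ok = is_well_formed(REPEATED)   # repeated attribute name
numbered_ok = is_well_formed(NUMBERED)   # distinct attribute names
params = numbered_params(ET.fromstring(NUMBERED))
```

The parser rejects the version with three prm attributes, because an XML element may not carry the same attribute name twice, and it accepts the numbered version.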


The alternative, more XML-like, would be to allow child elements inside the par, i.e.

    <e lm="gotin"><par n="[got]in_di[bej]__vblex_tv"><prm="got"/><prm="bej"/></par></e>
    <e lm="parastin"><par n="[got]in_di[bej]__vblex_tv"><prm="parast"/><prm="parêz"/></par></e>

– that might actually be simpler all round?
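
For comparison, reading parameters from child elements could look like the Python sketch below. Written out as well-formed XML, each value needs either an attribute name or has to be the element's text; this sketch assumes the text form, `<prm>got</prm>`, which is a guess at the intended syntax. `child_params` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Assumed spelling of the child-element proposal: values as element text.
ENTRY = (
    '<e lm="gotin"><par n="[got]in_di[bej]__vblex_tv">'
    "<prm>got</prm><prm>bej</prm>"
    "</par></e>"
)


def child_params(par):
    """Return the text of each <prm> child of par, in document order."""
    return [(child.text or "") for child in par.findall("prm")]


entry = ET.fromstring(ENTRY)
params = child_params(entry.find("par"))
```

Since child elements keep their document order, the position of each value is explicit and there is no fixed upper limit on the number of parameters.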