Talk:Metadix
Proposal for several prm's
At the Apertium EGM at FreeRBMT, Sergio proposed to integrate an enhanced version of metadix into the compiler; as I understood it, the new syntax will look similar to this:
<pardef n="v/[a][ter]__n">
  <e><p><l><prm n="1"/><prm n="3"/></l><r><prm n="1"/><prm n="3"/><s n="n"/><s n="sg"/><s n="nom"/></r></p></e>
  <e><p><l><prm n="2"/><prm n="3"/></l><r><prm n="1"/><prm n="3"/><s n="n"/><s n="pl"/><s n="nom"/></r></p></e>
  <e><p><l><prm n="1"/><prm n="3"/>s</l><r><prm n="1"/><prm n="3"/><s n="n"/><s n="sg"/><s n="gen"/></r></p></e>
  <e><p><l><prm n="2"/><prm n="3"/>n</l><r><prm n="1"/><prm n="3"/><s n="n"/><s n="sg"/><s n="dat"/></r></p></e>
</pardef>
<e lm="vater"><i>v</i><par n="v/[a][ter]__n" prm="a" prm="ä" prm="ter"/></e>
That is, <prm> elements are numbered (so they can be placed freely anywhere within the pardef), while the prm attributes on the <par> are matched to those numbers by the order in which they appear. With the parameters specified above, the paradigm expands to this:
<pardef n="v/[a][ter]__n">
  <e><p><l>ater</l><r>ater<s n="n"/><s n="sg"/><s n="nom"/></r></p></e>
  <e><p><l>äter</l><r>ater<s n="n"/><s n="pl"/><s n="nom"/></r></p></e>
  <e><p><l>aters</l><r>ater<s n="n"/><s n="sg"/><s n="gen"/></r></p></e>
  <e><p><l>ätern</l><r>ater<s n="n"/><s n="sg"/><s n="dat"/></r></p></e>
</pardef>
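To make the substitution rule concrete, here is a minimal Python sketch of the expansion as I understand it from the description above. It is not the compiler's implementation: the function names `expand_prms` and `surface_forms` are made up for illustration, and the parameters are passed in as a plain list rather than read from the prm attributes.

```python
import xml.etree.ElementTree as ET

# Two of the four entries from the proposed pardef above.
PARDEF = """
<pardef n="v/[a][ter]__n">
  <e><p><l><prm n="1"/><prm n="3"/></l><r><prm n="1"/><prm n="3"/><s n="n"/><s n="sg"/><s n="nom"/></r></p></e>
  <e><p><l><prm n="2"/><prm n="3"/>n</l><r><prm n="1"/><prm n="3"/><s n="n"/><s n="sg"/><s n="dat"/></r></p></e>
</pardef>
"""


def expand_prms(root, params):
    """Replace each <prm n="k"/> under root with the text params[k - 1]."""
    for parent in list(root.iter()):  # snapshot, because the tree is edited below
        prev = None  # last kept sibling before the current child
        for child in list(parent):
            if child.tag != "prm":
                prev = child
                continue
            # The parameter value plus whatever literal text followed the <prm/>.
            text = params[int(child.get("n")) - 1] + (child.tail or "")
            if prev is None:
                parent.text = (parent.text or "") + text
            else:
                prev.tail = (prev.tail or "") + text
            parent.remove(child)
    return root


def surface_forms(pardef_xml, params):
    """Return (text of <l>, leading text of <r>) for every <p> after expansion."""
    root = expand_prms(ET.fromstring(pardef_xml), params)
    return [(p.find("l").text, p.find("r").text) for p in root.iter("p")]


if __name__ == "__main__":
    for left, right in surface_forms(PARDEF, ["a", "ä", "ter"]):
        print(left, right)
```

The numbering is what lets the same parameter appear several times and in any position; the implicit order only matters on the <par> side.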
This would be great to have for languages like Kurdish, where we currently have entries such as:
<!-- parastin; ; parast; parêz -->
<e lm="parastin"><p><l>parast</l><r>parastin</r></p><par n="kir/__vblex_tv"/></e>
<e lm="parastin"><p><l>diparêz</l><r>parastin</r></p><par n="dik/e__vblex_tv"/></e>
<e lm="parastin"><p><l>biparêz</l><r>parastin</r></p><par n="bik/e__vblex_tv"/></e>
<e lm="parastin"><p><l>neparast</l><r>parastin</r></p><par n="nekir/__vblex_tv"/></e>
<e lm="parastin"><p><l>naparêz</l><r>parastin</r></p><par n="nak/e__vblex_tv"/></e>
<e lm="parastin"><p><l>neparêz</l><r>parastin</r></p><par n="nek/e__vblex_tv"/></e>
The above could instead be as simple as
<e lm="parastin"><par n="[got]in_di[bej]__vblex_tv" prm="parast" prm="parêz"/></e>
where the pardef has
<pardef n="[got]in_di[bej]__vblex_tv" nprm="2">
  <e><p><l><prm n="1"/></l><r><prm n="1"/>in</r></p><par n="kir/__vblex_tv"/></e>
  <e><p><l>di<prm n="2"/></l><r><prm n="1"/>in</r></p><par n="dik/e__vblex_tv"/></e>
  <e><p><l>bi<prm n="2"/></l><r><prm n="1"/>in</r></p><par n="bik/e__vblex_tv"/></e>
  <e><p><l>ne<prm n="1"/></l><r><prm n="1"/>in</r></p><par n="nekir/__vblex_tv"/></e>
  <e><p><l>na<prm n="2"/></l><r><prm n="1"/>in</r></p><par n="nak/e__vblex_tv"/></e>
  <e><p><l>ne<prm n="2"/></l><r><prm n="1"/>in</r></p><par n="nek/e__vblex_tv"/></e>
</pardef>
Identical attributes invalid?
To be valid XML, wouldn't it have to be prm1="foo", prm2="bar" etc? (We could simply put prm's up to some high-enough-but-finite number in the DTD.)
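For what it's worth, XML 1.0 has a well-formedness constraint that an attribute name may appear only once per start tag, so any conforming parser should reject the repeated prm attributes before the compiler ever sees them. A quick check with Python's standard-library parser (the helper name `is_well_formed` is just for this sketch):

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the standard-library parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# The form from the proposal: the same attribute name three times.
REPEATED = '<par n="v/[a][ter]__n" prm="a" prm="ä" prm="ter"/>'
# The numbered alternative suggested above.
NUMBERED = '<par n="v/[a][ter]__n" prm1="a" prm2="ä" prm3="ter"/>'

if __name__ == "__main__":
    print("repeated prm attributes:", is_well_formed(REPEATED))
    print("numbered prm attributes:", is_well_formed(NUMBERED))
```

Note that this is a well-formedness problem, not only a DTD validity problem, so it cannot be worked around by changing the DTD.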
The alternative, which is more XML-like, would be to allow child elements of the par, i.e.
<e lm="parastin"><par n="[got]in_di[bej]__vblex_tv"><prm>parast</prm><prm>parêz</prm></par></e>
– that might actually be simpler all round?
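As a rough illustration of why the child-element form could be simpler: the parameters come out of any XML parser already in document order, with no numbering scheme needed on the calling side. The sketch below assumes each value is carried as the text of a prm child, which is only one possible spelling and is not settled anywhere in this discussion; `read_params` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed spelling: each parameter value is the text of a <prm> child of <par>.
ENTRY = (
    '<e lm="parastin">'
    '<par n="[got]in_di[bej]__vblex_tv"><prm>parast</prm><prm>parêz</prm></par>'
    '</e>'
)


def read_params(entry_xml):
    """Return (paradigm name, parameter values in document order)."""
    par = ET.fromstring(entry_xml).find("par")
    return par.get("n"), [prm.text or "" for prm in par.findall("prm")]


if __name__ == "__main__":
    name, params = read_params(ENTRY)
    print(name, params)
```

The resulting list can then be matched positionally against the numbered <prm n="k"/> placeholders in the pardef.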