Difference between revisions of "Talk:Multiwords"


Revision as of 10:26, 21 April 2008

Yet another

<mw n="dirección general">
  <lu lemma="dirección" tags="n.*" head/>
  <lu lemma="general" tags="adj.mf.*"/>
</mw>

<mw n="zračna luka">
  <lu lemma="zračna" tags="adj.*"/>
  <lu lemma="luka" tags="n.*" head/>
</mw>

Tags from the lu marked "head" are preserved, whereas the tags of the other lus are removed. So the analysis output would be:

^dirección<n><f><sg>$ ^general<adj><mf><sg>$ → ^dirección general<n><f><sg>$

While generation would look like:

^dirección general<n><f><pl>$ → ^dirección<n><f><pl>$ ^general<adj><mf><pl>$

Note how the tags fixed in each lu's tags attribute (n, adj.mf) are preserved, whereas the rest (those matched by *) are copied from the multiword.
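
The joining and splitting described above can be sketched in Python. This is only an illustration of the proposal, not existing Apertium code: the RULE dictionary layout and the names fixed_tags, join, split and fmt are hypothetical, and the sketch assumes that every tags pattern ends in * and that tags line up by position (the fixed tags of a pattern overwrite the first positions, the remaining positions are copied).

```python
# Hypothetical in-memory form of the <mw> rule for "dirección general".
RULE = {
    "name": "dirección general",
    "units": [
        {"lemma": "dirección", "tags": "n.*", "head": True},
        {"lemma": "general", "tags": "adj.mf.*", "head": False},
    ],
}


def fixed_tags(pattern):
    """Return the tags that precede the trailing '*' in a pattern like 'adj.mf.*'."""
    parts = pattern.split(".")
    if parts[-1] != "*":
        raise ValueError("this sketch only handles patterns ending in '*'")
    return parts[:-1]


def join(rule, analysed):
    """Analysis direction: merge matching lexical units, keeping the head's tags."""
    if len(analysed) != len(rule["units"]):
        return None
    head = None
    for (lemma, tags), unit in zip(analysed, rule["units"]):
        fixed = fixed_tags(unit["tags"])
        if lemma != unit["lemma"] or tags[:len(fixed)] != fixed:
            return None  # this unit does not match the rule
        if unit["head"]:
            head = tags
    return (rule["name"], head)


def split(rule, tags):
    """Generation direction: keep each unit's fixed tags, copy the rest by position."""
    result = []
    for unit in rule["units"]:
        fixed = fixed_tags(unit["tags"])
        result.append((unit["lemma"], fixed + tags[len(fixed):]))
    return result


def fmt(lemma, tags):
    """Render a lexical unit in the ^lemma<tag>...$ notation used above."""
    return "^" + lemma + "".join("<" + t + ">" for t in tags) + "$"


joined = join(RULE, [("dirección", ["n", "f", "sg"]), ("general", ["adj", "mf", "sg"])])
print(fmt(*joined))
print(" ".join(fmt(l, t) for l, t in split(RULE, ["n", "f", "pl"])))
```

Under these assumptions the two print calls reproduce the analysis and generation examples given above.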

Another option

<spectie> jimregan, you might be able to just do it with a dictionary
<jimregan> I'm listening
<spectie> ok
<spectie> so imagine:
<jimregan> (err... well, reading :)
<spectie> ah no
<spectie> because you'd need to enumerate the tags
<spectie> although, that might not be so difficult if we have lt-expand
<spectie> ok
<spectie> here:
<spectie> <e>
<spectie>   <p>
<spectie>     <l>strajk<s n="n"/><s n="m"/><s n="sg"/><s n="nom"/><b/>włoski<s n="adj"/><s n="m"/><s n="sg"/><s n="nom"/></l>
<spectie>     <r>strajk<b/>włoski<s n="n"/><s n="m"/><s n="sg"/><s n="nom"/></r>
<spectie>   </p>
<spectie> </e>
<spectie>  
<spectie> then you just run it through the lt-proc again with a special mode set
<spectie> you'd run that before the transfer
<spectie> and it would work for both analysis and generation
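
The "enumerate the tags" step spectie mentions could be sketched as follows. This is a rough illustration only: in practice the tag combinations would presumably come from lt-expand output, whereas here the lists NUMBERS and CASES are hard-coded assumptions, and the helper names s, entry and entries are hypothetical.

```python
from itertools import product

# Assumed tag inventories; a real run would take these from lt-expand output.
NUMBERS = ["sg", "pl"]
CASES = ["nom", "gen", "dat", "acc", "ins", "loc", "voc"]


def s(tags):
    """Render a list of tag names as <s n="..."/> elements."""
    return "".join('<s n="%s"/>' % t for t in tags)


def entry(noun, adj, gender, number, case):
    """Build one <e> pairing the two analysed words with the joined multiword."""
    noun_tags = ["n", gender, number, case]
    adj_tags = ["adj", gender, number, case]
    left = noun + s(noun_tags) + "<b/>" + adj + s(adj_tags)
    right = noun + "<b/>" + adj + s(noun_tags)
    return "<e><p><l>" + left + "</l><r>" + right + "</r></p></e>"


def entries(noun, adj, gender, numbers=NUMBERS, cases=CASES):
    """Enumerate one entry per number/case combination."""
    return [entry(noun, adj, gender, n, c) for n, c in product(numbers, cases)]


for e in entries("strajk", "włoski", "m", numbers=["sg"], cases=["nom", "gen"]):
    print(e)
```

The first entry printed has the same content as the hand-written <e> above, without the indentation.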

And another


<jimregan> something like this
<jimregan> <multiword n="noun-adj_np.top_f">
<jimregan>  <replacements>
<jimregan>   <replace><l><s n="adj"/></l><r><s n="np"/><s n="top"/></r></replace>
<jimregan>  </replacements>
<jimregan>  <join>
<jimregan>   <i><s n="f"/></i>
<jimregan>   <i><s n="nom"/></i>
<jimregan>   <i><s n="gen"/></i>
<jimregan>   <i><s n="acc"/></i>
<jimregan>   <i><s n="dat"/></i>
<jimregan>   <i><s n="loc"/></i>
<jimregan>   <i><s n="ins"/></i>
<jimregan>   <i><s n="voc"/></i>
<jimregan>  </join>
<jimregan>  <restrict>
<jimregan>   <i><s n="f"/></i>
<jimregan>   <i><s n="sg"/></i>
<jimregan>  </restrict>
<jimregan> </multiword>
<jimregan> <multiword n="noun-adj_noun">
<jimregan>  <replacements>
<jimregan>   <replace><l><s n="adj"/></l><r><s n="n"/></r></replace>
<jimregan>   <replace><l><s n="m"/></l><r><s n="m3"/></r></replace>
<jimregan>  </replacements>
<jimregan>  <join>
<jimregan>   <i><s n="nom"/></i>
<jimregan>   <i><s n="gen"/></i>
<jimregan>   <i><s n="acc"/></i>
<jimregan>   <i><s n="dat"/></i>
<jimregan>   <i><s n="loc"/></i>
<jimregan>   <i><s n="ins"/></i>
<jimregan>   <i><s n="voc"/></i>
<jimregan>  </join>
<jimregan>  <restrict>
<jimregan>   <i><s n="sg"/></i>
<jimregan>  </restrict>
<jimregan> </multiword>
<jimregan> <mw lm="Wielka Brytania" type="noun-adj_np.top_f">
<jimregan>  <i>Wiel</i><par n="wiel/ki__adj"/>
<jimregan>  <i><b/></i>
<jimregan>  <i>Brytani</i><par n="Francj/a__np"/>
<jimregan> </mw>
<jimregan> <mw lm="strajk włoski" type="noun-adj_noun">
<jimregan>  <i>strajk</i><par n="maluch/__n"/>
<jimregan>  <i><b/></i>
<jimregan>  <i>włos</i><par n="pols/ki__adj"/>
<jimregan> </mw>
<spectie> hmm
<spectie> whats the "join" thing ?
<jimregan> oops. wasn't meant to have '<i><s n="f"/></i>' in the '<join>' of the first, just in <restrict>
<jimregan> where that tag exists in each parameter, use that as output
<spectie> where would this be called ?
<spectie> after analysis ?
<jimregan> possibly, but for the moment I'm thinking of adding it as a generated subsection of the analyser
<spectie> what do you reckon to my idea ?
<jimregan> each 'mw' would be expanded to an '<e>'
<jimregan> the problem is that I don't want to keep the adjective pardefs as simple as possible
<spectie> you don't ?
<jimregan> 'strajk wloski' would have to be 'm3', not 'm'
<spectie> aha
<jimregan> but in most cases it doesn't make sense to have the adjectives consider masculine gender subtypes separately
<spectie> ah ok
<spectie> i was thinking of putting in mine after tagging
<jimregan> so I want to have a stylesheet replace 'adj.m' with 'n.m3' in the strajk wloski case
<spectie> hmm
<spectie> it would work
<spectie> you could make the "<join>" thing a paradigm
<spectie> e.g. <pardef n="cases"><e><i><s n="nom"/></i></e> ... </pardef>     <join><par n="cases"/></join>
<jimregan> aha
<jimregan> yes
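
One possible reading of the replacements, join and restrict elements discussed above, as a Python sketch. The semantics are guesses from the conversation, not a specification: join is taken to mean "the listed tags that every component carries", restrict to mean "every component must carry all listed tags", and replacements to rewrite individual tags. The dictionary NOUN_ADJ_NOUN and the function names are hypothetical.

```python
# Hypothetical in-memory form of <multiword n="noun-adj_noun">.
NOUN_ADJ_NOUN = {
    "replacements": {"adj": ["n"], "m": ["m3"]},
    "join": ["nom", "gen", "acc", "dat", "loc", "ins", "voc"],
    "restrict": ["sg"],
}


def apply_replacements(tags, replacements):
    """Rewrite each tag that has a replacement; leave the others untouched."""
    out = []
    for tag in tags:
        out.extend(replacements.get(tag, [tag]))
    return out


def joined_tags(join, components):
    """Return the join tags that occur in every component's tag list."""
    return [t for t in join if all(t in tags for tags in components)]


def passes_restrict(restrict, components):
    """Assumed meaning: every component must carry every restricted tag."""
    return all(t in tags for t in restrict for tags in components)


# "strajk włoski": noun tagged m3, adjective tagged plain m.
strajk = ["n", "m3", "sg", "nom"]
wloski = ["adj", "m", "sg", "nom"]

print(passes_restrict(NOUN_ADJ_NOUN["restrict"], [strajk, wloski]))
print(joined_tags(NOUN_ADJ_NOUN["join"], [strajk, wloski]))
print(apply_replacements(wloski, NOUN_ADJ_NOUN["replacements"]))
```

With this reading the adjective's adj.m becomes n.m3, which is the substitution jimregan describes for the strajk włoski case, while the adjective pardef itself keeps plain m.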