Separable verbs


Revision as of 18:00, 14 October 2007

Apertium may have some problems when dealing with separable verbs. Separable verbs are verbs formed from a verb stem and a particle. For further information see the Wikipedia article on separable verbs (http://en.wikipedia.org/wiki/Separable_verb). These exist in most Germanic languages, and also in languages such as Hungarian.

For example, in Afrikaans, the verb "to announce" is "aankondig". The usage is as follows:

  • Sterrekundiges kondig [die ontdekking] aan.
  • Astronomers announce [the discovery].

The stem "kondig" does not mean anything by itself, only in conjunction with the particle "aan". This is not true of every separable verb, however. The past participle is formed by inserting "ge" between the particle and the stem, for example:

  • Sterrekundiges het [die ontdekking] aangekondig.
  • Astronomers have announced [the discovery].

Currently Apertium has difficulty supporting this kind of feature in the morphological dictionaries.

Possible solutions

Several paradigms

Currently, in the Afrikaans-English pair, separable verbs are dealt with as follows. Three paradigms are defined for verbs: the first is a list of possible particles/affixes (for example, "aan", "op", "onder", ...), the second is the "ge" past tense marker, and the third is the standard verb ending paradigm.

So, for each separable verb, the definition looks something like:

  <e lm="kondig"><par n="attached__particles"/><par n="ge__past"/><i>kondig</i><par n="breek__vblex"/></e>

This allows us to analyse:

  • aankondig (announce)
  • aangekondig (announced)
  • kondig (announce). Note: this analysis is incorrect, because "kondig" means nothing without its particle!
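
The over-generation can be seen by treating the entry as a cross product of optional pieces. The following is only a rough Python sketch, not how lttoolbox works internally: the function name `expand_entry` is hypothetical, the particle list is cut down to "aan", the verb ending paradigm is left out, and it assumes that both the particle slot and the "ge" slot may be empty.

```python
from itertools import product

def expand_entry(stem, particles, markers=("", "ge")):
    """Return every surface form licensed by particle + marker + stem."""
    # Assumption: the particle slot is optional, so it also contains
    # the empty string; the marker slot is either empty or "ge".
    particle_slot = ("",) + tuple(particles)
    return sorted(p + m + stem for p, m in product(particle_slot, markers))

forms = expand_entry("kondig", ["aan"])
print(forms)
```

Under these assumptions the bare stem "kondig" is produced alongside "aankondig" and "aangekondig", which is the incorrect analysis noted in the list above.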

However, in an example such as the one above, where the "aan" portion is moved after the noun phrase in the sentence, we cannot analyse the verb and its particle together. We instead rely on the fact that "kondig" does not have a meaning without "aan". Unfortunately this is not always the case.

Take, for example, the verbs "onderdruk" and "druk". The former means "to suppress", the latter means "to press" or "to squeeze". So when we try to translate "onderdruk" → "suppress", we instead get "press under" or "squeeze under". This is not a good translation in this instance (although in many cases it can work, viz. "terugkry" → "kry terug" → "get back").

Furthermore, we cannot define "druk" as "suppress" and simply let the particle take care of itself, because "druk" has another meaning.

Infix paradigm

We could also consider using an infix paradigm. This is unclean in a different way from the previous method. For example, we would have a paradigm like ge__pref:

  <pardef n="ge__pref">
    <e lm="ge">
      <p>
        <l>ge</l>
        <r>ge</r>
      </p>
    </e>
    <e>
      <p>
        <l></l>
        <r></r>
      </p>
    </e>
  </pardef>

Note that in this case, we don't have any grammatical symbols on the right side. We then specify a multiword as follows:

  <e lm="wegloop"><i>weg</i><par n="ge__pref"/><i>loop</i><par n="breek__vblex"/></e>

This allows us to analyse:

  • wegloop (run away)
  • weggeloop (ran away)
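
To see what such an entry yields, the pieces can be expanded by hand. This is a minimal Python sketch under stated assumptions: `expand` and `GE_PREF` are hypothetical names, each piece is a list of (left, right) pairs mirroring the `<l>`/`<r>` sides of the paradigm, and the tags contributed by breek__vblex are ignored.

```python
from itertools import product

# (left, right) pairs, mirroring the two entries of the ge__pref paradigm.
GE_PREF = [("ge", "ge"), ("", "")]

def expand(pieces):
    """Map each surface form to the lexical string built from the same choices."""
    table = {}
    for choice in product(*pieces):
        surface = "".join(left for left, _ in choice)
        lexical = "".join(right for _, right in choice)
        table[surface] = lexical
    return table

wegloop = expand([[("weg", "weg")], GE_PREF, [("loop", "loop")]])
print(wegloop)
```

Because the right side of the paradigm carries "ge" as plain text rather than as a grammatical symbol, the two surface forms come out with two different lexical strings, "wegloop" and "weggeloop".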

This does not allow us to analyse simply "loop" (to run); we would need a separate paradigm for this. It also has the downside that both forms need to be specified in the bilingual dictionary, for example:

  <e><p><l>run<b/>away<s n="vblex"/></l><r>wegloop<s n="vblex"/></r></p></e>
  <e><p><l>run<b/>away<s n="vblex"/><s n="past"/></l><r>weggeloop<s n="vblex"/></r></p></e>

It remains to be seen whether the pay-off here, in the form of better translations, is worth the cost of duplicating entries. Furthermore, this still does not take care of "real separable" verbs.

Marking separable stems

Alternatively, we could mark the lemmata of verbs that can be used in separable contexts. We would then use rules to say, for example:

Sterrekundiges kondig [die ontdekking] aan.
               kondig NP               aan.   → announce NP

Sterrekundiges druk   [die ontdekking] onder.
               druk   NP               onder. → suppress NP

Sterrekundiges druk   [die ontdekking].
               druk   NP               ø      → press NP
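
The three rules above can be read as a lookup keyed on the stem together with the particle, or the absence of one. The following Python sketch only illustrates that idea; the table `TRANSLATIONS` and the function `translate_verb` are hypothetical and do not correspond to any existing Apertium module.

```python
# Keys are (stem, particle); None stands for "no particle" (ø above).
TRANSLATIONS = {
    ("kondig", "aan"): "announce",
    ("druk", "onder"): "suppress",
    ("druk", None): "press",
}

def translate_verb(stem, particle=None):
    """Return the English verb for a stem/particle pair, or None if unknown."""
    return TRANSLATIONS.get((stem, particle))

print(translate_verb("druk", "onder"))
print(translate_verb("druk"))
```

Keying on the pair rather than on the stem alone is what lets "druk" keep both of its meanings, while "kondig" without "aan" simply has no entry.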

This could be dealt with either in transfer or in pre-transfer. If it were dealt with in pre-transfer, the input would look like this:

^Sterrekundige<n><pl>$ ^kondig<vblex><pres><sep>$ ^die<det><def><sg>$ ^ontdekking<n><sg>$ ^aan<pr><sep>$^.<sent>$

Upon seeing the <sep> tag on the verb, the pre-transfer would consume NPs until reaching either <sent> or another word (an adverb, preposition, or other part of speech) carrying a <sep> tag. Upon finding this second tag, it would re-order the fragment as follows:

^Sterrekundige<n><pl>$ ^aankondig<vblex><pres>$ ^die<det><def><sg>$ ^ontdekking<n><sg>$^.<sent>$

The affix is put in its proper place before the verb, the <sep> tags are removed, and then the fragment is passed on to transfer.
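
The re-ordering step could be sketched along these lines. This is a hedged Python illustration, not an existing Apertium tool: `join_separable` is a hypothetical name, the regular expression handles only simple lexical units of the form ^lemma<tag>...$, the blanks between units are discarded, and dropping the <sep> tag from a verb with no matching particle is an assumption made here.

```python
import re

# Matches one lexical unit: ^lemma<tag1><tag2>...$
LU = re.compile(r"\^([^<$]+)((?:<[^>]+>)*)\$")

def join_separable(stream):
    """Return the lexical units of `stream` with separated particles re-attached."""
    units = [(m.group(1), re.findall(r"<([^>]+)>", m.group(2)))
             for m in LU.finditer(stream)]
    out = []
    i = 0
    while i < len(units):
        lemma, tags = units[i]
        if "vblex" in tags and "sep" in tags:
            # Skip ahead until the end of the sentence or another <sep> word.
            j = i + 1
            while (j < len(units) and "sent" not in units[j][1]
                   and "sep" not in units[j][1]):
                j += 1
            plain = [t for t in tags if t != "sep"]
            if j < len(units) and "sep" in units[j][1]:
                # Particle found: attach it before the stem and drop it
                # from its original position.
                out.append((units[j][0] + lemma, plain))
                out.extend(units[i + 1:j])
                i = j + 1
                continue
            # Assumption: with no particle, only the <sep> tag is removed.
            out.append((lemma, plain))
            i += 1
            continue
        out.append((lemma, tags))
        i += 1
    return ["^" + l + "".join("<" + t + ">" for t in ts) + "$" for l, ts in out]

example = ("^Sterrekundige<n><pl>$ ^kondig<vblex><pres><sep>$ "
           "^die<det><def><sg>$ ^ontdekking<n><sg>$ ^aan<pr><sep>$^.<sent>$")
print(" ".join(join_separable(example)))
```

Stopping the scan at <sent> keeps a verb such as "druk", used without a particle, from picking up a particle that belongs to the next sentence.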

Further reading

  • ten Hacken, P. and Bopp, S. (1998) "Separable Verbs in a Reusable Morphological Dictionary for German". Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics, pp. 471-475.