Difference between revisions of "Northern Sámi and Norwegian/bidix"

Revision as of 18:25, 10 June 2010

The apertium-sme-nob bidix makes heavy use of bidix pardefs. There are two main uses for these:

  • To change from sme PoS tags to nob PoS tags
  • To mark certain sme verbs as inherently passive/causative/reflexive
    • these markings in turn trigger certain transfer rules, most of them in the chunker (t1x)

The most complex part of the bidix is probably the verb section. A typical entry looks like:

<e><p><l>vurket<s n="V"/><s n="TV"/></l><r>oppbevare<s n="vblex"/><s n="pers"/></r></p><par n="__verb"/></e>

where "pers" marks that the agent is typically animate, and __verb handles the changes in tags for person, number and tense. However, we can also use another pardef, which does the same thing but additionally adds a causative tag "caus" that is picked up by transfer:

<e><p><l>divuhit<s n="V"/><s n="TV"/></l><r>reparere<s n="vblex"/><s n="pers"/></r></p><par n="caus__verb"/></e>

Here transfer will try to make a causative construction with this verb, by prepending "la", moving the finite tense onto "la", and putting the main verb in the infinitive.
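The restructuring described above can be pictured with a toy model. This is only a sketch and not the actual t1x rule: the function name `causative`, the list-of-pairs representation of lexical units, and the tag names "pret" and "inf" are assumptions made for illustration.

```python
def causative(lemma, tense):
    """Toy model of the causative construction.

    The auxiliary "la" is prepended and carries the finite tense;
    the main verb is turned into an infinitive.
    Tag names are assumed, not taken from the real transfer rules.
    """
    auxiliary = ("la", ["vblex", tense])
    main_verb = (lemma, ["vblex", "inf"])
    return [auxiliary, main_verb]


# A past-tense form of "reparere" marked with "caus" would come out
# as "la" in the past tense followed by the infinitive "reparere".
result = causative("reparere", "pret")
print(result)
```

The real rule works on chunks inside the transfer files and has to deal with agreement and word order, none of which this sketch attempts.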

Similarly, with

<e><p><l>viidánit<s n="V"/><s n="IV"/></l><r>spre<s n="vblex"/><s n="pers"/></r></p><par n="refl__verb"/></e>

transfer appends a reflexive pronoun (seg/meg/...) when it sees the "refl" tag added by refl__verb.

With

<e><p><l>suovganit<s n="V"/><s n="IV"/></l><r>slite<s n="vblex"/><s n="pers"/></r></p><par n="pass__verb"/></e>

we get a "pass" tag and a passive construction, with a participle (here: bli slitt). However, with the passive, the predicate might also be an adjective, which we mark like this:

<e><p><l>viessat<s n="V"/><s n="IV"/></l><r>trøtt<s n="adj"/><s n="pers"/></r></p><par n="pass__verb"/></e>

(other parts of speech for the passive predicates are currently TODO-marked in bidix)
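To see at a glance which marker tag each verb entry will hand to transfer, the entries can be read with an ordinary XML parser. The following is a minimal sketch, assuming entries are available as standalone strings; the helper `summarise` and the table `MARKER_BY_PARDEF` are hypothetical names, and the table only restates the pardef-to-tag pairs mentioned above.

```python
import xml.etree.ElementTree as ET

# Marker tags as described in the text; "__verb" adds no marker.
MARKER_BY_PARDEF = {
    "__verb": None,
    "caus__verb": "caus",
    "refl__verb": "refl",
    "pass__verb": "pass",
}

ENTRIES = [
    '<e><p><l>vurket<s n="V"/><s n="TV"/></l><r>oppbevare<s n="vblex"/><s n="pers"/></r></p><par n="__verb"/></e>',
    '<e><p><l>divuhit<s n="V"/><s n="TV"/></l><r>reparere<s n="vblex"/><s n="pers"/></r></p><par n="caus__verb"/></e>',
    '<e><p><l>viidánit<s n="V"/><s n="IV"/></l><r>spre<s n="vblex"/><s n="pers"/></r></p><par n="refl__verb"/></e>',
    '<e><p><l>viessat<s n="V"/><s n="IV"/></l><r>trøtt<s n="adj"/><s n="pers"/></r></p><par n="pass__verb"/></e>',
]


def summarise(entry_xml):
    """Extract lemmas, target-side tags, pardef name and marker tag from one <e>."""
    entry = ET.fromstring(entry_xml)
    left = entry.find("p/l")
    right = entry.find("p/r")
    pardef = entry.find("par").get("n")
    return {
        "sme": left.text,
        "nob": right.text,
        "nob_tags": [s.get("n") for s in right.findall("s")],
        "pardef": pardef,
        "marker": MARKER_BY_PARDEF.get(pardef),
    }


summaries = [summarise(xml_text) for xml_text in ENTRIES]
for item in summaries:
    print(item["sme"], "->", item["nob"], item["nob_tags"], item["marker"])
```

The sketch reads only the text directly after `<l>` or `<r>`, so multiword lemmas containing `<b/>` would need extra handling.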

The deverbal__n pardef is used to give lemma-specific overrides for the derivations (Der2.Actor, Der3.Der_n) which turn verbs into nouns:

<e><p><l>geavahit<s n="V"/><s n="TV"/></l><r>bruker<s n="n"/><s n="m"/></r></p><par n="deverbal__n"/></e>

(see Northern_Sámi_and_Norwegian#Derivations:_general_rules_and_exceptions)

It's up to transfer (mainly the chunker, t1x) to make sense of and clean up these tag combinations.