Northern Sámi and Norwegian/bidix


The apertium-sme-nob bidix (bilingual dictionary) makes heavy use of pardefs (paradigm definitions). There are two main uses for these:

  • To change from sme PoS tags to nob PoS tags
  • To mark certain sme verbs as inherently passive/causative/reflexive
    • these markings in turn trigger certain transfer rules, most of them in the chunker (t1x)

The most complex part of the bidix is probably the verb section. A typical verb entry looks like:

<e><p><l>vurket<s n="V"/><s n="TV"/></l><r>oppbevare<s n="vblex"/><s n="pers"/></r></p><par n="__verb"/></e>

where "pers" marks that the agent is typically animate, and __verb handles the tag changes for person, number and tense. However, another pardef can do the same thing and also add a causative tag "caus", which is picked up by transfer:

<e><p><l>divuhit<s n="V"/><s n="TV"/></l><r>reparere<s n="vblex"/><s n="pers"/></r></p><par n="caus__verb"/></e>

Here transfer will try to build a causative construction with this verb by prepending "la", which takes the finite tense, while the main verb becomes an infinitive.
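As a rough illustration of that restructuring (not the actual t1x rule), the following Python sketch splits one finite verb into "la" plus an infinitive. The function names `causative` and `render` are hypothetical, and the target-side tags "pret" and "inf" are assumed here for the example; the real rules live in the transfer files and may use different tags.

```python
def causative(lemma, tense):
    """Toy model of the causative rewrite: 'la' carries the finite tense,
    and the original verb is emitted as an infinitive.

    The tag names ("vblex", "inf", and whatever is passed as tense) are
    assumptions for illustration only.
    """
    return [("la", ["vblex", tense]), (lemma, ["vblex", "inf"])]


def render(units):
    """Render (lemma, tags) pairs in Apertium stream notation: ^lemma<tag>$."""
    return " ".join(
        "^%s%s$" % (lemma, "".join("<%s>" % tag for tag in tags))
        for lemma, tags in units
    )


print(render(causative("reparere", "pret")))
# ^la<vblex><pret>$ ^reparere<vblex><inf>$
```

The point of the sketch is only the division of labour: the tense moves to "la", and the lemma from the bidix entry keeps no finite marking.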

Similarly, with

<e><p><l>viidánit<s n="V"/><s n="IV"/></l><r>spre<s n="vblex"/><s n="pers"/></r></p><par n="refl__verb"/></e>

transfer appends a reflexive pronoun (seg/meg/...) when it sees the "refl" tag added by refl__verb.

With

<e><p><l>suovganit<s n="V"/><s n="IV"/></l><r>slite<s n="vblex"/><s n="pers"/></r></p><par n="pass__verb"/></e>

we get a "pass" tag and a passive construction, with a participle (here: bli slitt). However, with the passive, the predicate might also be an adjective, which we mark like this:

<e><p><l>viessat<s n="V"/><s n="IV"/></l><r>trøtt<s n="adj"/><s n="pers"/></r></p><par n="pass__verb"/></e>

(Other parts of speech for passive predicates are currently marked TODO in the bidix.)
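The verb pardefs described so far can be summarised as a small lookup table. This Python sketch is only a reading aid: the names `MARKER_BY_PARDEF`, `CONSTRUCTION` and `describe` are hypothetical, and the construction descriptions paraphrase the text above rather than quoting the transfer rules.

```python
# Marker tag added by each verb pardef, as described on this page.
# __verb only converts tags and adds no marker.
MARKER_BY_PARDEF = {
    "__verb": None,
    "caus__verb": "caus",
    "refl__verb": "refl",
    "pass__verb": "pass",
}

# What transfer is expected to build when it sees each marker.
CONSTRUCTION = {
    None: "plain verb",
    "caus": "la + infinitive",
    "refl": "verb + reflexive pronoun (seg/meg/...)",
    "pass": "passive with a participle or adjective predicate",
}

# (sme lemma, nob lemma, pardef) taken from the example entries.
ENTRIES = [
    ("vurket", "oppbevare", "__verb"),
    ("divuhit", "reparere", "caus__verb"),
    ("viidánit", "spre", "refl__verb"),
    ("suovganit", "slite", "pass__verb"),
]


def describe(pardef):
    """Return (marker tag, expected construction) for a verb pardef name."""
    marker = MARKER_BY_PARDEF[pardef]
    return marker, CONSTRUCTION[marker]


for sme, nob, pardef in ENTRIES:
    marker, construction = describe(pardef)
    print(f"{sme} -> {nob}: marker={marker!r}, {construction}")
```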

The deverbal__n pardef is used to give lemma-specific overrides for the derivations (Der2.Actor, Der3.Der_n) which turn verbs into nouns:

<e><p><l>geavahit<s n="V"/><s n="TV"/></l><r>bruker<s n="n"/><s n="m"/></r></p><par n="deverbal__n"/></e>

(see Northern_Sámi_and_Norwegian#Derivations:_general_rules_and_exceptions)

It's up to transfer (mainly the chunker, t1x) to make sense of and clean up these tag combinations.
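To see which tag combinations an entry hands to transfer, entries of the simple shape shown on this page can be read with Python's standard `xml.etree.ElementTree`. This is a minimal sketch, not part of the Apertium toolchain: `side` and `parse_entry` are hypothetical helper names, and the code assumes each side is a plain lemma followed by `<s n="..."/>` tags, so it does not cover other constructs a real bidix may contain.

```python
import xml.etree.ElementTree as ET

# Two entries copied from the examples above.
ENTRIES = [
    '<e><p><l>viessat<s n="V"/><s n="IV"/></l>'
    '<r>trøtt<s n="adj"/><s n="pers"/></r></p><par n="pass__verb"/></e>',
    '<e><p><l>geavahit<s n="V"/><s n="TV"/></l>'
    '<r>bruker<s n="n"/><s n="m"/></r></p><par n="deverbal__n"/></e>',
]


def side(elem):
    """Return (lemma, tags) for an <l> or <r> element.

    Assumes the lemma is the element's leading text and the tags are
    its direct <s> children.
    """
    return (elem.text or "", [s.get("n") for s in elem.findall("s")])


def parse_entry(xml_text):
    """Split one <e> entry into its sme side, nob side and pardef name."""
    entry = ET.fromstring(xml_text)
    pair = entry.find("p")
    par = entry.find("par")
    return {
        "sme": side(pair.find("l")),
        "nob": side(pair.find("r")),
        "pardef": None if par is None else par.get("n"),
    }


for text in ENTRIES:
    print(parse_entry(text))
```

Listing entries this way makes it easy to spot combinations such as an sme verb paired with a nob adjective under pass__verb, which transfer then has to resolve.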