Northern Sámi and Norwegian




Word order

sme not-V2 and nob V2

sme OV tendencies and no nob OV

nob particle verbs

Since we "just" need generation, we could do it with this method.

Definiteness in nob

Some contexts are relatively safe:

  • Attributive superlatives are definite, and have indefinite nouns: det.poss adj.sup n => det.poss adj.sup.def n
    1. min viktigste.def oppgave.ind
  • Predicative superlatives are almost always indefinite
    1. min oppgave.ind / oppgaven.def min er viktigst.ind
  • ...unless they have a definite determiner:
    1. min oppgave.ind er den viktigste.def
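The relatively safe cases above can be written down as a small decision function. This is only a sketch in Python, not Apertium transfer code; the function name and the "attributive"/"predicative" labels are hypothetical, and "almost always" in the second rule is treated here as "always".

```python
def superlative_definiteness(position, definite_determiner=False):
    """Guess the definiteness tag of a nob superlative adjective.

    position: "attributive" or "predicative" (hypothetical labels).
    definite_determiner: True for patterns like "den viktigste".
    """
    if position == "attributive":
        # min viktigste.def oppgave.ind: the adjective is definite,
        # while the noun after the possessive stays indefinite.
        return "def"
    if position == "predicative":
        # ... er viktigst.ind, unless a definite determiner precedes:
        # ... er den viktigste.def
        return "def" if definite_determiner else "ind"
    raise ValueError("unknown position: " + position)
```

For example, `superlative_definiteness("predicative", definite_determiner=True)` gives "def", matching "min oppgave er den viktigste".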

Others we have to guess.

Features we might be able to use: subject/object, theme/focus, prepositions?

  • Du dálkasis sáhtii leamaš ávki => Din(det.poss) medisin(ind) kan ha vært til nytte
  • Mánná oađđá => Barnet(def) sover (but is this ambiguous?)
  • Son lea čeahpes bárdni => Han er en(art) flink(ind) gutt(ind)
  • Dá livččii skeaŋka din čeahpes bárdnai => Her er en gave jeg kunne ønske å gi den(art) flinke(def) sønnen(def) deres(det.poss)

Sámi collective nouns are marked Coll, but nob.dix has neither collective nor mass noun marking, so that is probably not much help.


Case to preposition

This is for adverbial cases

Essive nouns (mánnán=>som barn) are ambiguous between sg and pl; can we just choose sg.ind all the time?

Case to object

Accusative objects are just translated. The issue here is definiteness.

Case to possessor phrase

  • Gen N goes to N-Def til Possessor; but for Bokmål, Gen's N would be simpler and fine in most cases
    • for both expressions, definiteness is more or less trivial

Postposition/number case choice

We can remove genitive case which is due to a postposition or after a number (or turn it into accusative for a pronoun).

  • garra.ADJ dálkki.N.GEN geažil.PO[GEN] => på_grunn_av.PR dårlig.ADJ vær.N
  • guokte.NUM biilla.N.SG.GEN => to.NUM biler.N.PL
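The rule above could be sketched roughly as follows. This is a Python illustration, not transfer-rule code; the token representation (lemma plus tag list) and the tag names PRON and ACC are assumptions, while ADJ, N, GEN, PO, NUM and SG follow the examples. The Sg to Pl change seen in "to biler" is not handled here.

```python
def resolve_governed_genitive(tokens):
    """Drop a genitive that is only there because of a postposition or a numeral.

    tokens: list of (lemma, tags) pairs, tags being a list of strings.
    A genitive pronoun is turned into accusative instead of losing its case.
    """
    out = []
    for i, (lemma, tags) in enumerate(tokens):
        prev_tags = tokens[i - 1][1] if i > 0 else []
        next_tags = tokens[i + 1][1] if i + 1 < len(tokens) else []
        governed = "GEN" in tags and ("PO" in next_tags or "NUM" in prev_tags)
        if governed and "PRON" in tags:
            tags = ["ACC" if t == "GEN" else t for t in tags]
        elif governed:
            tags = [t for t in tags if t != "GEN"]
        out.append((lemma, list(tags)))
    return out


phrase = [("garra", ["ADJ"]), ("dálkki", ["N", "GEN"]), ("geažil", ["PO"])]
print(resolve_governed_genitive(phrase))
```

Only the immediately adjacent token is inspected, which is enough for the two examples but would miss a genitive separated from its postposition by other material.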


Subject-verb agreement to be removed.

Subject insertion from pro-drop

Pro-drop sentences should have subjects inserted, observing the nob V2 rule:

  • Topicalised sentences
    • X + V goes to X + V + subjpron
    • X + Neg + V goes to X + V + subjpron + ikke
  • Verb-initial sentences
    • V goes to subjpron + V
    • Neg + V goes to subjpron + V + ikke

We could do this by changing a variable in the movement interchunk stage based on whether the pattern matches a subject or not.
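The four patterns can be checked against a small sketch. It is written in Python for illustration only (the real logic would live in the interchunk rules); chunk labels are plain strings, "Subj" is an assumed label for an overt subject chunk, and everything before the verb is assumed to be a single topic X.

```python
def insert_subject(chunks):
    """Insert a subject pronoun into a pro-drop clause, respecting nob V2.

    chunks: list of labels such as ["X", "Neg", "V"].
    Clauses that already contain "Subj" are returned unchanged.
    """
    if "Subj" in chunks or "V" not in chunks:
        return list(chunks)
    negation = ["ikke"] if "Neg" in chunks else []
    rest = [c for c in chunks if c != "Neg"]
    v = rest.index("V")
    if v == 0:
        # Verb-initial: subjpron + V (+ ikke)
        return ["subjpron", "V"] + negation + rest[1:]
    # Topicalised: X + V + subjpron (+ ikke)
    return rest[: v + 1] + ["subjpron"] + negation + rest[v + 1:]
```

The "Subj" check plays the role of the variable mentioned above: it records whether the pattern matched a subject, and the insertion only happens when it did not.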


Negation is a verb in sme, an adverbial in nob.

  • Subj + Neg + ConNeg goes to Subj + Prs + ikke
  • Subj + Neg + PrfPrtc goes to Subj + Prt + ikke
  • Neg + Subj + ConNeg goes to Subj + Prs + ikke
  • Neg + Subj + PrfPrtc goes to Subj + Prt + ikke
  • X + Neg (+ Subj) + ConNeg goes to X + Prs + Subj + ikke
  • X + Neg (+ Subj) + PrfPrtc goes to X + Prt + Subj + ikke
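The six rules above share one shape: the sme negation verb disappears, the main verb form becomes finite (ConNeg to Prs, PrfPrtc to Prt), and "ikke" ends up after the verb and the subject. A Python sketch of that mapping, with chunk labels as plain strings and a hypothetical function name:

```python
# Finite nob tense corresponding to the sme main-verb form under negation.
FINITE = {"ConNeg": "Prs", "PrfPrtc": "Prt"}


def transfer_negation(chunks):
    """Rewrite a negated sme clause pattern into nob order.

    chunks: e.g. ["X", "Neg", "Subj", "ConNeg"]; the subject is optional.
    """
    main = next((c for c in chunks if c in FINITE), None)
    if main is None or "Neg" not in chunks:
        raise ValueError("not a negated clause pattern")
    subject = ["Subj"] if "Subj" in chunks else []
    if chunks[0] == "X":
        # Topicalised: verb second, subject after the verb.
        return ["X", FINITE[main]] + subject + ["ikke"]
    return subject + [FINITE[main], "ikke"]
```

When there is no overt subject, the output still needs a subject pronoun inserted; this sketch leaves that to a separate step.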

Non-finite verb forms

These are reduced clauses, to be expanded into embedded sentences.



Actio locative


Actio essive


Pre- and postpositions

Postposition to preposition

Lexical selection


Clause types

Yes-no questions

Verb-initial yes-no questions are translated directly, with the question particle go removed.

When other constituents are added

Relative clauses

sme relative pronoun into nob "som"

"som" may be deleted when the relative refers to

Passive clauses



POS disambiguation



Existential sentences

  • Insert det in the nob translation

leat => være / ha

leat may translate into either være or ha, and picking the wrong one gives very odd translations.

  • Mánát leat boahtán skuvlii => Barnene har kommet til skolen
    • verb afterwards: har (in this case "er" also works, since it is a movement verb, but har is the general choice)
  • Dat lea sihke buorre ja heittot => Det er både bra og dårlig
  • deháleamos doaibma lea ofelastit geavaheaddjiid almmolaš bálvalusaide =>'s viktigste oppgave er å veivise brukere til offentlige tjenester
    • å afterwards: er
  • Mus lea oahpahus gaskkal guovtti ja njealji => Jeg har undervisning mellom to og fire
    • " is teaching between two and four"
  • Mus lea biepmu => Jeg har mat
    • " is food"
    • "Mus" is <Loc><@HAB> in both these;
  • Ii mus leat bahá vuoigŋa => Jeg er ikke besatt
    • "not.3SG is.CONNEG angry spirit"
    • counterexample to the two above...because of negation? or the adjective?
  • Mun lean buorre => Jeg er god
  • Son lea čeahpes bárdni => Han er en flink gutt
    • "Mun", "Son" are not <Loc><@HAB> ...
  • Mus lea gažaldat didjiide => Jeg har et spørsmål til dere
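The cues in the examples above (a following perfect participle, a Loc @HAB argument) can be combined into a first guess. This is a Python sketch with a hypothetical function name and hand-made tag lists, not a tested rule; in particular it gets the negated counterexample "Ii mus leat bahá vuoigŋa" wrong and returns ha there.

```python
def choose_leat(argument_tags, following_tags=()):
    """Guess whether leat should become "ha" or "være" in nob.

    argument_tags: tags of the sme argument that ends up as the nob subject.
    following_tags: tags of the word right after leat, if any.
    """
    if "PrfPrtc" in following_tags:
        # Mánát leat boahtán ... => Barnene har kommet ...
        return "ha"
    if "Loc" in argument_tags and "@HAB" in argument_tags:
        # Mus lea biepmu => Jeg har mat
        return "ha"
    # Mun lean buorre => Jeg er god
    return "være"
```

The order of the checks is a design choice: the participle cue is tested first because it does not depend on how the subject was analysed.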

Possible transfer for the loc @HAB thing: In t2x, if we have @HAB leat @<SUBJ, we can pretend we have @subj er/har @obj. Should be able to do something like that with



Derivations: general rules and exceptions

Sámi has a lot of derivation rules. Sometimes the derived words have lexicalised translations in Bokmål, like ráhkisvuohta→kjærlighet; we treat these as exceptions which have to be specified in bidix. Other times we can use a general rule, like lohkagohten→begynte.1SG å lese.

We have two strategies for handling the rule/exception situation.

  1. For the situation where we have many exceptions, we let the analysis be e.g. geavaheaddjiid/geavahit<V><TV><Der2><Actor><N><Pl>, and from here there are two paths:
    1. either this specific analysis is in bidix, here translating into bruker<n><m><pl>, or
    2. we have to use a transfer rule, in this case translating into de som bruker
  2. For the situation where we have few exceptions, we use dev/xfst2apertium.relabel to split the analysis into two lexical units. Two lexical units can't be specified in bidix, so here
    1. exceptions have to be added to the .lexc file as if they were lexicalised, so they remain one lexical unit
    2. while general transfer rules now match a pattern of two lexical units

More detailed: Deverbal nouns

Sámi verbs can turn into nouns. We want to be able to put this explicitly into the bidix (e.g. sometimes the nob noun is not even based on the nob verb), but if it's not in bidix we want to be able to fall back on a construction using the verb, so:

  • from geavaheaddjiid/geavahit<V><TV><Der2><Actor><N>
  • with fallback => de som bruker<vblex> (or something)
  • bidix specified => bruker<n><m>

With the following bidix entries we specify that we want bruker<n><m> in the above example:

    <e><p><l>geavahit<s n="V"/><s n="TV"/></l><r>bruke<s n="vblex"/><s n="pers"/></r></p><par n="__verb"/></e>
    <e><p><l>geavahit<s n="V"/><s n="TV"/><s n="Der2"/><s n="Actor"/><s n="N"/></l><r>bruker<s n="n"/><s n="m"/></r></p><par n="__n"/></e>

while if the second bidix line isn't there, we get the fallback. Transfer rules can now check

 <equal><clip side="tl" part="pos" ...><lit-tag v="N"/></equal>
 <equal><clip side="sl" part="pos" ...><lit-tag v="V"/></equal>

The same specification/fallback might be applied with other Derivations.
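The specification/fallback logic can be modelled outside Apertium as a two-step lookup. The Python sketch below uses a hypothetical in-memory dictionary in place of the bidix and ignores inflection tags such as Pl; it only shows how the presence or absence of the derived entry decides between the noun and the verb-based fallback.

```python
# Hypothetical stand-in for the bidix: (lemma, sl tags) -> (tl lemma, tl tags).
BIDIX = {
    ("geavahit", ("V", "TV")): ("bruke", ("vblex", "pers")),
    ("geavahit", ("V", "TV", "Der2", "Actor", "N")): ("bruker", ("n", "m")),
}


def lookup_with_fallback(lemma, tags, bidix=BIDIX):
    """Return (tl_lemma, tl_tags, used_fallback) for a possibly derived analysis."""
    specific = bidix.get((lemma, tuple(tags)))
    if specific is not None:
        return specific + (False,)
    # No entry for the derived analysis: cut the tags at the first Der tag
    # and look up the underived base instead.
    cut = next((i for i, t in enumerate(tags) if t.startswith("Der")), len(tags))
    base = bidix[(lemma, tuple(tags[:cut]))]
    return base + (cut < len(tags),)
```

With the full dictionary the actor noun comes out as bruker with tags n, m; with the second entry removed, the same call returns the verb bruke and sets the fallback flag, which is where a "de som ..." construction would be built.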


Note: We should change the tags below, because it is not possible to use /. in tags in Apertium. But then we would have to change the CG as well...

There are also derivations of derivations:

          "geavvat" V* IV* Der1 Der/h V* TV Der2 Actor N Sg Acc PxSg3

For transfer purposes it might be simplest to treat these "flatly" as if they were single derivations (i.e. Der1_Der_h_V_TV_Der2).
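One way to derive such a flat tag from the analysis is to take everything from the first to the last Der tag, drop the asterisks and replace the slashes. A Python sketch, with a hypothetical function name and assuming the tags arrive as a whitespace-separated sequence as in the example:

```python
def flatten_derivation(tags):
    """Collapse a chain of derivation tags into one flat tag name.

    tags: list of tag strings, e.g. ["V*", "IV*", "Der1", "Der/h", ...].
    Returns "" when the analysis contains no derivation.
    """
    der_positions = [i for i, t in enumerate(tags) if t.startswith("Der")]
    if not der_positions:
        return ""
    span = tags[der_positions[0]: der_positions[-1] + 1]
    # Strip the trailing "*" markers and make the slash safe for tag names.
    return "_".join(t.rstrip("*").replace("/", "_") for t in span)


analysis = "V* IV* Der1 Der/h V* TV Der2 Actor N Sg Acc PxSg3".split()
print(flatten_derivation(analysis))  # Der1_Der_h_V_TV_Der2
```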

Tag           Type        Notes
Der/Dimin     Diminutive  mánáš mánná+N+Der1+Der/Dimin+N+Sg+Nom - small child
Der/adda      suffix      bassaladdan "bassalit" V* TV Der2 Der/adda
Der/ahtti     suffix      vajálduhttit "vajálduvvat" V* IV* Der2 Der/ahtti
Der/alla      suffix      bázáhallan "bázihit" V* TV Der2 Der/alla V Actio
Der/amoš      suffix
Der/asti      suffix
Der/aš        suffix
Der/at        suffix      viidát "viiddis" A* Der2 Der/at Adv
Der/d         suffix      iskkadeapmi "iskat" V* TV Der1 Der/d V* Der2 Der/eapmi / orrodit "orrot" V* IV Der1 Der/h V
Der/eaddji    suffix
Der/eamoš     suffix      muitaleamoš "muitalit" V* TV Der3 Der/eamoš
Der/eapmi     V→N         deaivadit - meet, deaivvadeapmi - meeting
Der/easti     suffix
Der/geahtes   suffix      eaiggátkeahtes "eaiggát" N* Der3 Der/geahtes
Der/goahti    V→V         Inchoative: "Boradišgohten": "Jeg begynte å holde måltid"
Der/h         suffix      geavaheaddji "geavvat" V* IV* Der1 Der/h V* TV Der2 Actor / orrohit "orrot" V* IV Der1 Der/h V
Der/halla     suffix      gulahallat "gullat" V* TV Der1 Der2 Der/halla
Der/heapmi    suffix      čađaheapmi "čađđa" N* Der1 Der2 Der/heapmi A
Der/huhtti    suffix      muosehuhttit "muoseheapme" A* Der1 Der/huhtti V* TV
Der/huvva     suffix      čađahuvvo "čađđa" N* Der1 Der2 Der/huvva V IV Imprt Prs ConNegII
Der/j         suffix      sáddejuvvot "sáddet" V* TV Der1 Der/j V* Der2 Der/PassL V
Der/l         suffix      ohcalit "ohcat" V* TV Der1 Der/l V
Der/las       suffix      lotnolas "lotnut" V* TV Der1 Der2 Der/las A
Der/laš       suffix      dábálaš dáhpi+N+Der1+Der/laš+A+Sg+Nom - regular
Der/lágan     suffix      earálágan "eará" Pron Indef Sg Gen Der1 Der/lágan A
Der/meahttun  suffix      jáhkkemeahttun jáhkkit+V+TV+Der1+Der/meahttun+A+Sg+Nom
Der/muš       suffix      juhkat+V+TV+Der3+Der/muš+N+Sg+Nom - drink
Der/n         suffix      oažžun "oažžut" V* TV Der3 Der/n N
Der/st        suffix      várástit
Der/stuvva    suffix      fuolastuvvat "fuollat" V* TV Der1 Der2 Der/stuvva V
Der/supmi     suffix      čállosupmi "čállit" V* TV Der2 Der/PassL V* Der3 Der/supmi N
Der/upmi      suffix      mearkkašupmi "mearkkašit" V* TV Der2 Der/PassL V* Der3 Der/upmi
Der/us        suffix
Der/viđá      suffix
Der/vuohta    suffix      ráhkisvuohta ráhkis+A+Der3+Der/vuohta+N+Sg+Nom - love
Der/vuolde    suffix
Der/š         suffix
