Northern Sámi and Norwegian/Derivations
This page describes the general mechanism for handling derivations in sme-nob, then summarises the derivations handled (and how they are dealt with).
Derivations: general rules and exceptions
Sámi has a lot of derivation rules. There are various strategies used for translating these:
- We can lexicalise the sme derivation – this is the easiest and safest option, and leads to the best translations
- We can use a transfer rule which might add some periphrasis and/or change the form of the main word
- geavahit.V.Der/Actor.N.Pl → de som bruker
- lohkat.V.Der/goahti.1SG.Pret → begynte å lese
- We can put the full specific analysis in bidix to override the general translation, like translating geavahit.V.Der/Actor.N.Pl into bruker<n><m><pl>
- DEPRECATED – we should move away from this method, it leads to too much transfer complexity, and might not even work if the bidix pardef also has the full path (making the bidix entry ambiguous). Much better to just lexicalise.
- We can tag-relabel so that the derivation looks like a compound
- DEPRECATED – we used to do this with Der/goahti (lohkagohten→lese+begynte→begynte å lese), but it works just fine with the regular transfer method, so no need to have yet another method.
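The transfer-rule strategy can be illustrated with a toy sketch. This is not how the sme-nob transfer files are written; it only mimics the two periphrasis examples listed above. The `BIDIX` and `PRESENT` tables and the `translate` function are hypothetical stand-ins for the bilingual dictionary, the Norwegian generator and the transfer rules.

```python
# Toy data: hypothetical stand-ins for the bidix and the nob generator.
BIDIX = {"geavahit": "bruke", "lohkat": "lese"}
PRESENT = {"bruke": "bruker", "lese": "leser"}


def translate(analysis):
    """Translate a dot-separated analysis such as 'lohkat.V.Der/goahti.1SG.Pret'."""
    lemma, *tags = analysis.split(".")
    nob = BIDIX[lemma]
    if "Der/Actor" in tags and "Pl" in tags:
        # Actor noun, plural: periphrasis with a relative clause.
        return "de som " + PRESENT[nob]
    if "Der/goahti" in tags and "Pret" in tags:
        # Inchoative, preterite: "began to" plus the infinitive.
        return "begynte å " + nob
    # No derivation handled here: fall back to the plain lemma translation.
    return nob


print(translate("geavahit.V.Der/Actor.N.Pl"))
print(translate("lohkat.V.Der/goahti.1SG.Pret"))
```

A lexicalised entry would bypass this kind of rule entirely, which is why it is the preferred strategy.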
Removing unhandled derivations from the analyser
Any derivations that are not handled are removed from the analyser with a twol negation rule in rm-deriv-cmp.twol:
UnhandledDerivations /<= _ ; ! fail if analysis contains a tag from the set UnhandledDerivations
Derivations of derivations are removed with this rule (if these are ever needed, we should just lexicalise):
Derivation /<= Derivation+ PartOfSpeech+ _ ;
Double derivations are removed because they are not handled either, unless there are explicit transfer rules for them. This makes the lexicon a lot easier to handle for testvoc. The summary of fallbacks below lists the derivations that are and are not handled.
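The effect of the two removal rules can be sketched in Python. This is only an approximation of the twol rules, assuming that tags are matched as a flat sequence and that the contexts are strictly adjacent. The contents of `UNHANDLED_DERIVATIONS` and `PARTS_OF_SPEECH` are placeholders, not the sets defined in the actual twol file.

```python
import re

# Placeholder sets: the real ones are defined in the twol file.
UNHANDLED_DERIVATIONS = {"Der/amoš", "Der/asti"}
PARTS_OF_SPEECH = {"N", "V", "A", "Adv"}

# A derivation tag preceded by Derivation+ PartOfSpeech+ (classes D and P).
DOUBLE_DERIVATION = re.compile(r"D+P+D")


def tag_class(tag):
    """Map a tag to D (derivation), P (part of speech) or o (anything else)."""
    if tag.startswith("Der/"):
        return "D"
    if tag in PARTS_OF_SPEECH:
        return "P"
    return "o"


def keep_analysis(tags):
    """Return False if the analysis would be removed by either rule."""
    if any(tag in UNHANDLED_DERIVATIONS for tag in tags):
        return False
    classes = "".join(tag_class(tag) for tag in tags)
    return DOUBLE_DERIVATION.search(classes) is None


print(keep_analysis(["V", "TV", "Der/st", "V", "Inf"]))
print(keep_analysis(["V", "Der/PassL", "V", "Der/n", "N"]))
```

The first call keeps a single derivation; the second rejects a derivation of a derivation.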
Summary of fallbacks
(Tags used here are a bit outdated.)
- N.Der_Dimin.N
- N->N (diminutive), passes through as if nothing happened (although we could add "lille" or something)
- A.Der_vuohta.N
- Adj->N, gets adj.def, ("grunn"->"det grunne")
- typical override: "fattig"->"fattigdom"
- A.Der_at.Adv
- Adj->Adv, gets adj.posi.nt.sg.ind ("vid" -> "vidt")
- V.Der_goahti.V
- turns into two words, e.g.
^lohkagohten/lohkat<V><TV><Der3>+goahti<V><Ind><Prt><Sg1>$
- V.Der_PassL.V
- V->V (passive), add the "pass" tag, picked up by t1x verb rule
- V.Der_j.V.Der_PassL.V
- V->V (passive), add the "pass" tag, picked up by t1x verb rule
- TODO: what does Der_j add to the meaning? (ignored for now)
- Double derivation, has exception in dev/xfst2apertium.useless.twol (inner PoS tag removed in order to "flatten" it)
- V.Der_PassL.V.Der_n.N
- V->N (via passive), for now just outputs the plain passive infinitive (ideally should be able to enter noun phrase rules)
- Double derivation, has exception in dev/xfst2apertium.useless.twol (inner PoS tag removed in order to "flatten" it)
- V.Der_PassS.V
- V->V (passive), add the "pass" tag, picked up by t1x verb rule
- V.Der2.Der_halla.V
- V->V (passive), add the "pass" tag, picked up by t1x verb rule. Verb is tagged <ill-av> since the agent is illative with halla-verbs
- V.Der_h.V
- V->V (causative), add the "caus" tag, picked up by t1x verb rule
- V.Der_ahtti.V
- V->V (causative), add the "caus" tag, picked up by t1x verb rule
- V.Der_d.V
- V->V (reflexive), add the "ref" tag, picked up by t1x verb rule
- V.Der_alla.V
- V->V (reflexive), add the "ref" tag, picked up by t1x verb rule
- V.Der_st.V
- V->V (diminutive), passes through as if nothing happened (although we could add the adverb "litt"?)
- V.Der_n.N
- V->N, gets adj.pprs
- V.Der2.Der_las
- V->Adj, gets adj.pprs ("gi"->"givende")
- typical override: gi->generøs
- V.Actor.N
- V->N (actor), gets vblex.pres.m (could tag TD instead of pres, and select between pres and pret based on earlier finite verb, TODO)
- V.Der_j.Actor.N
- V->N (actor), as above
- TODO: what does Der_j add to the meaning? (ignored for now)
- V.Der_eapmi.N
- V->N (action/process), gets vblex.inf.nt
- typical override: "seile"->"seiling"
- V.Der_muš.N
- V->N (action/process), gets vblex.inf.nt
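The Der_goahti entry in the list above shows an analysis in Apertium stream format, where the derivation has been split into two sub-readings joined by `+`. A minimal sketch of pulling such a unit apart is given below, assuming a single analysis per unit and no escaped characters; `parse_lexical_unit` is a hypothetical helper, not part of any Apertium tool.

```python
import re


def parse_lexical_unit(unit):
    """Split '^surface/lemma<tag>...+lemma<tag>...$' into its parts.

    Assumes exactly one analysis and no escaped special characters.
    """
    body = unit.strip()
    if not (body.startswith("^") and body.endswith("$")):
        raise ValueError("not a lexical unit: " + unit)
    surface, analysis = body[1:-1].split("/", 1)
    parts = []
    for sub in analysis.split("+"):
        lemma = sub.split("<", 1)[0]          # text before the first tag
        tags = re.findall(r"<([^>]+)>", sub)  # tag names without brackets
        parts.append((lemma, tags))
    return surface, parts


surface, parts = parse_lexical_unit(
    "^lohkagohten/lohkat<V><TV><Der3>+goahti<V><Ind><Prt><Sg1>$"
)
print(surface)
for lemma, tags in parts:
    print(lemma, tags)
```

Seen this way, the main verb and the goahti part each carry their own tags, which is what lets a transfer rule output them as two words.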
Der-tag hitparade
(counts of analyses, not forms, so probably a bit skewed, but gives an idea)
165550 <der_nomact>
103985 <der_nomag>
60906 <der_h>
50394 <der_laš>
49664 <der_dimin>
40246 <der_vuohta>
34868 <der_d>
29189 <der_passs>
23625 <der_passl>
17455 <der_st>
11718 <der_at>
11292 <der_heapmi>
10369 <der_alla>
5793 <der_l>
3542 <der_t>
3459 <der_ahtti>
3199 <der_muš>
2917 <der_halla>
2009 <der_huvva>
1355 <der_meahttun>
1208 <der_lágan>
978 <der_stuvva>
925 <der_las>
844 <der_a>
777 <der_upmi>
471 <der_saš>
421 <der_huhtti>
230 <der_adda>
150 <der_asti>
126 <der_veara>
44 <der_geahtes>
31 <der_easti>
22 <der_keahtta>
18 <der_adv>
16 <der_nammasaš>
8 <der_jagáš>
8 <der_ár>
4 <der_stávval>
4 <der_lágaš>
4 <der_dáfot>
3 <der_eamoš>
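Counts of this kind could be produced with a short script along the following lines. This is a sketch only: it assumes the analyses are available as strings with lowercase `<der_...>` tags, and the sample input is made up for illustration, not taken from the real lexicon.

```python
import re
from collections import Counter

DER_TAG = re.compile(r"<der_[^>]+>")


def der_tag_counts(analyses):
    """Count every <der_...> tag occurrence over a list of analysis strings."""
    counts = Counter()
    for analysis in analyses:
        counts.update(DER_TAG.findall(analysis))
    return counts


# Made-up sample analyses, for illustration only.
sample = [
    "addit<v><tv><der_st><v><inf>",
    "várát<v><tv><der_st><v><inf>",
    "ráhkis<a><der_vuohta><n><sg><nom>",
]

for tag, count in der_tag_counts(sample).most_common():
    print(count, tag)
```

Because each analysis is counted rather than each surface form, a lemma with many inflected analyses contributes many times, which is the skew mentioned above.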
Derivation tags and their meanings
Note: We should change the tags below, because it is not possible to use / in tags in Apertium. But then we would have to change the CG as well...
Tag | Type | Example | in Bokmål |
---|---|---|---|
Der/Dimin | N→N[diminutive] | mánáš "mánná" N Der1 Der/Dimin N Sg Nom | barn→lite barn |
Der/1 Der/st | V→V[diminutive] | attestit "addit" V TV Der1 Der/st V Inf | gi→gi litt |
Der/st | Diminutive V→V | oainnestit, várástit "várát" V TV Der1 Der/st V | se→skimte (add "litt"?) |
Der/adda | V→N.PrfPrc.Actio | bassaladdan "bassalit" V* TV Der2 Der/adda | →vaske tøy (bassat=vaske) |
Der/ahtti | V→V | vajálduhttit "vajálduvvat" V* IV* Der2 Der/ahtti V TV | →overse/glemme |
Der/alla | suffix | bázáhallan "bázihit" V* TV Der2 Der/alla V Actio | → |
Der/amoš | suffix | muitalamoš "muitalit" V TV Der3 Der/amoš N Sg Nom | fortelle→ |
Der/asti | suffix | muitalastit "muitalit" V TV Der2 Der/asti V Inf | fortelle→ |
Der/at | Adj→Adv | viidát "viiddis" A* Der2 Der/at Adv | vid→vidt |
Der/d | V→V[refl] | basadit "bassat" V TV Der1 Der/d V | vaske→vaske seg |
Der/eaddji | V→N.Actor | muitaleaddji "muitalit" V TV Der2 Actor N Sg Nom | fortelle→forteller |
Der/eamoš | suffix | muitaleamoš "muitalit" V* TV Der3 Der/eamoš | fortelle→ |
Der/eapmi | V→N | deaivvadeapmi "deaivvadit" V IV Der2 Der/eapmi N Sg Nom | møte(V)→møte(N), feire→feiring |
Der/easti | suffix | muitaleastit "muitalit" V TV Der2 Der/easti V Inf | fortelle → |
Der/geahtes | suffix | eaiggátkeahtes "eaiggát" N* Der3 Der/geahtes | eier → |
Der/goahti | V→V Inchoative | boradišgohten "boradit" V TV Der3 Der/goahti V Ind Prt Sg1 | spise → jeg begynte å spise |
Der/h | suffix | geavaheaddji "geavvat" V* IV* Der1 Der/h V* TV Der2 Actor; orrohit "orrot" V* IV Der1 Der/h V | heve seg→ ; bli/synes→ |
Der/halla | V→V[recip] | gulahallat "gullat" V* TV Der1 Der2 Der/halla | høre→forstå hverandre («høre hverandre»?) |
Der/heapmi | suffix | čađaheapmi "čađđa" N* Der1 Der2 Der/heapmi A | → |
Der/huhtti | suffix | muosehuhttit "muoseheapme" A* Der1 Der/huhtti V* TV | urolig→ |
Der/huvva | suffix | čađahuvvo "čađđa" N* Der1 Der2 Der/huvva V IV Imprt Prs ConNegII | → |
Der/j | suffix | sáddejuvvot "sáddet" V* TV Der1 Der/j V* Der2 Der/PassL V | sende→ |
Der1 Der/l | V→V[subitive] | borralit "borralit" V TV Der1 Der/l V | spise→spise (i hast) |
Der/l | ???? | ohcalit "ohcat" V* TV Der1 Der/l V | lete→savne/lengte etter |
Der/las | V→Adj | addálas "addit" V TV Der1 Der2 Der/las A | gi→generøs |
Der/laš | N→Adj | dábálaš "dáhpi" N Der1 Der/laš A Sg Nom | skikk→vanlig |
Der/lágan | suffix | earálágan "eará" Pron Indef Sg Gen Der1 Der/lágan A | annen/andre→ |
Der/meahttun | V→Adj[Neg] | jáhkkemeahttun "jáhkkit" V TV Der1 Der/meahttun A Sg Nom | tro/anta→utrolig |
Der/muš | suffix | ??? "juhkat" V TV Der3 Der/muš N Sg Nom | drikke→ |
Der/n | suffix | oažžun "oažžut" V* TV Der3 Der/n N | få→? |
Der/stuvva | suffix | fuolastuvvat "fuollat" V* TV Der1 Der2 Der/stuvva V | bry seg om→ |
Der/supmi | suffix | čállosupmi "čállit" V* TV Der2 Der/PassL V* Der3 Der/supmi N | skrive/...→ |
Der/upmi | suffix | mearkkašupmi "mearkkašit" V* TV Der2 Der/PassL V* Der3 Der/upmi | merge seg→ |
Der/viđá | suffix | málestanviđá "málet" V TV Der1 Der/st V Der2 Der/eapmi N SgCmp Der/viđá Adv | male→ |
Der/vuohta | Adj→N | ráhkisvuohta "ráhkis" A Der3 Der/vuohta N Sg Nom | kjær→kjærlighet |
Der/veara | N→Adj | mearkkašanveara "mearkkašeapmi" N SgCmp Der3 Der/veara A | merknad→markert? |