Difference between revisions of "Northern Sámi and Norwegian"

Revision as of 16:38, 13 January 2010

Problems

Generating definiteness

Features we might be able to use: subject/object, theme/focus, prepositions?

  • Du dálkasis sáhtii leamaš ávki => Din(det.poss) medisin(ind) kan ha vært til nytte
  • Mánná oađđá => Barnet(def) sover (but is this ambiguous?)
  • Son lea čeahpes bárdni => Han er en(art) flink(ind) gutt(ind)
  • Dá livččii skeaŋka din čeahpes bárdnai => Her er en gave jeg kunne ønske å gi den(art) flinke(def) sønnen(def) deres(det.poss)

Sámi collective nouns are tagged Coll, but nob.dix marks neither collective nor mass nouns, so that is probably not much help.

leat => være / ha

  • Mánát leat boahtán skuvlii => Barnene har kommet til skolen
    • followed by a verb: har (in this case "er" also works, since it is a verb of motion, but in general it is "har")
  • Dat lea sihke buorre ja heittot => Det er både bra og dårlig
  • Norga.no deháleamos doaibma lea ofelastit geavaheaddjiid almmolaš bálvalusaide => Norge.no's viktigste oppgave er å veivise brukere til offentlige tjenester
    • followed by å: er
  • Mus lea oahpahus gaskkal guovtti ja njealji => Jeg har undervisning mellom to og fire
    • "from.me is teaching between two and four"
  • Mus lea biepmu => Jeg har mat
    • "from.me is food"
    • "Mus" is <Loc><@HAB> in both these;
  • Ii mus leat bahá vuoigŋa => Jeg er ikke besatt
    • "not.3SG from.me is.CONNEG angry spirit"
    • counterexample to the two above...because of negation? or the adjective?
  • Mun lean buorre => Jeg er god
  • Son lea čeahpes bárdni => Han er en flink gutt
    • "Mun", "Son" are not <Loc><@HAB> ...
  • Mus lea gažaldat didjiide => Jeg har et spørsmål til dere

Possible transfer for the Loc @HAB construction: in t2x, if we have @HAB leat @<SUBJ, we can pretend we have @subj er/har @obj. We should be able to do something like that with

^Ii/Ii<V><IV><Neg><Ind><Sg3><@+FAUXV>$
^mus/mun<Pron><Pers><Sg1><Loc><@HAB>$
^leat/leat<V><IV><Ind><Prs><ConNeg><@-FMAINV>$
^bahá/bahá<A><Sg><Nom><@←SPRED>$
^vuoigŋa/vuoigŋa<N><Sg><Nom><@←SUBJ>$

too.
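
A minimal sketch of the pattern check described above, written in Python rather than as a real t2x rule. It only detects the sequence @HAB ... leat ... @←SUBJ in the stream and reports which words would fill the Norwegian subject and object slots; it does not decide between er and har, which is still an open question above. The helper names (parse_stream, hab_roles) are made up for illustration.

```python
import re

# One lexical unit in Apertium stream format: ^surface/lemma<tag><tag>...$
LU_RE = re.compile(r"\^([^/$]*)/([^<$]*)((?:<[^>]*>)*)\$")

def parse_stream(text):
    """Return a list of (surface, lemma, tags) for each lexical unit in text."""
    return [(surface, lemma, re.findall(r"<([^>]*)>", tagstr))
            for surface, lemma, tagstr in LU_RE.findall(text)]

def hab_roles(units):
    """Hypothetical helper: look for @HAB, then leat, then @←SUBJ (in that order).
    Returns the surface forms for the target subject/object slots, or None."""
    hab = next((i for i, u in enumerate(units) if "@HAB" in u[2]), None)
    if hab is None:
        return None
    verb = next((i for i in range(hab + 1, len(units))
                 if units[i][1] == "leat"), None)
    if verb is None:
        return None
    subj = next((i for i in range(verb + 1, len(units))
                 if "@←SUBJ" in units[i][2]), None)
    if subj is None:
        return None
    # The locative possessor becomes the subject; the Sámi subject becomes the object.
    return {"subj": units[hab][0], "obj": units[subj][0]}

SAMPLE = (
    "^Ii/Ii<V><IV><Neg><Ind><Sg3><@+FAUXV>$ "
    "^mus/mun<Pron><Pers><Sg1><Loc><@HAB>$ "
    "^leat/leat<V><IV><Ind><Prs><ConNeg><@-FMAINV>$ "
    "^bahá/bahá<A><Sg><Nom><@←SPRED>$ "
    "^vuoigŋa/vuoigŋa<N><Sg><Nom><@←SUBJ>$"
)

print(hab_roles(parse_stream(SAMPLE)))  # {'subj': 'mus', 'obj': 'vuoigŋa'}
```

A clause without a @HAB unit makes hab_roles return None, so ordinary "Mun lean buorre" sentences would be left alone.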

Derivations: general rules and exceptions

Sámi has a lot of derivation rules. Sometimes the derived words have lexicalised translations in Bokmål, like ráhkisvuohta→kjærlighet; we treat these as exceptions that have to be specified in bidix. Other times we can use a general rule, like lohkagohten→begynte.1SG å lese.

We have two strategies for handling the rule/exception situation.

  1. Where there are many exceptions, we let the analysis be e.g. geavaheaddjiid/geavahit<V><TV><Der2><Actor><N><Pl>, and from here there are two paths:
    1. either this specific analysis is in bidix, here translating into bruker<n><m><pl>, or
    2. we have to use a transfer rule, in this case translating into de som bruker
  2. Where there are few exceptions, we use dev/xfst2apertium.relabel to split the analysis into two lexical units. A pair of lexical units can't be specified in bidix, so here
    1. exceptions have to be added to the .lexc file as if they were lexicalised, so they remain one lexical unit
    2. while general transfer rules now match a pattern of two lexical units
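
To make the second strategy concrete, here is a toy Python sketch of splitting one analysis into a base part and a derivation part. This is not what dev/xfst2apertium.relabel actually does (its output format is not shown on this page); the split point (the first tag starting with Der) and the function name are assumptions for illustration only.

```python
import re

def split_at_derivation(analysis):
    """Split 'lemma<T1><T2>...' at the first <Der...> tag.
    Returns (base, derivation), or (analysis, None) if there is no Der tag."""
    m = re.search(r"<Der[^>]*>", analysis)
    if m is None:
        return analysis, None
    return analysis[:m.start()], analysis[m.start():]

print(split_at_derivation("geavahit<V><TV><Der2><Actor><N><Pl>"))
# ('geavahit<V><TV>', '<Der2><Actor><N><Pl>')
```

With a split like this, bidix only ever sees the base part, which is why exceptions have to stay unsplit in the .lexc file.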

More detailed: Deverbal nouns

Sámi verbs can be turned into nouns. We want to be able to list the derived noun explicitly in bidix (e.g. sometimes the nob noun is not even based on the nob verb), but if it is not in bidix we want to fall back on a construction using the verb, so

  • from geavaheaddjiid/geavahit<V><TV><Der2><Actor><N>
  • with fallback => de som bruker<vblex> (or something)
  • bidix specified => bruker<n><m>

With the following bidix entries we specify that we want bruker<n><m> in the above example:

    <e><p><l>geavahit<s n="V"/><s n="TV"/></l><r>bruke<s n="vblex"/><s n="pers"/></r></p><par n="__verb"/></e>
    <e><p><l>geavahit<s n="V"/><s n="TV"/><s n="Der2"/><s n="Actor"/><s n="N"/></l><r>bruker<s n="n"/><s n="m"/></r></p><par n="__n"/></e>

while if the second bidix line isn't there, we get the fallback. Transfer rules can now check

 <equal><clip side="tl" part="pos" ...><lit-tag v="N"/></equal>
 <equal><clip side="sl" part="pos" ...><lit-tag v="V"/></equal>

The same specification/fallback approach might be applied to other derivations.
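
The specification/fallback behaviour can be modelled roughly as a longest-prefix lookup. The Python sketch below is a toy model, not lttoolbox's implementation: the dictionary stands in for the two bidix entries above, and the function name translate is made up. It returns the target-side entry plus the source tags that were left unmatched, so a later step can tell whether it got the noun or only the verb.

```python
# Toy stand-in for the two bidix entries above: source analysis -> target analysis.
BIDIX = {
    "geavahit<V><TV>": "bruke<vblex><pers>",
    "geavahit<V><TV><Der2><Actor><N>": "bruker<n><m>",
}

def translate(lemma, tags, bidix=BIDIX):
    """Try the longest tag prefix first, so a derived entry wins over its base verb.
    Returns (target, leftover_source_tags), or (None, tags) if nothing matches."""
    for end in range(len(tags), 0, -1):
        key = lemma + "".join(f"<{t}>" for t in tags[:end])
        if key in bidix:
            return bidix[key], tags[end:]
    return None, tags

tags = ["V", "TV", "Der2", "Actor", "N", "Pl"]
print(translate("geavahit", tags))
# ('bruker<n><m>', ['Pl'])

# Without the derived entry we only get the verb back: the fallback case.
verb_only = {"geavahit<V><TV>": "bruke<vblex><pers>"}
print(translate("geavahit", tags, verb_only))
# ('bruke<vblex><pers>', ['Der2', 'Actor', 'N', 'Pl'])
```

In the fallback case the target is a verb while the leftover source tags still say noun, which is the mismatch the two clip tests above are meant to detect.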

Derivations

Note: We should change the tags below, because it is not possible to use /. in tags in Apertium. But then we should change the CG too...

Tag           Type        Notes
Der/Dimin     Diminutive  mánáš mánná+N+Der1+Der/Dimin+N+Sg+Nom - small child
Der/adda      suffix
Der/ahtti     suffix
Der/alla      suffix
Der/amoš      suffix
Der/asti      suffix
Der/aš        suffix
Der/d         suffix
Der/duohkai   suffix
Der/duohke    suffix
Der/eaddji    suffix
Der/eamoš     suffix
Der/eapmi     V→N         deaivadit - meet, deaivvadeapmi - meeting
Der/easti     suffix
Der/g         suffix
Der/geahtes   suffix
Der/goahti    V→V         Inchoative: "Boradišgohten": "Jeg begynte å holde måltid"
Der/h         suffix
Der/halla     suffix
Der/hat       suffix
Der/heapmi    suffix
Der/hudda     suffix
Der/huhtti    suffix
Der/huvva     suffix
Der/j         suffix
Der/l         suffix
Der/las       suffix
Der/laš       suffix      dábálaš dáhpi+N+Der1+Der/laš+A+Sg+Nom - regular
Der/lágan     suffix
Der/meahttun  suffix      jáhkkemeahttun jáhkkit+V+TV+Der1+Der/meahttun+A+Sg+Nom
Der/muš       suffix      juhkat+V+TV+Der3+Der/muš+N+Sg+Nom - drink
Der/n         suffix
Der/st        suffix
Der/stuvva    suffix
Der/supmi     suffix
Der/upmi      suffix
Der/us        suffix
Der/viđi      suffix
Der/viđá      suffix
Der/vuohta    suffix      ráhkisvuohta ráhkis+A+Der3+Der/vuohta+N+Sg+Nom - love
Der/vuolde    suffix
Der/vuollai   suffix
Der/vuolle    suffix
Der/š         suffix
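
The tags in this table contain "/", which the note above the table flags as a problem for Apertium tags. As a rough sketch of what a renaming could look like, the Python function below converts an xfst-style analysis (lemma+Tag+Tag) into Apertium-style tags and replaces "/" inside each tag. The replacement character "_" and the function name are assumptions, not a decided convention, and the sketch assumes the lemma itself contains no "+".

```python
def xfst_to_apertium(analysis, slash_replacement="_"):
    """Turn 'lemma+T1+T2' into 'lemma<T1><T2>', replacing '/' inside tags."""
    lemma, *tags = analysis.split("+")
    return lemma + "".join(
        f"<{t.replace('/', slash_replacement)}>" for t in tags
    )

print(xfst_to_apertium("ráhkis+A+Der3+Der/vuohta+N+Sg+Nom"))
# ráhkis<A><Der3><Der_vuohta><N><Sg><Nom>
```

Whatever replacement is chosen, the same renaming would have to be applied to the CG rules so that both sides keep using identical tag names.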

See also

External links

  • Sametinget plenum PDFs with parallel (sme-nob) government text; choose e.g. Publikasjoner-Møtebøker-Plenum