Difference between revisions of "Finnish"

From Apertium
 
* [[apertium-fin]] is a conversion from [https://github.com/flammie/omorfi omorfi]: large coverage, experimental
* [[giella-fin]] is in the [https://github.com/giellalt/lang-fin giellatekno repository]: more stable
   
==See also==

=== Languages in github ===

=== Language pairs in github ===

* [[North Saami and Finnish]]
* [[Finnish and English]]
* [[Finnish and German]]
* [[Olonetsian and Finnish]]
* [[Finnish and Hungarian]]
* [[Basque and Finnish]]
* [[Finnish and Estonian]]
* [[Finnish and Udmurt]]
* [[Kven and Finnish]]
* [[Finnish and Komi]]
* [[Livonian and Finnish]]
* [[Hill Mari and Finnish]]
* [[Erzya Mordvin and Finnish]]
   

Latest revision as of 18:05, 25 June 2021

Grammar stuff

Common formulas:

Numeral fix

People working on morphologically poor languages often falsely assume that the sg/pl distinction in numerals is a sort of agreement feature, i.e. that the word "one" is always singular and every other numeral is plural. In Finnish, numerals have a singular/plural distinction like any other nominal: "one beer" is one.num.sg beer.n.sg and "two beers" is two.num.sg beer.n.sg.par (plurality in such a numeral phrase is expressed with the singular partitive). By contrast, two.num.pl beer.n.pl means two rounds of beers for you and at least one friend, and one.num.pl beer.n.pl means one round of beers (though it's never just one, but that's a different story).

    <def-macro n="number-mangler" npar="1">
      <choose>
        <!-- number mappings: sg, pl and sp on the source side all become sg -->
        <when>
          <test>
            <equal><clip pos="1" side="sl" part="a_number"/><lit-tag v="sg"/></equal>
          </test>
          <let>
            <var n="number"/>
            <lit-tag v="sg"/>
          </let>
        </when>
        <when>
          <test>
            <equal><clip pos="1" side="sl" part="a_number"/><lit-tag v="pl"/></equal>
          </test>
          <let>
            <var n="number"/>
            <lit-tag v="sg"/>
          </let>
        </when>
        <when>
          <test>
            <equal><clip pos="1" side="sl" part="a_number"/><lit-tag v="sp"/></equal>
          </test>
          <let>
            <var n="number"/>
            <lit-tag v="sg"/>
          </let>
        </when>
        <!-- otherwise, sg -->
        <otherwise>
          <let>
            <clip pos="1" side="tl" part="a_number"/><lit-tag v="sg"/>
          </let>
          <let>
            <var n="number"/><lit-tag v="sg"/>
          </let>
        </otherwise>
      </choose>
    </def-macro>
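
The macro above reads an attribute and sets a variable that have to be declared elsewhere in the transfer file. A minimal sketch of what those declarations and a call could look like, assuming the names a_number and number are exactly as used in the macro and that a_number only needs the three tags the macro tests for (a real language pair may well list more):

```xml
<!-- Sketch only: declarations that the number-mangler macro appears to rely on. -->
<section-def-attrs>
  <!-- a_number: assumed to cover just the three number tags tested in the macro -->
  <def-attr n="a_number">
    <attr-item tags="sg"/>
    <attr-item tags="pl"/>
    <attr-item tags="sp"/>
  </def-attr>
</section-def-attrs>
<section-def-vars>
  <!-- number: set by number-mangler, read afterwards by the calling rule -->
  <def-var n="number"/>
</section-def-vars>

<!-- Inside a rule's <action>, with the numeral at pattern position 1: -->
<call-macro n="number-mangler">
  <with-param pos="1"/>
</call-macro>
```

After the call, the rule can output the variable number in place of the numeral's source number tag.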

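Negations

Finnish uses a negation verb, which for many source languages has to be generated from the negation and the verb together. For example, "I do not drink" becomes "en juo", where "en" is the first person singular form of the negation verb "ei" and "juo" is the connegative form of "juoda".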

Possession structure

Finnish does not have an idiomatic verb for possession. If the Apertium source language has vbhaver, it can initially be translated to omistaa and then reorganised into the adessive of the owner plus the copula on, e.g. "I have a house" becomes "minulla on talo".

Adposition to case suffix

Finnish uses semantic cases where many languages, e.g. Indo-European ones, use adpositions:

  • houses -> talot
  • in houses -> taloissa
  • into houses -> taloihin

etc.

  <section-def-macros>
    <def-macro n="adp-mangler" npar="1">
      <choose>
        <!-- adp to case mappings -->
        <!-- based on adp lexeme only -->
        <when>
          <test>
            <equal><clip pos="1" side="sl" part="lem"/><lit v="I"/></equal>
          </test>
          <let>
            <var n="adpcase"/>
            <lit-tag v="ine"/>
          </let>
          <let>
            <var n="maybeadp"/>
            <lit v=""/>
          </let>
        </when>
        <when>
          <test>
            <equal><clip pos="1" side="sl" part="lem"/><lit v="i"/></equal>
          </test>
          <let>
            <var n="adpcase"/>
            <lit-tag v="ine"/>
          </let>
          <let>
            <var n="maybeadp"/>
            <lit v=""/>
          </let>
        </when>
        <when>
          <test>
            <equal><clip pos="1" side="sl" part="lem"/><lit v="fra"/></equal>
          </test>
          <let>
            <var n="adpcase"/>
            <lit-tag v="ela"/>
          </let>
          <let>
            <var n="maybeadp"/>
            <lit v=""/>
          </let>
        </when>
...
...
...
    <rule comment="adp noun">
      <pattern>
        <pattern-item n="adp"/>
        <pattern-item n="noun"/>
      </pattern>
      <action>
        <call-macro n="adp-mangler">
          <with-param pos="1"/>
        </call-macro>
        <out>
          <chunk name="adpnoun" case="caseFirstWord">
            <tags>
              <tag><lit-tag v="NP"/></tag>
              <tag><var n="adpcase"/></tag>
            </tags>
            <lu>
              <clip pos="2" side="tl" part="lem"/>
              <clip pos="2" side="tl" part="a_noun"/>
              <clip pos="2" side="tl" part="a_number"/>
              <var n="adpcase"/>
            </lu>
            <b pos="0"/>
            <lu>
              <var n="maybeadp"/>
            </lu>
          </chunk>
        </out>
      </action>
    </rule>
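
The first two branches of adp-mangler differ only in the capitalisation of the lemma. As a hedged sketch, they could be merged with a caseless comparison. The variable declarations shown here are an assumption about what the rest of the transfer file contains, since the macro and the rule both use adpcase and maybeadp:

```xml
<!-- Sketch only: variables that adp-mangler sets and the rule reads. -->
<section-def-vars>
  <def-var n="adpcase"/>
  <def-var n="maybeadp"/>
</section-def-vars>

<!-- One branch matching both "I" and "i" via caseless="yes". -->
<when>
  <test>
    <equal caseless="yes">
      <clip pos="1" side="sl" part="lem"/>
      <lit v="i"/>
    </equal>
  </test>
  <!-- the adposition becomes the inessive case on the noun -->
  <let>
    <var n="adpcase"/>
    <lit-tag v="ine"/>
  </let>
  <!-- no separate adposition is left to output -->
  <let>
    <var n="maybeadp"/>
    <lit v=""/>
  </let>
</when>
```

This changes only how the lemma is matched; the values assigned are the same as in the two original branches.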

The adposition-to-suffix mapping applies across the whole noun phrase:

  • (a) green colourless dream -> vihreä väritön uni
  • in (a) green colourless dream -> vihreässä värittömässä unessa

A proper noun phrase, however, is appositive, so only its last word is inflected:

  • for Donald Trump -> Donald Trumpille
  • for president Trump -> presidentti Trumpille

In Nokia Finnish the reverse pattern is used: because computers cannot inflect variables, a proxy word is inflected instead:

  • user -> käyttäjä
  • to user Joe -> käyttäjälle Joe
  • file -> tiedosto
  • into file thesis.doc -> tiedostoon thesis.doc

It would not be incorrect to use proper inflection in these cases.
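
For the ordinary noun phrase case, where every word takes the case suffix (as in "vihreässä värittömässä unessa"), the "adp noun" rule could be extended with an adjective. The following is a sketch under assumptions, not a tested rule: the category adj and the attribute a_adj are hypothetical names, the adjective's number is copied from the noun, and the maybeadp output is left out, i.e. the adposition is assumed to be fully replaced by the case.

```xml
<!-- Sketch only: "adj" and "a_adj" are assumed names, not taken from the page. -->
<rule comment="adp adj noun">
  <pattern>
    <pattern-item n="adp"/>
    <pattern-item n="adj"/>
    <pattern-item n="noun"/>
  </pattern>
  <action>
    <!-- sets adpcase from the adposition at position 1 -->
    <call-macro n="adp-mangler">
      <with-param pos="1"/>
    </call-macro>
    <out>
      <chunk name="adpadjnoun" case="caseFirstWord">
        <tags>
          <tag><lit-tag v="NP"/></tag>
          <tag><var n="adpcase"/></tag>
        </tags>
        <!-- adjective: agrees with the noun in number and takes the same case -->
        <lu>
          <clip pos="2" side="tl" part="lem"/>
          <clip pos="2" side="tl" part="a_adj"/>
          <clip pos="3" side="tl" part="a_number"/>
          <var n="adpcase"/>
        </lu>
        <b pos="2"/>
        <!-- noun: same output as in the "adp noun" rule -->
        <lu>
          <clip pos="3" side="tl" part="lem"/>
          <clip pos="3" side="tl" part="a_noun"/>
          <clip pos="3" side="tl" part="a_number"/>
          <var n="adpcase"/>
        </lu>
      </chunk>
    </out>
  </action>
</rule>
```

The appositive and Nokia Finnish patterns would need separate rules, since there only one of the words receives the case.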
