Finnish
Revision as of 05:13, 17 February 2021 by TommiPirinen (talk | contribs) (→Adposition to case suffix)
- apertium-fin is converted from omorfi; it has large coverage but is experimental
- giella-fin lives in the Giellatekno repository and is more stable
Grammar stuff
Common formulas:
Adposition to case suffix
Finnish uses semantic cases where many other languages, e.g. Indo-European ones, use adpositions:
- houses -> talot
- on houses -> taloissa
- into houses -> taloihin
etc.
<section-def-macros>
  <def-macro n="adp-mangler" npar="1">
    <choose>
      <!-- adp to case mappings -->
      <!-- based on adp lexeme only -->
      <when>
        <!-- "I" (capitalised) -> inessive; the adposition itself is dropped -->
        <test>
          <equal><clip pos="1" side="sl" part="lem"/><lit v="I"/></equal>
        </test>
        <let>
          <var n="adpcase"/>
          <lit-tag v="ine"/>
        </let>
        <let>
          <var n="maybeadp"/>
          <lit v=""/>
        </let>
      </when>
      <when>
        <!-- "i" -> inessive -->
        <test>
          <equal><clip pos="1" side="sl" part="lem"/><lit v="i"/></equal>
        </test>
        <let>
          <var n="adpcase"/>
          <lit-tag v="ine"/>
        </let>
        <let>
          <var n="maybeadp"/>
          <lit v=""/>
        </let>
      </when>
      <when>
        <!-- "fra" -> elative -->
        <test>
          <equal><clip pos="1" side="sl" part="lem"/><lit v="fra"/></equal>
        </test>
        <let>
          <var n="adpcase"/>
          <lit-tag v="ela"/>
        </let>
        <let>
          <var n="maybeadp"/>
          <lit v=""/>
        </let>
      </when>
...
...
...
<rule comment="adp noun">
  <pattern>
    <pattern-item n="adp"/>
    <pattern-item n="noun"/>
  </pattern>
  <action>
    <call-macro n="adp-mangler">
      <with-param pos="1"/>
    </call-macro>
    <out>
      <chunk name="adpnoun" case="caseFirstWord">
        <tags>
          <tag><lit-tag v="NP"/></tag>
          <tag><var n="adpcase"/></tag>
        </tags>
        <lu>
          <clip pos="2" side="tl" part="lem"/>
          <clip pos="2" side="tl" part="a_noun"/>
          <clip pos="2" side="tl" part="a_number"/>
          <var n="adpcase"/>
        </lu>
        <b pos="0"/>
        <lu>
          <var n="maybeadp"/>
        </lu>
      </chunk>
    </out>
  </action>
</rule>
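The macro and rule above rely on categories, attributes and variables that have to be declared elsewhere in the same transfer file. A minimal sketch of such declarations could look as follows. The names are taken from the rule above, but the tag values (`pr`, `n`, `sg`, `pl`) are assumptions and must be matched to the tagset of the actual language pair:

```xml
<!-- Sketch only: tag values below are assumed, not taken from a real pair -->
<section-def-cats>
  <!-- matched by <pattern-item n="adp"/> -->
  <def-cat n="adp">
    <cat-item tags="pr"/>
  </def-cat>
  <!-- matched by <pattern-item n="noun"/> -->
  <def-cat n="noun">
    <cat-item tags="n.*"/>
  </def-cat>
</section-def-cats>

<section-def-attrs>
  <!-- clipped with part="a_noun" -->
  <def-attr n="a_noun">
    <attr-item tags="n"/>
  </def-attr>
  <!-- clipped with part="a_number" -->
  <def-attr n="a_number">
    <attr-item tags="sg"/>
    <attr-item tags="pl"/>
  </def-attr>
</section-def-attrs>

<section-def-vars>
  <!-- set by the adp-mangler macro, read by the rule -->
  <def-var n="adpcase"/>
  <def-var n="maybeadp"/>
  <!-- referenced by the case attribute of the chunk -->
  <def-var n="caseFirstWord"/>
</section-def-vars>
```

In a transfer file these sections come before `<section-def-macros>`.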
The adposition-to-suffix mapping applies to the whole noun phrase, so every modifier takes the same case as the head noun:
- (a) green colourless dream -> vihreä väritön uni
- in (a) green colourless dream -> vihreässä värittömässä unessa
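Spreading the case over a modifier could be sketched with a rule like the one below, which handles a single adjective. It is an illustration only: the category `adj`, the attribute `a_adj` and the chunk name `adpadjnoun` are hypothetical and would need their own declarations, and a phrase with two adjectives, as in the example above, would need a longer pattern. The `maybeadp` output is left out here for brevity.

```xml
<!-- Sketch only: "adj", "a_adj" and "adpadjnoun" are hypothetical names -->
<rule comment="adp adj noun">
  <pattern>
    <pattern-item n="adp"/>
    <pattern-item n="adj"/>
    <pattern-item n="noun"/>
  </pattern>
  <action>
    <!-- sets adpcase from the adposition lemma -->
    <call-macro n="adp-mangler">
      <with-param pos="1"/>
    </call-macro>
    <out>
      <chunk name="adpadjnoun" case="caseFirstWord">
        <tags>
          <tag><lit-tag v="NP"/></tag>
          <tag><var n="adpcase"/></tag>
        </tags>
        <!-- adjective: number copied from the noun, case from the adposition -->
        <lu>
          <clip pos="2" side="tl" part="lem"/>
          <clip pos="2" side="tl" part="a_adj"/>
          <clip pos="3" side="tl" part="a_number"/>
          <var n="adpcase"/>
        </lu>
        <!-- blank between the adjective and the noun -->
        <b pos="2"/>
        <!-- noun: same case as the adjective -->
        <lu>
          <clip pos="3" side="tl" part="lem"/>
          <clip pos="3" side="tl" part="a_noun"/>
          <clip pos="3" side="tl" part="a_number"/>
          <var n="adpcase"/>
        </lu>
      </chunk>
    </out>
  </action>
</rule>
```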
A proper noun phrase, however, is appositive, and only its last word takes the case suffix:
- for Donald Trump -> Donald Trumpille
- for president Trump -> presidentti Trumpille
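For the two-part name case, a rule might pass the first word through unchanged and attach the case to the last word only. The sketch below assumes a hypothetical category `np` and attribute `a_np` for proper nouns, and a hypothetical chunk name `adpnpnp`. A title followed by a name ("president Trump") would need a parallel rule with `noun` in second position.

```xml
<!-- Sketch only: "np", "a_np" and "adpnpnp" are hypothetical names -->
<rule comment="adp np np">
  <pattern>
    <pattern-item n="adp"/>
    <pattern-item n="np"/>
    <pattern-item n="np"/>
  </pattern>
  <action>
    <!-- sets adpcase from the adposition lemma -->
    <call-macro n="adp-mangler">
      <with-param pos="1"/>
    </call-macro>
    <out>
      <chunk name="adpnpnp" case="caseFirstWord">
        <tags>
          <tag><lit-tag v="NP"/></tag>
          <tag><var n="adpcase"/></tag>
        </tags>
        <!-- first name: copied as it comes from the bilingual dictionary -->
        <lu>
          <clip pos="2" side="tl" part="whole"/>
        </lu>
        <!-- blank between the two names -->
        <b pos="2"/>
        <!-- last name: the only word that receives the case tag -->
        <lu>
          <clip pos="3" side="tl" part="lem"/>
          <clip pos="3" side="tl" part="a_np"/>
          <clip pos="3" side="tl" part="a_number"/>
          <var n="adpcase"/>
        </lu>
      </chunk>
    </out>
  </action>
</rule>
```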
In "Nokia Finnish" the reverse pattern is used: because software cannot inflect the value of a variable, a proxy word is inflected instead:
- user -> käyttäjä
- to user Joe -> käyttäjälle Joe
- file -> tiedosto
- into file thesis.doc -> tiedostoon thesis.doc
Using proper inflection in these cases would not be incorrect either.