Difference between revisions of "Afrikaans and Dutch"

Revision as of 17:33, 10 July 2008

The apertium-af-nl language pair

The first version of the af-nl language pack is now in SVN and can be found in the attic. However, the Dutch (NL) morphological dictionary does not yet contain enough words to produce proper translations. Here are some of the words that should be added to the NL dictionary (taken from the first sentence I tried to translate):

  • trein
  • vertrekt
  • uur
  • Etc.

If you can help, please do let us know!

What is needed

To fix the Dutch morphological dictionary, we need a word list that contains forms or paradigms like this:

bier,bier,n.sg
bier,bieren,n.pl

Where n=noun, sg=singular, pl=plural, etc.

A list that can be converted to this format will also do.
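
As a rough illustration of how such a list could be read, here is a minimal Python sketch. It assumes the three comma-separated fields are lemma, surface form and a dot-separated tag string, as in the bier sample. The function name parse_wordlist and the trein lines are made up for this example.

```python
from collections import defaultdict


def parse_wordlist(lines):
    """Group (surface form, tags) pairs by lemma.

    Assumes each line looks like 'lemma,surface,tags', with the tags
    separated by dots (e.g. 'bier,bier,n.sg'). Blank lines are skipped.
    """
    paradigms = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        lemma, surface, tags = line.split(",")
        paradigms[lemma].append((surface, tags.split(".")))
    return dict(paradigms)


# The trein entries are invented for illustration; only the bier line
# is taken from the page.
sample = ["bier,bier,n.sg", "trein,trein,n.sg", "trein,treinen,n.pl"]
for lemma, forms in parse_wordlist(sample).items():
    print(lemma, forms)
```

Grouping by lemma keeps all forms of one word together, which is roughly what a paradigm-based dictionary entry needs.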

Tadpole morphological analyser

I've installed the Tadpole morphological analyser and am able to run it on text from the Dutch Wikipedia (nl.wp). The next step is to see whether its output is detailed enough to be converted into the Apertium format.

Sample output:

Orania SPEC(deeleigen) Orania [Orania]
is WW(pv,tgw,ev) zijn [zijn]
een LID(onbep,stan,agr) een [een]
Zuid-Afrikaans SPEC(deeleigen) Zuid-Afrikaans [Zuid-Afrikaans]
dorpje N(soort,ev,dim,onz,stan) dorp [dorp][je]
gelegen WW(vd,vrij,zonder) liggen [ge][lig][en]
aan VZ(init) aan [aan]
de LID(bep,stan,rest) de [de]
Oranjerivier SPEC(deeleigen) Oranjerivier [Oranjerivier]
in VZ(init) in [in]
de LID(bep,stan,rest) de [de]
droge ADJ(prenom,basis,met-e,stan) droog [droog][e]
Karoostreek SPEC(deeleigen) Karoostreek [Karoostreek]
van VZ(init) van [van]
de LID(bep,stan,rest) de [de]
provincie N(soort,ev,basis,zijd,stan) provincie [provincie]
Noord-Kaap SPEC(deeleigen) Noord-Kaap [Noord-Kaap]

Key:

  • Col 1 = Word / surface form
  • Col 2 = Analysis / CGN tag for each word
    • ADJ = Bijvoeglijk naamwoord (adjective)
    • LID = Lidwoord (article)
    • N = Zelfstandig naamwoord (noun)
    • SPEC = Speciale tokens (special tokens; the deeleigen subtype marks part of a proper name)
    • VZ = Voorzetsel (preposition)
    • WW = Werkwoord (verb)
  • Col 3 = Lemma
  • Col 4 = Morphological segmentation / rule for inflection
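
To check whether the output carries enough detail, each line can be split into the four columns described in the key. The following Python sketch assumes the columns are separated by whitespace exactly as in the sample output on this page (real Tadpole output may be laid out differently). The names TadpoleToken and parse_tadpole_line are hypothetical.

```python
import re
from typing import List, NamedTuple


class TadpoleToken(NamedTuple):
    surface: str          # Col 1: word as it appears in the text
    pos: str              # Col 2: CGN part of speech, e.g. 'WW'
    features: List[str]   # Col 2: features inside the parentheses
    lemma: str            # Col 3
    morphemes: List[str]  # Col 4: contents of each [..] segment


TAG_RE = re.compile(r"(\w+)\((.*)\)")


def parse_tadpole_line(line):
    """Split one sample-style line into its four columns."""
    surface, tag, lemma, segmentation = line.split()
    match = TAG_RE.fullmatch(tag)
    if match is None:
        raise ValueError(f"unexpected tag format: {tag!r}")
    pos, feats = match.groups()
    features = feats.split(",") if feats else []
    morphemes = re.findall(r"\[([^\]]*)\]", segmentation)
    return TadpoleToken(surface, pos, features, lemma, morphemes)


token = parse_tadpole_line("gelegen WW(vd,vrij,zonder) liggen [ge][lig][en]")
print(token.pos, token.features, token.lemma, token.morphemes)
```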

Further analysis:

droge     ADJ(prenom,basis,met-e,stan) droog  [droog][e] 

Surface   Analysis                     Lemma  Rule
 form                                         for inflection

Speling equivalent:
 
 droog; droge; prenom; adj

dorpje N(soort,ev,dim,onz,stan) dorp [dorp][je]
provincie N(soort,ev,basis,zijd,stan) provincie [provincie]

Speling equivalent:

 dorp; dorpje; soort.ev.dim; n.nt
 provincie; provincie; soort.ev.basis; n.f
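
The hand conversion above could be automated along the following lines. This is only a sketch under several assumptions: the output uses the lemma-first field order of the droge example, the POS_MAP and KEEP tables are hypothetical and merely inferred from the hand-converted lines, and no gender label is appended to nouns because the CGN feature zijd does not distinguish masculine from feminine.

```python
import re

# Hypothetical lookup tables, inferred from the hand-converted examples.
POS_MAP = {"ADJ": "adj", "N": "n"}  # CGN part of speech -> short label
KEEP = {"ADJ": 1, "N": 3}           # number of leading CGN features to keep


def tadpole_to_speling(line):
    """Turn one sample-style Tadpole line into 'lemma; surface; feats; pos'.

    Returns None for parts of speech that are not in POS_MAP.
    """
    surface, tag, lemma, _segmentation = line.split()
    match = re.fullmatch(r"(\w+)\((.*)\)", tag)
    if match is None or match.group(1) not in POS_MAP:
        return None
    pos = match.group(1)
    features = match.group(2).split(",")
    kept = ".".join(features[:KEEP[pos]])
    return f"{lemma}; {surface}; {kept}; {POS_MAP[pos]}"


for line in [
    "droge ADJ(prenom,basis,met-e,stan) droog [droog][e]",
    "provincie N(soort,ev,basis,zijd,stan) provincie [provincie]",
    "is WW(pv,tgw,ev) zijn [zijn]",  # skipped: WW is not in POS_MAP
]:
    print(tadpole_to_speling(line))
```

Which features to keep, and how to derive gender, would still have to be decided per part of speech before this could feed the dictionary.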

Links