Difference between revisions of "Speeding up monodix creation"

From Apertium

Revision as of 22:09, 9 April 2008

This page outlines some ideas for increasing the speed at which monolingual dictionaries (analysers) can be created.

Extract

Tag transfer

Try this at some point:

<spectie> you have an aligned corpus
<spectie> polish--czech, czech--slovak, danish--swedish
<spectie> and you have an analyser for polish, czech or danish
<spectie> you want to make an analyser for swedish
<spectie> you make templates from the paradigms in the danish analyser
<spectie> tag the danish of the corpus
<spectie> that you have
<spectie> align it with the swedish side
<spectie> then read off the alignments, taking the surface forms from the right side and the tags from the left side
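
The last step of the chat above (reading tags off the alignments) can be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing Apertium tool: the function names `transfer_tags` and `parse_alignment` are hypothetical, the alignment is assumed to arrive as "i-j" index pairs (source index, then target index), and the Danish analyses and tag strings are toy data. The paradigm-template step is left out.

```python
from collections import Counter, defaultdict


def parse_alignment(line):
    """Parse whitespace-separated 'i-j' pairs (source index, target index)."""
    return [tuple(int(n) for n in pair.split("-")) for pair in line.split()]


def transfer_tags(tagged_source, target_tokens, alignment):
    """Collect, for each target surface form, the source tags aligned to it.

    tagged_source: list of (surface, lemma, tags) tuples, one per source token
    target_tokens: list of target surface forms
    alignment:     iterable of (source_index, target_index) pairs
    """
    candidates = defaultdict(Counter)
    for src_i, tgt_i in alignment:
        _surface, _lemma, tags = tagged_source[src_i]
        # Surface form comes from the target side, tags from the source side.
        candidates[target_tokens[tgt_i].lower()][tags] += 1
    return candidates


# Toy Danish--Swedish sentence pair (illustrative analyses, not real output).
danish = [
    ("hunden", "hund", "<n><ut><sg><def>"),
    ("sover", "sove", "<vblex><pres>"),
]
swedish = ["hunden", "sover"]
links = parse_alignment("0-0 1-1")

result = transfer_tags(danish, swedish, links)
for form, tag_counts in sorted(result.items()):
    # Print the most frequently transferred tag string for each form.
    print(form, tag_counts.most_common(1)[0][0])
```

Counting the tags per surface form, rather than keeping only the first one seen, lets noisy alignments be filtered later by frequency across a whole corpus.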