Difference between revisions of "Speeding up monodix creation"


Revision as of 22:09, 9 April 2008

This page outlines some ideas for increasing the speed at which monolingual dictionaries (analysers) can be created.

Extract

Tag transfer

Try this at some point:

<spectie> you have an aligned corpus
<spectie> polish--czech, czech--slovak, danish--swedish
<spectie> and you have an analyser for polish, czech or danish
<spectie> you want to make an analyser for swedish
<spectie> you make templates from the paradigms in the danish analyser
<spectie> tag the danish of the corpus
<spectie> that you have
<spectie> align it with the swedish side
<spectie> then read off the alignments, taking the surface forms from the right side and the tags from the left side
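
The last step of the idea above (read off the alignments, taking surface forms from one side and tags from the other) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `parse_alignment` and `transfer_tags` are made up for this sketch, the tag strings are illustrative rather than taken from any real analyser, the sentence pairs are invented, and word alignments are assumed to arrive as `i-j` index pairs (source index, target index), one line per sentence pair.

```python
from collections import Counter, defaultdict


def parse_alignment(line):
    """Parse 'i-j' pairs (source index, target index) from one line."""
    return [tuple(int(n) for n in pair.split("-")) for pair in line.split()]


def transfer_tags(corpus):
    """Count, for each target surface form, the source tags aligned to it."""
    counts = defaultdict(Counter)
    for src_tags, tgt_tokens, alignment in corpus:
        for src_idx, tgt_idx in parse_alignment(alignment):
            counts[tgt_tokens[tgt_idx].lower()][src_tags[src_idx]] += 1
    return counts


# Hypothetical Danish--Swedish sentence pairs: tags come from the tagged
# Danish side, surface forms from the Swedish side. All values are invented.
corpus = [
    (["<n><nt><sg><def>", "<vbser><pres>", "<adj><nt><sg>"],
     ["Huset", "är", "stort"],
     "0-0 1-1 2-2"),
    (["<prn><p3><sg>", "<vblex><pres>", "<n><nt><sg><def>"],
     ["Hon", "ser", "huset"],
     "0-0 1-1 2-2"),
]

candidates = transfer_tags(corpus)
for surface, tags in sorted(candidates.items()):
    # Show the most frequently transferred tag string for each surface form.
    print(surface, tags.most_common(1)[0][0])
```

Counting the transferred tags, rather than keeping only the first one seen, leaves room to discard rare pairings that come from alignment errors before the candidates are matched against paradigm templates.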