Difference between revisions of "Corpus based preposition selection - HOWTO"

Revision as of 16:43, 20 August 2012

The general algorithm for corpus-based preposition selection is as follows:

* Download a parallel corpus
* Extract patterns which contain prepositions from the source-language corpus
* Align the patterns to their translations in the target-language corpus
* Extract the features and the label (the correct preposition from the target-language corpus) for classification
* Train a model
* Use the trained model in the pipeline
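
The steps above can be illustrated with a minimal sketch. It is not the toolkit's implementation: the toy aligned examples, the feature set (source preposition, previous word, next word), the frequency-based backoff classifier, and every function name are assumptions chosen only to show the shape of the data flow from aligned patterns to a prediction.

```python
from collections import Counter, defaultdict

# Hypothetical toy data standing in for the output of the alignment step:
# (source tokens, index of the source preposition, target-language preposition).
ALIGNED_EXAMPLES = [
    ("he lives in Paris".split(), 2, "à"),
    ("she works in Paris".split(), 2, "à"),
    ("the book is in the box".split(), 3, "dans"),
    ("the cat sleeps in the house".split(), 3, "dans"),
    ("the cup is on the table".split(), 3, "sur"),
]


def extract_features(tokens, prep_index):
    """Return (preposition, previous word, next word), lowercased."""
    prev_word = tokens[prep_index - 1].lower() if prep_index > 0 else "<s>"
    has_next = prep_index + 1 < len(tokens)
    next_word = tokens[prep_index + 1].lower() if has_next else "</s>"
    return tokens[prep_index].lower(), prev_word, next_word


def train(examples):
    """Count target prepositions per feature context, at three levels of detail."""
    full = defaultdict(Counter)      # (prep, prev, next) -> label counts
    backoff = defaultdict(Counter)   # (prep, next) -> label counts
    by_prep = defaultdict(Counter)   # prep -> label counts
    for tokens, prep_index, label in examples:
        prep, prev_word, next_word = extract_features(tokens, prep_index)
        full[(prep, prev_word, next_word)][label] += 1
        backoff[(prep, next_word)][label] += 1
        by_prep[prep][label] += 1
    return full, backoff, by_prep


def predict(model, tokens, prep_index):
    """Pick the most frequent label from the most specific context seen in training."""
    full, backoff, by_prep = model
    prep, prev_word, next_word = extract_features(tokens, prep_index)
    lookups = (
        (full, (prep, prev_word, next_word)),
        (backoff, (prep, next_word)),
        (by_prep, prep),
    )
    for table, key in lookups:
        if key in table:
            return table[key].most_common(1)[0][0]
    return None  # preposition never seen in training


model = train(ALIGNED_EXAMPLES)
print(predict(model, "they live in Paris".split(), 2))
print(predict(model, "the key is in the drawer".split(), 3))
```

In a real setting the counting classifier would be replaced by whatever model the toolkit trains, and the examples would come from the extraction and alignment steps rather than a hand-written list.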

The general toolkit for performing these tasks can be found [https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/fpetkovski/morph-parser/ here].

For the purpose of extracting training data for your classifier, you can use the preposition-extraction tool.

@@ Line 7: / Line 7: @@
 * Use the trained model in the pipeline
-The general toolkit for performing these tasks can be found here: <br />
+The general toolkit for performing these tasks can be found [https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/fpetkovski/morph-parser/ here].
-[https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/fpetkovski/morph-parser/ toolkit]
 === Extracting training data for your classifier ===
 For the purpose of extracting training data for your classifier, you can use the preposition-extraction tool.