Difference between revisions of "Corpus based preposition selection - HOWTO"

Revision as of 16:40, 20 August 2012

The general algorithm for performing corpus-based preposition selection is as follows:

* Download a parallel corpus
* Extract patterns which contain prepositions from the source-language corpus
* Align the patterns to their translations in the target-language corpus
* Extract the features and label (the correct preposition from the target-language corpus) for classification
* Train a model
* Use the trained model in the pipeline

The general toolkit for performing these tasks can be found here: apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/fpetkovski/morph-parser/
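
The last three steps of the algorithm (feature extraction, training, and use of the model) can be illustrated with a minimal sketch. It does not use the toolkit linked above and is not the method that toolkit implements. It assumes the patterns have already been extracted and aligned, represents each source pattern as a hypothetical (head word, preposition, dependent word) triple, and uses a simple count-based back-off model as a stand-in classifier. The training data and the target-side labels <code>prep_A</code>, <code>prep_B</code> and <code>prep_C</code> are invented placeholders.

```python
from collections import Counter, defaultdict

# Hypothetical aligned data: (source pattern, target-language preposition).
# A source pattern is (head word, source preposition, dependent word).
# The labels are placeholders, not real target-language prepositions.
TRAINING = [
    (("think", "about", "problem"), "prep_A"),
    (("think", "about", "future"), "prep_A"),
    (("talk", "about", "problem"), "prep_B"),
    (("depend", "on", "weather"), "prep_C"),
]


def backoff_keys(pattern):
    """Return feature keys for a pattern, from most to least specific."""
    head, prep, dep = pattern
    return [
        ("full", head, prep, dep),
        ("head+prep", head, prep),
        ("prep", prep),
    ]


def train(examples):
    """Count how often each label occurs with each feature key."""
    model = defaultdict(Counter)
    for pattern, label in examples:
        for key in backoff_keys(pattern):
            model[key][label] += 1
    return model


def predict(model, pattern, default=None):
    """Pick the most frequent label for the most specific key seen in training."""
    for key in backoff_keys(pattern):
        if key in model:
            return model[key].most_common(1)[0][0]
    return default


model = train(TRAINING)
print(predict(model, ("think", "about", "plan")))
print(predict(model, ("talk", "about", "plan")))
```

In a real pipeline the features would more likely come from morphological analyses than from surface words, and the counts would be replaced by a trained classifier. The back-off order here is only one possible design choice.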

=== Extracting training data for your classifier ===

To extract training data for your classifier, you can use the preposition-extraction tool.

@@ Line 1: / Line 1: @@
+The general algorithm for performing corpus based preposition selection is as follows:
+* Download a parallel corpus
+* Extract patterns which contain prepositions from the source-language corpus
+* Align the patterns to their translations in the target-language corpus
+* Extract the features and label (the correct preposition from the target-language corpus) for classification.
+* Train a model
+* Use the trained model in the pipeline
+The general toolkit for performing these tasks can be found here:
+apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/fpetkovski/morph-parser/
 === Extracting training data for your classifier ===
+For the purpose of extracting training data for your classifier, you can use the preposition-extraction tool.
