Difference between revisions of "Corpus based preposition selection - HOWTO"
Revision as of 16:43, 20 August 2012
The general algorithm for corpus-based preposition selection is as follows:
- Download a parallel corpus
- Extract patterns which contain prepositions from the source-language corpus
- Align the patterns to their translations in the target-language corpus
- Extract the features and the label (the correct preposition in the target-language corpus) for classification
- Train a model
- Use the trained model in the pipeline
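The general toolkit for performing these tasks is available in the Apertium SVN repository: https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/fpetkovski/morph-parser/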
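To make the steps above more concrete, here is a minimal sketch in Python of the feature/label extraction and training steps. It is not taken from the toolkit: the function names, the positional-token features, the Spanish-English toy data and the simple vote-counting classifier are all assumptions chosen for illustration, and a real setup would likely use a proper classifier and richer features.

```python
from collections import Counter, defaultdict

# Hypothetical closed list of target-language prepositions (assumption).
TARGET_PREPOSITIONS = {"in", "on", "at", "about"}


def extract_example(source_pattern, target_pattern):
    """Turn one aligned pattern pair into (features, label), or None."""
    # Label: the first target-language token that is a known preposition.
    label = next((t for t in target_pattern if t in TARGET_PREPOSITIONS), None)
    if label is None:
        return None
    # Features: the source tokens, each tagged with its position.
    features = tuple(f"{i}:{tok}" for i, tok in enumerate(source_pattern))
    return features, label


def train(examples):
    """Count how often each feature co-occurs with each label."""
    counts = defaultdict(Counter)
    for features, label in examples:
        for feat in features:
            counts[feat][label] += 1
    return counts


def predict(model, source_pattern):
    """Sum the label votes of all features; return the best label or None."""
    votes = Counter()
    for i, tok in enumerate(source_pattern):
        votes.update(model.get(f"{i}:{tok}", {}))
    return votes.most_common(1)[0][0] if votes else None


# Invented, already-aligned toy patterns (source tokens, target tokens).
aligned_pairs = [
    (["vive", "en", "Madrid"], ["lives", "in", "Madrid"]),
    (["vive", "en", "Lima"], ["lives", "in", "Lima"]),
    (["libro", "en", "mesa"], ["book", "on", "table"]),
    (["vaso", "en", "mesa"], ["glass", "on", "table"]),
]

examples = [ex for ex in (extract_example(s, t) for s, t in aligned_pairs) if ex]
model = train(examples)

print(predict(model, ["vive", "en", "Quito"]))
print(predict(model, ["plato", "en", "mesa"]))
```

The point of the sketch is that the same source preposition ("en") gets different labels depending on its context, which is what the classifier has to learn from the aligned corpus.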
Extracting training data for your classifier
To extract training data for your classifier, you can use the preposition-extraction tool.
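The tool's interface is not described here, so the following is only a rough Python sketch of the underlying idea: find each preposition in a tokenized source-language sentence and keep a small window of context around it. The preposition list, the window size and the function name are assumptions for illustration, not the tool's actual behaviour.

```python
# Hypothetical source-language preposition list (assumption, Spanish toy data).
SOURCE_PREPOSITIONS = {"en", "a", "de"}


def extract_patterns(tokens, window=1):
    """Yield (position, pattern) for every preposition found in tokens.

    The pattern is the preposition plus up to `window` tokens on each side.
    """
    for i, tok in enumerate(tokens):
        if tok.lower() in SOURCE_PREPOSITIONS:
            start = max(0, i - window)
            yield i, tokens[start:i + window + 1]


sentence = "Ana vive en Madrid y piensa en Lima".split()
patterns = list(extract_patterns(sentence))

for position, pattern in patterns:
    print(position, " ".join(pattern))
```

Each extracted pattern would then be aligned to its translation in the target-language corpus to obtain the label.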