Difference between revisions of "Ideas for Google Summer of Code/Unsupervised weighting of automata"
Line 11:
  ** Use the Apertium morphological analyser to analyse the test data
  ** Rank the analyses produced using the training data
− ** Compare this ranking to the default order from the transducer, and to a "random" ranking
+ ** Compare this ranking to the default order from the transducer, and to a "random" ranking using your metric
Revision as of 17:05, 29 March 2017
Coding challenge
- Install HFST
- Install lttoolbox
- Define an evaluation metric --- talk to your mentor
- Perform a baseline experiment using a tagged corpus:
  - Select a language
  - Split the corpus into 90% training, 10% testing (or use an existing train/test split)
  - Use the Apertium morphological analyser to analyse the test data
  - Rank the analyses produced using the training data
  - Compare this ranking to the default order from the transducer, and to a "random" ranking, using your metric
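
The baseline experiment could be sketched roughly as follows. This is only an illustration under several assumptions, not the expected solution: the corpus is taken to be a list of (surface form, gold analysis) pairs, `analyse` is a hypothetical stand-in for a call to the Apertium analyser that returns candidate analyses in the transducer's default order, the ranking is a simple unigram count of analyses in the training data, and mean reciprocal rank is used as a placeholder metric (the actual metric should be agreed with your mentor). All function names are made up for this sketch.

```python
import random
from collections import Counter


def split_corpus(tokens, train_fraction=0.9):
    """Split a list of (surface, gold_analysis) pairs into train and test parts."""
    cut = int(len(tokens) * train_fraction)
    return tokens[:cut], tokens[cut:]


def count_analyses(train):
    """Count how often each gold analysis occurs in the training data."""
    return Counter(analysis for _, analysis in train)


def rank_by_frequency(candidates, counts):
    """Most frequent analysis first; sorted() is stable, so ties keep default order."""
    return sorted(candidates, key=lambda a: -counts[a])


def rank_randomly(candidates, rng):
    """Return the candidates in a shuffled order (the "random" baseline)."""
    shuffled = list(candidates)
    rng.shuffle(shuffled)
    return shuffled


def mean_reciprocal_rank(test, analyse, rank):
    """Placeholder metric (an assumption): average of 1/rank of the gold analysis.

    Tokens whose gold analysis is not among the candidates contribute 0.
    """
    total = 0.0
    for surface, gold in test:
        ranked = rank(analyse(surface))
        if gold in ranked:
            total += 1.0 / (ranked.index(gold) + 1)
    return total / len(test) if test else 0.0


def run_baseline(corpus, analyse, seed=0):
    """Score the frequency ranking against the default and random orders."""
    train, test = split_corpus(corpus)
    counts = count_analyses(train)
    rng = random.Random(seed)
    return {
        "frequency": mean_reciprocal_rank(
            test, analyse, lambda c: rank_by_frequency(c, counts)),
        "default": mean_reciprocal_rank(test, analyse, list),
        "random": mean_reciprocal_rank(
            test, analyse, lambda c: rank_randomly(c, rng)),
    }


# Toy data, invented purely for illustration (not real analyser output).
TOY_ANALYSES = {
    "saw": ["saw<n><sg>", "see<vblex><past>"],
    "dogs": ["dog<n><pl>", "dog<vblex><pres><p3><sg>"],
}
TOY_CORPUS = [("saw", "see<vblex><past>"), ("dogs", "dog<n><pl>")] * 10


def toy_analyse(surface):
    """Hypothetical analyser: look the surface form up in the toy table."""
    return TOY_ANALYSES.get(surface, [])


if __name__ == "__main__":
    for name, score in run_baseline(TOY_CORPUS, toy_analyse).items():
        print(f"{name}: {score:.3f}")
```

In a real experiment, `toy_analyse` would be replaced by something that runs the compiled analyser over the test data and parses its output, and the unigram counts would be replaced by whatever weighting scheme is being evaluated.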