Difference between revisions of "Ideas for Google Summer of Code/Unsupervised weighting of automata"
TommiPirinen (talk | contribs) (Created page with "foo")
Revision as of 17:02, 29 March 2017
Coding challenge
- Install HFST
- Install lttoolbox
- Define an evaluation metric
- Perform a baseline experiment using a tagged corpus:
  - Select a language
  - Split the corpus into 90% training, 10% testing (or use existing test/train split)
  - Use the Apertium morphological analyser to analyse the test data
  - Rank the analyses produced using the training data
  - Compare this ranking to the default order from the transducer, and to a "random" ranking
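
The baseline experiment above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a prescribed implementation: it assumes the analyses have already been obtained from the analyser and parsed into Apertium-style `lemma<tag>` strings, it ranks by the raw frequency of each analysis string in the training data, and it uses mean reciprocal rank as one possible evaluation metric. All function names (`split_corpus`, `train_counts`, `rank_by_frequency`, `rank_randomly`, `mean_reciprocal_rank`) are hypothetical helpers introduced for this example.

```python
import random
from collections import Counter


def split_corpus(items, train_fraction=0.9, seed=0):
    """Shuffle the items reproducibly and split them into (train, test)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]


def train_counts(train_tokens):
    """Count how often each gold analysis occurs in (form, analysis) pairs."""
    return Counter(analysis for _form, analysis in train_tokens)


def rank_by_frequency(candidates, counts):
    """Order candidates by descending training count.

    sorted() is stable, so ties keep the transducer's original order.
    Analyses never seen in training get a count of 0.
    """
    return sorted(candidates, key=lambda a: -counts[a])


def rank_randomly(candidates, seed=0):
    """Return the candidates in a reproducible random order."""
    return random.Random(seed).sample(candidates, len(candidates))


def mean_reciprocal_rank(ranked_lists, gold):
    """Average of 1/rank of the gold analysis; contributes 0 if it is absent."""
    if not gold:
        return 0.0
    total = 0.0
    for ranked, correct in zip(ranked_lists, gold):
        if correct in ranked:
            total += 1.0 / (ranked.index(correct) + 1)
    return total / len(gold)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    train = [
        ("wound", "wound<n><sg>"),
        ("wound", "wound<n><sg>"),
        ("wound", "wind<vblex><past>"),
    ]
    counts = train_counts(train)

    # Candidate analyses for one test token, in transducer order.
    default_order = ["wind<vblex><past>", "wound<n><sg>"]
    gold = ["wound<n><sg>"]

    print("default:", mean_reciprocal_rank([default_order], gold))
    print("trained:", mean_reciprocal_rank(
        [rank_by_frequency(default_order, counts)], gold))
    print("random: ", mean_reciprocal_rank(
        [rank_randomly(default_order)], gold))
```

Counting full analysis strings is the simplest choice but ignores unseen lemmas; counting only the tag sequence is an alternative worth comparing. The same metric is applied to all three rankings so that the trained ranking can be compared directly with the default and random orders.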