Ideas for Google Summer of Code/Unsupervised weighting of automata
Revision as of 17:02, 29 March 2017 by Francis Tyers (talk | contribs)
Coding challenge
- Install HFST
- Install lttoolbox
- Define an evaluation metric
- Perform a baseline experiment using a tagged corpus:
  - Select a language
  - Split the corpus into 90% training, 10% testing (or use existing test/train split)
  - Use the Apertium morphological analyser to analyse the test data
  - Rank the analyses produced using the training data
  - Compare this ranking to the default order from the transducer, and to a "random" ranking
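
The baseline experiment above could be sketched roughly as follows. This is only an illustration under several assumptions, not a prescribed solution: the analyses are assumed to be already available as Apertium-style strings (the toy data below is made up, and no HFST or lttoolbox calls are shown), the "ranking using the training data" is assumed to be a simple frequency count of gold analyses, and the evaluation metric is assumed to be top-1 accuracy plus mean reciprocal rank. All function names (`split_corpus`, `train_counts`, `rank_by_frequency`, `rank_randomly`, `evaluate`) are hypothetical.

```python
import random
from collections import Counter


def split_corpus(tokens, train_fraction=0.9):
    """Split a list of tagged tokens into (training, testing) parts."""
    cut = int(len(tokens) * train_fraction)
    return tokens[:cut], tokens[cut:]


def train_counts(training):
    """Count how often each gold analysis occurs in the training data."""
    return Counter(gold for _surface, gold in training)


def rank_by_frequency(candidates, counts):
    """Most frequent analysis first; sorted() is stable, so ties keep
    the transducer's default order."""
    return sorted(candidates, key=lambda analysis: -counts[analysis])


def rank_randomly(candidates, rng):
    """Return the candidates in a random order (the "random" baseline)."""
    shuffled = list(candidates)
    rng.shuffle(shuffled)
    return shuffled


def evaluate(test, ranker):
    """Return (top-1 accuracy, mean reciprocal rank) for a ranking function."""
    top1 = 0
    reciprocal_rank_sum = 0.0
    for candidates, gold in test:
        ranked = ranker(candidates)
        if ranked and ranked[0] == gold:
            top1 += 1
        if gold in ranked:
            reciprocal_rank_sum += 1.0 / (ranked.index(gold) + 1)
    return top1 / len(test), reciprocal_rank_sum / len(test)


# Made-up toy data: (surface form, gold analysis) pairs for training.
TRAINING = [
    ("wound", "wound<n><sg>"),
    ("wound", "wound<n><sg>"),
    ("wound", "wind<vblex><past>"),
    ("saw", "see<vblex><past>"),
    ("saw", "see<vblex><past>"),
    ("saw", "saw<n><sg>"),
]

# Made-up test items: (analyses in assumed transducer order, gold analysis).
TEST = [
    (["wind<vblex><past>", "wound<n><sg>"], "wound<n><sg>"),
    (["saw<n><sg>", "see<vblex><past>"], "see<vblex><past>"),
]

COUNTS = train_counts(TRAINING)
RNG = random.Random(0)  # fixed seed so the random baseline is repeatable

print("default  :", evaluate(TEST, lambda c: list(c)))
print("random   :", evaluate(TEST, lambda c: rank_randomly(c, RNG)))
print("frequency:", evaluate(TEST, lambda c: rank_by_frequency(c, COUNTS)))
```

In a real experiment the candidate lists would come from running the analyser over the test tokens, and the random baseline would normally be averaged over several seeds rather than reported from a single shuffle.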