Ideas for Google Summer of Code/Unsupervised weighting of automata
Coding challenge
- Install HFST
- Install lttoolbox
- Define an evaluation metric (talk to your mentor)
- Perform a baseline experiment using a tagged corpus:
  - Select a language
  - Split the corpus into 90% training and 10% testing (or use an existing train/test split)
  - Use the Apertium morphological analyser to analyse the test data
  - Use the training data to rank the analyses produced
  - Compare this ranking to the default order from the transducer, and to a "random" ranking
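
The baseline experiment above could be sketched roughly as follows. This is a minimal illustration, not a prescribed implementation: the function names are hypothetical, mean reciprocal rank is used only as a stand-in for whatever metric you agree on with your mentor, and the candidate analyses are assumed to have already been obtained from the analyser as plain lists of strings. The toy data at the end is invented purely to show the comparison.

```python
import random
from collections import Counter

def split_corpus(items, train_fraction=0.9, seed=0):
    """Shuffle a corpus and split it into training and testing parts."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]

def train_counts(train_tokens):
    """Count each (surface form, analysis) pair seen in the training data."""
    return Counter(train_tokens)

def rank_by_counts(form, analyses, counts):
    """Order analyses by training frequency; ties keep the analyser's order."""
    return sorted(analyses, key=lambda a: -counts[(form, a)])

def mean_reciprocal_rank(test_items, ranker):
    """test_items: list of (form, gold analysis, candidate analyses).

    Assumed metric: a gold analysis missing from the candidates scores 0.
    """
    total = 0.0
    for form, gold, candidates in test_items:
        ranked = ranker(form, candidates)
        if gold in ranked:
            total += 1.0 / (ranked.index(gold) + 1)
    return total / len(test_items) if test_items else 0.0

# Toy data (hypothetical), standing in for a tagged corpus and analyser output.
train = [("wind", "wind<n><sg>"), ("wind", "wind<n><sg>"),
         ("wind", "wind<vblex><inf>")]
test = [("wind", "wind<n><sg>", ["wind<vblex><inf>", "wind<n><sg>"])]

counts = train_counts(train)
rng = random.Random(0)

# Three rankings to compare: trained, default transducer order, random.
trained = mean_reciprocal_rank(test, lambda f, c: rank_by_counts(f, c, counts))
default = mean_reciprocal_rank(test, lambda f, c: list(c))
rand = mean_reciprocal_rank(test, lambda f, c: rng.sample(c, len(c)))
```

On real data the random baseline is best averaged over several seeds, since a single shuffle can land above or below the default order by chance.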