Difference between revisions of "Ideas for Google Summer of Code/Unsupervised weighting of automata"

From Apertium

Revision as of 17:05, 29 March 2017


Coding challenge

  • Install HFST
  • Install lttoolbox
  • Define an evaluation metric (talk to your mentor)
  • Perform a baseline experiment using a tagged corpus:
    • Select a language
    • Split the corpus into 90% training and 10% testing (or use an existing train/test split)
    • Use the Apertium morphological analyser to analyse the test data
    • Rank the analyses produced for each test form, using information learned from the training data
    • Compare this ranking to the default order from the transducer, and to a "random" ranking
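
The baseline experiment above could be sketched roughly as follows. This is a minimal illustration under several assumptions, not the expected solution. The tagged corpus is assumed to be a list of sentences, each a list of (surface form, gold analysis) pairs. `TOY_ANALYSER` is a hypothetical stand-in for the Apertium morphological analyser. Ranking by raw training frequency and scoring by top-1 accuracy are placeholder choices; the real metric is the one agreed with the mentor.

```python
import random
from collections import Counter

# Hypothetical stand-in for the analyser: surface form -> analyses in the
# transducer's default order. Toy data, not real Apertium output.
TOY_ANALYSER = {
    "saw": ["saw<n><sg>", "see<vblex><past>"],
    "dogs": ["dog<n><pl>"],
}


def analyse(form):
    """Return the candidate analyses for a form, in default order."""
    return list(TOY_ANALYSER.get(form, []))


def split_corpus(sentences, train_fraction=0.9):
    """Split a list of sentences into (train, test) without shuffling."""
    cut = int(len(sentences) * train_fraction)
    return sentences[:cut], sentences[cut:]


def count_analyses(train):
    """Count how often each gold analysis occurs in the training data."""
    counts = Counter()
    for sentence in train:
        for _form, gold in sentence:
            counts[gold] += 1
    return counts


def frequency_ranker(counts):
    """Rank candidates by training frequency, most frequent first."""
    def rank(candidates):
        # sorted() is stable, so ties keep the analyser's default order.
        return sorted(candidates, key=lambda a: -counts[a])
    return rank


def default_ranker(candidates):
    """Keep the order produced by the analyser."""
    return list(candidates)


def random_ranker(seed=0):
    """Shuffle candidates with a seeded generator for repeatability."""
    rng = random.Random(seed)

    def rank(candidates):
        return rng.sample(candidates, len(candidates))
    return rank


def top1_accuracy(test, rank):
    """Fraction of test tokens whose top-ranked analysis matches the gold one."""
    correct = total = 0
    for sentence in test:
        for form, gold in sentence:
            ranked = rank(analyse(form))
            if ranked and ranked[0] == gold:
                correct += 1
            total += 1
    return correct / total if total else 0.0


# Toy corpus: ten identical two-token sentences.
corpus = [[("dogs", "dog<n><pl>"), ("saw", "see<vblex><past>")] for _ in range(10)]
train, test = split_corpus(corpus)
counts = count_analyses(train)

for name, ranker in [
    ("default", default_ranker),
    ("random", random_ranker(seed=0)),
    ("frequency", frequency_ranker(counts)),
]:
    print(name, top1_accuracy(test, ranker))
```

In a real experiment `analyse` would call the compiled analyser (for example through lttoolbox or HFST) instead of reading a dictionary, and the random baseline would normally be averaged over several seeds.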