Shallow syntactic function labeller

Architecture

1. The labeller takes a string in Apertium stream format with morphological tags:

^vino<n><m><sg>$ = INPUT

2. Parses it into a sequence of morphological tags:

<n><m><sg>

3. Loads the model for this language, which is kept in the same directory and stored as either a .json or a .pkl file.

4. The algorithm analyzes the tag sequence and outputs a sequence of syntactic function tags:

<@nsubj>

5. The labeller attaches the predicted labels to the original string:

^vino<n><m><sg><@nsubj>$ = OUTPUT

The end result will therefore consist of two parts: the labeller module itself and a file containing the model.
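
Steps 1, 2, 4 and 5 can be sketched in Python as follows. This is only a minimal illustration, not the actual module: the names parse_units, dummy_predict and label_stream are hypothetical, the "<@x>" fallback label is a placeholder, and dummy_predict merely stands in for the trained model that would be loaded from the .json or .pkl file. The sketch also assumes that the input is already disambiguated (one analysis per ^...$ unit) and contains no escaped characters.

```python
import re

# One lexical unit in Apertium stream format: ^lemma<tag><tag>...$
# Assumption: one analysis per unit, no escaped characters.
UNIT_RE = re.compile(r"\^([^<$]*)((?:<[^>]+>)*)\$")
TAG_RE = re.compile(r"<[^>]+>")


def parse_units(stream):
    """Return (lemma, [tags]) pairs for every ^...$ unit in the stream."""
    return [(m.group(1), TAG_RE.findall(m.group(2)))
            for m in UNIT_RE.finditer(stream)]


def dummy_predict(tag_sequences):
    """Hypothetical stand-in for the trained model: one label per unit.

    "<@x>" is a placeholder label, not a real tag from any tagset.
    """
    return ["<@nsubj>" if "<n>" in tags else "<@x>" for tags in tag_sequences]


def label_stream(stream, predict=dummy_predict):
    """Insert one predicted syntactic label before the closing $ of each unit."""
    units = parse_units(stream)
    labels = iter(predict([tags for _, tags in units]))
    # re.sub visits the units in the same left-to-right order as parse_units.
    return UNIT_RE.sub(lambda m: m.group(0)[:-1] + next(labels) + "$", stream)


print(parse_units("^vino<n><m><sg>$"))   # lemma plus its morphological tags
print(label_stream("^vino<n><m><sg>$"))  # unit with the label appended
```

Passing the model in as the predict argument keeps the stream handling independent of how the model is stored, so the same wrapper would work for a .json or a .pkl model.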

Week	Dates	To do
1	30th May – 5th June	Handling possible discrepancies between tagsets; writing a script for parsing the Sami corpus.
2	6th June – 12th June
3	13th June – 19th June
4	20th June – 26th June
First evaluation	Ready-to-use datasets
5	27th June – 3rd July	Building the model
6	4th July – 10th July	Training the classifier; evaluating the quality of the prototype
7	11th July – 17th July	Further training; working on improvements to the model
8	18th July – 24th July	Final testing; writing a script that applies labels to the original string in Apertium stream format
Second evaluation	Well-trained model, at least for North Sami
9	25th July – 31st July	Assembling all parts of the labeller; replacing the syntax labelling part of the sme-nob CG module with the machine-learned module to test it
10	1st August – 7th August	Replacing the syntax labelling part of the sme-nob CG module with the machine-learned module to test it
11	8th August – 14th August	Testing; fixing bugs
12	15th August – 21st August	Cleaning up the code; writing documentation
Final evaluation	The prototype shallow syntactic function labeller.