Ideas for Google Summer of Code/Shallow-function labeller

From Apertium
<spectre> deltamachine_, yes, sorry that is my fault, i had the idea when falling asleep and didn't write much more 
<spectre> so 
<spectre> a dependency parser builds a whole tree and assigns labels to the tree
<spectre> a shallow-function labeller basically just assigns labels to words, without the tree
<spectre> e.g. a function-labelled sentence might look something like:
<spectre>  
<spectre> I/@nsubj saw/@fmv the/@mod cat/@obj
<spectre>  
<spectre> so you get the function of the word, but not the exact tree structure
<spectre> it's an easier task
<spectre> in some ways
<spectre> because you don't have to resolve e.g. coordination ambiguity
<spectre> http://www.aclweb.org/anthology/E95-1029
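
To make the contrast with a dependency parser concrete, here is a minimal sketch that reads one sentence in CoNLL-U format and keeps only the label of each word, discarding the head column and with it the tree. The function name `shallow_labels` is hypothetical, and the labels are UD relation names (`root`, `det`), which differ from the `@fmv` and `@mod` labels in the chat example above; which label set the project should use is not settled on this page.

```python
def shallow_labels(conllu_sentence):
    """Return 'form/@deprel' for each word, ignoring the tree structure."""
    tokens = []
    for line in conllu_sentence.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        form, deprel = cols[1], cols[7]  # FORM and DEPREL columns
        tokens.append(form + "/@" + deprel)
    return " ".join(tokens)


# Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
rows = [
    ["1", "I", "I", "PRON", "_", "_", "2", "nsubj", "_", "_"],
    ["2", "saw", "see", "VERB", "_", "_", "0", "root", "_", "_"],
    ["3", "the", "the", "DET", "_", "_", "4", "det", "_", "_"],
    ["4", "cat", "cat", "NOUN", "_", "_", "2", "obj", "_", "_"],
]
example = "\n".join("\t".join(r) for r in rows)
print(shallow_labels(example))  # I/@nsubj saw/@root the/@det cat/@obj
```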

Coding challenge

  • Write a script that takes a dependency treebank in UD (CoNLL-U) format and "flattens" it, that is, applies the following transformations:
    • Words with the @conj relation take the label of their head
    • Words with the @parataxis relation take the label of their head
    • ...
  • Write a script that takes a sentence in Apertium stream format and assigns each surface form its most frequent label in the labelled corpus.
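
The two scripts could be sketched roughly as follows. This is one possible reading of the challenge, not a reference solution. The function names (`flatten`, `train`, `label_stream`) are hypothetical. The sketch also assumes that tokens arrive as `(id, form, head, deprel)` tuples, that surface forms are lowercased before counting, that the input stream has one analysis per lexical unit and no escaped characters, that the label is appended as an extra tag such as `<@obj>`, and that unseen forms get a placeholder label `@x`. None of these choices is specified on this page.

```python
import re
from collections import Counter, defaultdict

INHERIT = {"conj", "parataxis"}  # relations that take the label of their head


def flatten(sentence):
    """sentence: list of (id, form, head, deprel); returns (form, label) pairs."""
    by_id = {tok[0]: tok for tok in sentence}
    out = []
    for _tid, form, head, deprel in sentence:
        label, cur, seen = deprel, head, set()
        # Climb to the head while the label is one that must be inherited;
        # 'seen' guards against cycles in malformed input.
        while label in INHERIT and cur in by_id and cur not in seen:
            seen.add(cur)
            _, _, cur, label = by_id[cur]
        out.append((form, "@" + label))
    return out


def train(labelled_sentences):
    """Map each lowercased surface form to its most frequent label."""
    counts = defaultdict(Counter)
    for sent in labelled_sentences:
        for form, label in sent:
            counts[form.lower()][label] += 1
    return {form: c.most_common(1)[0][0] for form, c in counts.items()}


# One lexical unit: ^surface/analysis$ (assumes a single analysis, no escapes)
LU = re.compile(r"\^([^/$]*)/([^$]*)\$")


def label_stream(stream, model, unknown="@x"):
    """Append the most frequent label of each surface form as an extra tag."""
    def repl(m):
        surface, analysis = m.group(1), m.group(2)
        label = model.get(surface.lower(), unknown)
        return "^{}/{}<{}>$".format(surface, analysis, label)
    return LU.sub(repl, stream)


sent = [
    (1, "I", 2, "nsubj"),
    (2, "saw", 0, "root"),
    (3, "cats", 2, "obj"),
    (4, "and", 5, "cc"),
    (5, "dogs", 3, "conj"),  # conj of 'cats', so it becomes @obj
]
model = train([flatten(sent)])
print(label_stream("^I/prpers<prn><subj>$ ^dogs/dog<n><pl>$", model))
```

In this toy run, `dogs` is attached to `cats` with `conj`, so flattening gives it the label `@obj`, and the baseline then tags `dogs` with `<@obj>` in the stream. A real script would also need to handle ambiguous lexical units and the escaping rules of the stream format.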