Difference between revisions of "User:Pyry/Sandbox"

From Apertium




Revision as of 16:40, 27 October 2010

Challenges in Finnish to North Sámi rule-based machine translation
Translating the Bible from Finnish to North Sámi
Trials and tribulations in Finnish to North Sámi rule-based machine translation


Submission deadline: Nov 8

(13:38:36) francis: 1) underspecification in omorfi (e.g. cc/cs vs. conj)
(13:39:08) francis: 2) differing grammatical traditions (acc/gen??) merge in omorfi but not in GT
(13:39:28) francis: 3) overgeneration in sme
(13:40:27) francis: 4) sometimes it wasn't clear when words were assigned to defective paradigms (e.g. some pronouns?? didn't decline in some cases)
(13:41:20) francis: btw, we actually had a three way tagset disjunct
(13:41:29) francis: between omorfi, fred's CG and giellatekno
(13:44:30) ryan: one of the larger problems i thought was figuring out what exactly we were trying to do with compound words
(13:45:18) ryan: sankari probably 
(13:45:20) francis: i have one
(13:45:21) ryan: means hero 
(13:45:26) ryan: but it ended up with a compound analysis 
(13:45:27) ryan: san# kari 
(13:45:29) francis: saamelainen
(13:45:32) ryan: ooh, that too
(13:45:43) ryan: even more related ;) 
(13:46:18) francis: 6) differing lexicalisation

(13:46:25) francis: kritiserema	kritiseret+V+TV+Der3+Der/n+N+Sg+Acc
vs. kritisoinnin	kritisointi+N+Sg+Gen
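The lexicalisation mismatch in the pair above can be made concrete with a small sketch. It assumes analyses are plain strings of the form lemma+Tag+Tag, as in the two examples; the helper names `parse_analysis` and `final_reading` and the `POS_TAGS` set are hypothetical and are not part of omorfi, Giellatekno or Apertium.

```python
# Hypothetical part-of-speech inventory, used only for this illustration.
POS_TAGS = {"N", "V", "A"}


def parse_analysis(analysis):
    """Split a 'lemma+Tag1+Tag2' string into (lemma, [tags])."""
    lemma, *tags = analysis.split("+")
    return lemma, tags


def final_reading(tags):
    """Return the tags from the last part-of-speech tag onwards.

    For a derived form this drops the derivation chain and keeps only
    the resulting word class and its inflection.
    """
    last = max(i for i, tag in enumerate(tags) if tag in POS_TAGS)
    return tags[last:]


# The two analyses quoted in the chat log above.
sme_lemma, sme_tags = parse_analysis("kritiseret+V+TV+Der3+Der/n+N+Sg+Acc")
fin_lemma, fin_tags = parse_analysis("kritisointi+N+Sg+Gen")

print(sme_lemma, final_reading(sme_tags))
print(fin_lemma, final_reading(fin_tags))
```

On this pair, the North Sámi analysis starts from a verb lemma and reaches a noun through derivation tags, while the Finnish analysis starts from a lexicalised noun. Once the derivation chain is dropped, the two readings still differ in case (Acc vs. Gen), which is the tagset divergence raised in point 2.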



  • MT between agglutinative closely-related languages: Turkish--{Tatar,Turkmen,...}



  • HFST
  • Constraint Grammar
  • Apertium
Problematic aspects



Future work