Difference between revisions of "User:Kevin Scannell"
(update for 2010)
Revision as of 14:26, 11 March 2010
I provided a lot of the data currently in Apertium for [[Irish to Scottish Gaelic]] machine translation, which was taken from an ad hoc system for this language pair that I created in 2005. I'd be interested in mentoring a GSOC student (or anyone else) to help finish this work.
Also, I recently finished some work on statistical diacritic restoration that would make a suitable GSOC project as well; see [http://borel.slu.edu/pub/lre.pdf Statistical Unicodification of African Languages].
For about ten years I have been working on developing language technology for under-resourced languages around the world. I've developed corpora and spell checkers for many (20+) languages using a [http://borel.slu.edu/crubadan/ web crawler], statistical methods, and contributions from native speakers. I am primarily interested in the Celtic languages, and particularly Irish. I've created a [http://borel.slu.edu/gramadoir/ grammar checker], monolingual and parallel corpora, and a [http://borel.slu.edu/lsg/ semantic network] for Irish, and I do a lot of the localization of open source software (Firefox, OpenOffice.org, KDE, ...). Currently I am working on a statistical MT engine that I hope might be useful for Apertium down the road, if I ever manage to finish it!
Links: