Ideas for Google Summer of Code/Apertium African
== Apertium English--Hausa/Igbo/Swahili/Tigrinya/Yoruba ==
Revision as of 20:36, 4 February 2019
African languages are not particularly well served by Apertium. The languages listed are quite important, and are currently served only by commercial machine translation providers such as Google, which makes these language communities dependent on a specific commercial provider. The objective is to start these language pairs (which either have not been started or currently have very little data in Apertium) and to write a usable version that provides intelligible output.
- Install a GNU/Linux system. There is an Apertium virtual machine you can install using VirtualBox.
- If necessary, install Apertium, the Occitan language data, the French language data, and the Apertium Occitan-French package.
- Check out a language pair that may be similar, and build similar files for your English--(African language) system. See Apertium_New_Language_Pair_HOWTO.
- Add some minimal vocabulary and rules and check that they work. Ideally, select a few sentences that are translated with this vocabulary and show that they are translated correctly.
- If the language pair already has a repository under Apertium on GitHub, submit a pull request.
- If the language pair is not there, contact your mentor(s) so that they can start a repository for you to submit a pull request.
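The "select a few sentences and show that they are translated correctly" step can be made repeatable with a small script. The following is only a minimal sketch, not part of Apertium itself: the mode name `eng-hau`, the data directory `.`, and the helper names `apertium_translate`, `check_sentences` and `marked_tokens` are assumptions made for illustration. It pipes each sentence through the `apertium` command-line tool and collects the ones whose output differs from the expected translation. It also lists tokens that begin with `*`, `@` or `#`, the marks Apertium normally attaches to words it could not analyse, translate or generate (a literal `@` or `#` in the input would also be listed, so treat that as a hint only).

```python
import subprocess
from typing import Callable, Iterable, List, Tuple

# Prefixes Apertium usually puts on words it could not handle:
# "*" unknown to the analyser, "@" missing from the bilingual
# dictionary, "#" not generated by the target-language generator.
DEBUG_MARKS = ("*", "@", "#")


def apertium_translate(text: str, mode: str = "eng-hau", data_dir: str = ".") -> str:
    """Translate `text` by piping it through the `apertium` CLI.

    `mode` and `data_dir` are placeholders (assumptions): replace them
    with the mode name and directory of the pair you actually built.
    """
    result = subprocess.run(
        ["apertium", "-d", data_dir, mode],
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def marked_tokens(output: str) -> List[str]:
    """Return whitespace-separated tokens that start with a debug mark."""
    return [tok for tok in output.split() if tok.startswith(DEBUG_MARKS)]


def check_sentences(
    pairs: Iterable[Tuple[str, str]],
    translate: Callable[[str], str],
) -> List[Tuple[str, str, str]]:
    """Return (source, expected, actual) for every sentence that fails."""
    failures = []
    for source, expected in pairs:
        actual = translate(source)
        if actual != expected:
            failures.append((source, expected, actual))
    return failures
```

Calling `check_sentences(pairs, apertium_translate)` from the directory of the compiled pair, with `pairs` holding your own test sentences and their expected translations, returns an empty list when every sentence matches. Passing the translator as an argument keeps the comparison logic testable without an Apertium installation.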