Difference between revisions of "Evaluating with Wikipedia"

From Apertium


An example of the kind of conversation you might have is found [http://af.wikipedia.org/wiki/Wikipedia:Geselshoekie/MT here].

==Current collaborations==

* [[User:Francis Tyers]] is working with the [http://af.wikipedia.org Afrikaans Wikipedia]
* [[User:Carmentano]] is working with the [http://oc.wikipedia.org Occitan Wikipedia]

Revision as of 11:35, 23 November 2007

One way to improve your MT system, while also improving and adding content to Wikipedias, is to use them as a test bed. You can translate text from one Wikipedia to another, then either post-edit it yourself or ask other people to post-edit it (or wait for them to do so).

This strategy benefits both Wikipedia and Apertium. Wikipedia gets new articles in languages which might not otherwise have them, and Apertium gets information on how the software can be improved. It is important to remember that Wikipedia is a community effort, and that people can be rightly concerned about machine translation. To get an idea of why, put yourself in the place of people who have to fix a lot of "hit and run" SYSTRAN or Babelfish translations, with little time and not much patience.

Guidelines

  • Don't just start translating texts and waiting for people to fix them. The first thing you should do is create an account on the Wikipedia, and then find the "Community notice board". Ask there how regular contributors would feel about you using the Wikipedia for tests. The community notice board should be linked from the front page. It might be called something like "La tavèrna" in Occitan, or "Geselshoekie" in Afrikaans. When you ask, be sure to mention:
    • This is 'free software / open source' machine translation.
    • You would like to help the community and are doing these translations both to help their Wikipedia expand the range of articles, and to improve the translation software.
    • The translations will be added only with the consent of the community; you do not intend to flood them with poorly translated articles.
    • The translations will be added by a human, not by a bot.
    • Ask them if there are any subjects that they would prefer you to cover; perhaps they have a page of "requested translations".
    • One way of presenting it might be as a non-native speaker trying to learn the language. Point out that the initial translation will be done by machine and that you will then try to fix it, but that you would be grateful if other people fixed anything you miss.

An example of the kind of conversation you might have is found here.

Current collaborations