Difference between revisions of "Evaluating with Wikipedia"

From Apertium
Line 1: Line 1:
One of the ways of improving your MT system, and at the same time improve and add content in Wikipedias is to use Wikipedias as a test bed. You can translate text from one Wikipedia to another, then either post-edit yourself, or wait for, or ask other people to post-edit the text.
+
One of the ways of improving your MT system, and at the same time improve and add content in Wikipedias is to use Wikipedias as a test bed. You can translate text from one Wikipedia to another, then either post-edit yourself, or wait for, or ask other people to post-edit the text. One of the nice things is that MediaWiki (the software Wikipedia is based on) allows you to view diffs between the versions (see the 'history' tab).
   
 
This strategy is beneficial both to Wikipedia and to Apertium. Wikipedia gets new articles in languages which might not otherwise have them, and Apertium gets information on how we can improve the software. It is important to note that Wikipedia is a community effort, and that rightly people can be concerned about machine translation. To get an idea of this, put yourself in the place of people having to fix a lot of "hit and run" SYSTRAN translations, with little time and not much patience.
 
Line 5: Line 5:
 
==Guidelines==
 
   
*Don´t just start translating texts and waiting for people to fix them. The first thing you should do, is create an account on the Wikipedia, and then find the "Community notice board". Ask there how regular contributors would feel about you using the Wikipedia for tests. The community notice board should be linked from the front page. It might be called something like "La tavèrna" in Occitan, or "Geselshoekie" in Afrikaans. When you are asking them, be sure to mention:
+
*Don't just start translating texts and waiting for people to fix them. The first thing you should do, is create an account on the Wikipedia, and then find the "Community notice board". Ask there how regular contributors would feel about you using the Wikipedia for tests. The community notice board should be linked from the front page. It might be called something like "La tavèrna" in Occitan, or "Geselshoekie" in Afrikaans. When you are asking them, be sure to mention:
** This is ´free software / open source´ machine translation.
+
** This is free software / open source machine translation.
 
** You would like to help the community and are doing these translations both to help their Wikipedia expand the range of articles, and to improve the translation software.
 
 
** The translations will be added only with the consent of the community, you do not intend to flood them with poorly translated articles.
 
 
** The translations will be added by a '''human''' not by a bot.
 
 
** Ask them if there are any subjects that they prefer you would cover, perhaps they have a page of "requested translations".
 
** One way of presenting it might be as a non-native speaker of the language trying to learn the language. Point out that the initial translation will be done by machine, then you will try and fix the translation, but anything that you don´t fix you would be grateful for other people to fix.
+
** One way of looking at it might be as a non-native speaker of the language trying to learn the language. Point out that the initial translation will be done by machine, then you will try and fix the translation, but anything that you don't fix you would be grateful for other people to fix.
   
 
An example of the kind of conversation you might have is found [http://af.wikipedia.org/wiki/Wikipedia:Geselshoekie/MT here].
 

Revision as of 18:18, 24 November 2007

One way to improve your MT system, and at the same time to improve and add content to Wikipedias, is to use Wikipedias as a test bed. You can translate text from one Wikipedia to another, then either post-edit it yourself, or wait for, or ask, other people to post-edit it. One of the nice things is that MediaWiki (the software Wikipedia is based on) lets you view diffs between versions (see the 'history' tab).
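
If you keep a local copy of the raw machine translation and of the post-edited text, you can also list the changes offline. The following is only a minimal sketch using Python's standard difflib module; the function name post_edit_changes and the example sentences are made up for illustration and are not part of Apertium or MediaWiki.

```python
import difflib


def post_edit_changes(mt_output, post_edited):
    """Return (operation, MT words, post-edited words) for each word-level change."""
    src = mt_output.split()
    tgt = post_edited.split()
    # Compare word lists rather than characters, so each change is a whole word or phrase.
    matcher = difflib.SequenceMatcher(a=src, b=tgt, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, " ".join(src[i1:i2]), " ".join(tgt[j1:j2])))
    return changes


if __name__ == "__main__":
    # Hypothetical MT output and its post-edited version.
    for change in post_edit_changes("he have a house", "he has a house"):
        print(change)  # prints ('replace', 'have', 'has')
```

Each tuple points at a word or phrase that a post-editor had to touch, which is a reasonable starting point when looking for dictionary entries or rules to fix.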

This strategy benefits both Wikipedia and Apertium. Wikipedia gets new articles in languages which might not otherwise have them, and Apertium gets information on how we can improve the software. It is important to note that Wikipedia is a community effort, and that people can rightly be concerned about machine translation. To get an idea of why, put yourself in the place of people who have to fix a lot of "hit and run" SYSTRAN translations with little time and not much patience.

Guidelines

  • Don't just start translating texts and waiting for people to fix them. The first thing you should do is create an account on the Wikipedia, and then find the "Community notice board". Ask there how regular contributors would feel about you using the Wikipedia for tests. The community notice board should be linked from the front page. It might be called something like "La tavèrna" in Occitan, or "Geselshoekie" in Afrikaans. When you ask them, be sure to mention:
    • This is free software / open source machine translation.
    • You would like to help the community and are doing these translations both to help their Wikipedia expand the range of articles, and to improve the translation software.
    • The translations will be added only with the consent of the community; you do not intend to flood them with poorly translated articles.
    • The translations will be added by a human, not by a bot.
    • Ask them if there are any subjects they would prefer you to cover; perhaps they have a page of "requested translations".
    • One way of looking at it might be as a non-native speaker of the language trying to learn the language. Point out that the initial translation will be done by machine, that you will then try to fix it, and that you would be grateful if other people fixed anything you miss.

An example of the kind of conversation you might have is found here: http://af.wikipedia.org/wiki/Wikipedia:Geselshoekie/MT

Current collaborations

If you'd like to know more about contributing to Wikipedia with Apertium, you can ask the people below: