Difference between revisions of "Related software"
[http://www.tinylex.com/ www.tinylex.com]
Revision as of 15:47, 5 February 2009
apertium-tagger-training-tools
Target-language driven part-of-speech tagger trainer
apertium-tagger-training-tools is an open-source package for training, in an unsupervised way, the hidden-Markov-model-based part-of-speech taggers used in machine translation. During training, information from the target language is used in addition to information from the source language; to this end, the package relies on the Apertium MT toolbox. After training, a file containing the hidden-Markov-model parameters is produced; this file can be used directly within the Apertium MT toolbox. This package may simplify the initial building of an Apertium-based machine translation system for a new pair of languages.
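To illustrate what hidden-Markov-model parameters are used for at tagging time, here is a minimal Viterbi decoding sketch. It is not the package's training procedure, and it does not read the parameter file the package produces. The tag set, the probabilities, and the function name `viterbi` are all made-up assumptions for illustration, and the observations are plain words rather than whatever observation units the Apertium tagger actually uses.

```python
def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most probable tag sequence for `words` under a first-order HMM."""
    # best[k][tag] = (probability of the best path ending in `tag` at position k,
    #                 previous tag on that path)
    best = [{t: (start_p[t] * emit_p[t].get(words[0], 0.0), None) for t in tags}]
    for word in words[1:]:
        column = {}
        for t in tags:
            # Pick the predecessor tag that maximises the path probability.
            column[t] = max(
                (best[-1][p][0] * trans_p[p][t] * emit_p[t].get(word, 0.0), p)
                for p in tags
            )
        best.append(column)
    # Backtrack from the most probable final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    return list(reversed(path))


# Toy parameters (hypothetical values, chosen only so that "book" is ambiguous).
TAGS = ["DET", "NOUN", "VERB"]
START = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1},
}
EMIT = {
    "DET": {"the": 1.0},
    "NOUN": {"book": 0.6, "flight": 0.4},
    "VERB": {"book": 0.7, "flies": 0.3},
}

if __name__ == "__main__":
    print(viterbi(["the", "book"], TAGS, START, TRANS, EMIT))
    print(viterbi(["book", "the", "flight"], TAGS, START, TRANS, EMIT))
```

With these toy values the word "book" is tagged as a noun after a determiner and as a verb before one, which is the kind of disambiguation a part-of-speech tagger performs before translation.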
Acknowledgements: This software has been funded by the Spanish Ministry of Science and Technology through project TIC2003-08681-C02-01, and by the Spanish Ministry of Education and Science and the European Social Fund through research grant BES-2004-4711.
apertium-transfer-tools
Automatic shallow-transfer rules generation from parallel corpora
apertium-transfer-tools is an open-source package consisting of a set of tools for automatically generating Apertium (level 1) transfer rules from parallel corpora. Transfer rules are generated from alignment templates, like those used in statistical machine translation, that have been extracted from parallel corpora and extended with a set of restrictions controlling their application.
The generated transfer rules (in XML format) can be used directly by the Apertium MT platform. Although this package is aimed at generating Apertium transfer rules, it can be adapted to generate shallow-transfer rules for other MT platforms; moreover, some of the tools it provides can be used for other purposes, such as extracting bilingual phrase pairs or symmetrizing previously computed alignments. The package depends on the GIZA++ toolkit to compute word alignments, but it can easily be adapted to use other aligners.
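As a rough illustration of what symmetrizing alignments means, the sketch below combines a source-to-target alignment and a target-to-source alignment by intersection and by union, two standard options. The source does not say which method apertium-transfer-tools applies, so this is an assumption for illustration only; the function name `symmetrize` and the representation of alignments as sets of index pairs are likewise hypothetical.

```python
def symmetrize(src_to_tgt, tgt_to_src):
    """Combine two directional word alignments.

    src_to_tgt: set of (source_index, target_index) pairs.
    tgt_to_src: set of (target_index, source_index) pairs.
    Returns (intersection, union), both as sets of
    (source_index, target_index) pairs.
    """
    # Flip the second alignment so both use (source, target) order.
    flipped = {(s, t) for (t, s) in tgt_to_src}
    return src_to_tgt & flipped, src_to_tgt | flipped


if __name__ == "__main__":
    s2t = {(0, 0), (1, 2), (2, 1)}
    t2s = {(0, 0), (2, 1), (3, 2)}
    intersection, union = symmetrize(s2t, t2s)
    print(sorted(intersection))
    print(sorted(union))
```

The intersection keeps only links both directions agree on (high precision), while the union keeps every link proposed by either direction (high recall).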
Acknowledgements: Funded by the Spanish Ministry of Science and Technology through project TIC2003-08681-C02-01, and by the Spanish Ministry of Education and Science and the European Social Fund through research grant BES-2004-4711.
apertium-eval-translator
Evaluation of Apertium-based MT systems
apertium-eval-translator is a very simple script written in Perl. It calculates, at document level, the word error rate (WER) and the position-independent word error rate (PER) between a translation produced by an Apertium-based MT system and its human-corrected version. Although it has been designed to evaluate Apertium-based systems, it can easily be adapted to evaluate other MT systems.
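The two metrics can be sketched as follows. This is a minimal Python illustration of the usual definitions, not the Perl script itself, whose tokenization and exact formulas may differ. WER is taken here as the word-level edit distance divided by the reference length, and PER as the same idea with word order ignored; the function names are assumptions made for this example.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # word missing from hyp
                            curr[j - 1] + 1,       # extra word in hyp
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Word error rate: edit distance normalised by reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def per(ref, hyp):
    """Position-independent error rate: like WER, but ignoring word order."""
    if not ref:
        raise ValueError("reference must not be empty")
    # Number of words shared by both sides, counted with multiplicity.
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


if __name__ == "__main__":
    reference = "the cat sat on the mat".split()    # human-corrected text
    hypothesis = "the cat on the mat sat".split()   # MT output
    print(wer(reference, hypothesis), per(reference, hypothesis))
```

In the usage example the hypothesis contains exactly the reference words in a different order, so PER is zero while WER is not; the gap between the two gives a rough indication of how much of the error is due to word order.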
Acknowledgements: Funded by the Spanish Ministry of Science and Technology through project TIC2003-08681-C02-01, and by the Spanish Ministry of Education and Science and the European Social Fund through research grant BES-2004-4711.
apertium-tinyLex
Dictionary look-up for mobile devices
apertium-tinylex is an open-source J2ME (Java 2 Micro Edition) program that can be used to translate words. The program is currently available for mobile devices supporting MIDP 2.0, and a beta version (0.2) already exists for almost all language pairs in Apertium. Further features are planned for future releases.