Revision as of 19:28, 15 March 2008
ReTraTos is a toolbox to build linguistic resources useful for machine translation (MT): bilingual dictionaries and transfer rules. The induction systems and open linguistic data can be used with the Apertium toolbox to build open-source MT systems.
Input format
The input sentences need to be given in two separate files, for example en.txt for English and pt.txt for Portuguese.
- pt.txt
<s snum=1>Os/O<det><def><m><pl>:1 alunos/aluno<n><m><pl>:2 do/de<pr>+o<det><def><m><sg>:3_4 colégio/colégio<n><m><sg>:5 ...
- en.txt
<s snum=1>The/The<det><def><sp>:1 students/student<n><pl>:2 of/of<pr>:3 the/the<det><def><sp>:3 school/school<n><pl>:5 ...
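The example lines above can be read with a few lines of Python. This is only a minimal sketch inferred from the two samples, not part of ReTraTos: it assumes that tokens are separated by whitespace, that each token has the shape surface/analysis:alignment, and that several alignment positions are joined with an underscore (as in 3_4). The names Token and parse_sentence are hypothetical.

```python
import re
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    surface: str          # word as it appears in the text, e.g. "alunos"
    analysis: str         # lemma plus tags, e.g. "aluno<n><m><pl>"
    alignment: List[int]  # aligned positions (assumed to index the other sentence)


# Sentence prefix as seen in the samples: <s snum=1>
SENTENCE_RE = re.compile(r"^<s snum=(\d+)>(.*)$")


def parse_sentence(line: str) -> Tuple[int, List[Token]]:
    """Split one example line into its sentence number and tokens."""
    match = SENTENCE_RE.match(line.strip())
    if match is None:
        raise ValueError("line does not start with <s snum=N>")
    snum = int(match.group(1))
    tokens = []
    for item in match.group(2).split():
        # Skip anything that is not a surface/analysis:alignment triple.
        if "/" not in item or ":" not in item:
            continue
        body, _, align = item.rpartition(":")
        surface, _, analysis = body.partition("/")
        positions = [int(p) for p in align.split("_") if p.isdigit()]
        tokens.append(Token(surface, analysis, positions))
    return snum, tokens
```

Parsing both files this way makes it easy to check, before running the tools, that every sentence number in pt.txt also occurs in en.txt.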
Usage
ReTraTos_lex
You will need the header and footer of a bilingual dictionary in two separate files, for example dic_header.txt and dic_footer.txt (see the examples in the package). The example sentences, in the format described above, go in the files en.txt and pt.txt.
$ ReTraTos_lex -s pt.txt -t en.txt -b dic_header.txt -e dic_footer.txt
PRE-PROCESSAMENTO
Reading the examples ... 100 examples read
Reading the examples ... 100 examples read
GERANDO LEXICO
Generating source-target dictionary ... OK
Generating target-source dictionary ... OK
Processing bilingual dictionary ... OK
Generalizing bilingual dictionary ... OK
Cleaning equal attributes ... OK
IMPRIMINDO LEXICO
Printing bilingual dictionary ... OK
The output file will be ReTraTos_lex_ptXen_1.dix.
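When the dictionary has to be rebuilt often, the call can be wrapped in a small script. The following is a hedged sketch rather than anything shipped with ReTraTos: it uses only the four options that appear in the command above, and the reading of -s and -t as source and target examples and of -b and -e as dictionary header and footer is inferred from that command. The function names build_lex_command and run_lex are hypothetical.

```python
import shutil
import subprocess
from typing import List


def build_lex_command(source: str, target: str, header: str, footer: str,
                      binary: str = "ReTraTos_lex") -> List[str]:
    """Assemble the argument list shown in the usage example.

    Only the -s, -t, -b and -e options from that example are used;
    their meaning is inferred from the file names passed to them.
    """
    return [binary, "-s", source, "-t", target, "-b", header, "-e", footer]


def run_lex(source: str, target: str, header: str, footer: str) -> str:
    """Run ReTraTos_lex and return what it printed on standard output."""
    cmd = build_lex_command(source, target, header, footer)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(cmd[0] + " was not found on PATH")
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

Calling run_lex("pt.txt", "en.txt", "dic_header.txt", "dic_footer.txt") corresponds to the command line shown above. Keeping the argument list in its own function lets the command be inspected or logged without running the tool.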
See also
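External links
- ReTraTos: Homepage (http://www.nilc.icmc.usp.br/nilc/projects/retratos.htm)
- ReTraTos: SVN (http://retratos.svn.sourceforge.net/viewvc/retratos/)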
Further reading
- Helena M. Caseli, Maria das Graças V. Nunes, Mikel L. Forcada. (2008) "From free shallow monolingual resources to machine translation systems: easing the task", in Mixing Approaches To Machine Translation, MATMT2008, proceedings (Donostia, Spain, Feb. 14, 2008), pp. 41-48.
- Helena M. Caseli, Maria das Graças V. Nunes, Mikel L. Forcada. (2008) "Automatic induction of bilingual resources from aligned parallel corpora: application to shallow-transfer machine translation". Machine Translation (to appear)