Search results

Page title matches

  • ...sources on the language as you can find. It should also list any Apertium resources that already exist for the language. To create a page on the wiki, you will need to request an account from one of your mentors, providing them with an email address and your preferred usern
    1 KB (202 words) - 19:55, 12 April 2021

Page text matches

  • ==Specific resources per language== ...might be useful for expanding on work in the Incubator. Below you can put resources which will be useful in the construction. Try and mark them for licence, or
    10 KB (1,336 words) - 20:40, 11 December 2019
  • ...ces Document''' (LRD) is an XML document consisting of a set of linguistic resources (dictionaries, cross models, corpora, links to other LRDs, etc.). ...ing_a_Linguistic_Resources_Document|apertium-crossdics]] to indicate which resources (dictionaries and cross models) can be crossed.
    8 KB (902 words) - 09:19, 6 October 2014
  • ...at will improve your knowledge of Apertium and help you get into the world of open-source development. ...st further information. The time column gives the minimum estimated amount of time that should be spent on the task. '''It does not include time taken to
    187 KB (21,006 words) - 22:14, 12 November 2012
  • ...ain at least a default pattern-action. Here you have a very simple example of '''cross-model-en-es-gl.xml''' document. === Linguistic Resources Document ===
    6 KB (689 words) - 22:58, 25 October 2018
  • '''Crossdics''' (part of [[apertium-dixtools]]) is a program that can be used to "cross" language pa === Using a Linguistic Resources Document ===
    5 KB (633 words) - 13:29, 6 October 2017
  • This is a non-comprehensive list of publications involving Apertium, ordered by date. Please feel free to add y ...ased Shallow Machine Translation for WMT 2019 Shared Task]. In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers,
    33 KB (4,418 words) - 11:52, 29 December 2021
  • ...show how you can make an "indirect" contribution, by documenting language resources, helping us to build bilingual test sets, translating, promoting, etc. * How to catalog resources.
    9 KB (1,494 words) - 05:58, 18 March 2015
  • == Document Available Resources == ...various resources (even seemingly reputable ones!) or over-all low-quality resources.
    10 KB (1,615 words) - 07:43, 20 December 2015
  • ...ge is a translation from Spanish into English of most of the documentation of [[Matxin]] with some minor ...on can be found [http://matxin.svn.sourceforge.net/viewvc/matxin/trunk/doc/documentation-es.pdf here]. This page does not describe how to install or use Matxin, for
    58 KB (8,964 words) - 11:11, 14 May 2016
  • #*Find out if all nouns can take these (in theory), or only certain classes of nouns. ...o take an i- prefix or a ø- null prefix, this should be included into each of the class files. (E.g. strong_double_mid_radical.py)
    8 KB (1,159 words) - 04:41, 9 March 2018
  • | [[Linguistic Resources Document]] | [[List of symbols]]
    13 KB (1,601 words) - 23:31, 23 July 2021
  • {{see-also|Incubator|Specific resources per language}} [[Specific resources per language]] also contains lists of resources and possible places to get started making analysers, and even whole systems
    1 KB (164 words) - 05:20, 4 December 2019
  • ...at will improve your knowledge of Apertium and help you get into the world of open-source development. ...equest further information. All tasks are 2 hours maximum estimated amount of time that would be spent on the task by an experienced developer, however:
    14 KB (2,007 words) - 03:06, 27 October 2013
  • [[Documentation (français)|En français]] ...rtium.org/w/images/d/d0/Apertium2-documentation.pdf Apertium 2.0: Official documentation] (222 pages)'''
    5 KB (599 words) - 19:04, 17 February 2020
  • [[Resources|In English]] * [[Specific resources per language]] (in English, because on the one hand a translation would be of little use,
    1 KB (185 words) - 13:45, 7 October 2014
  • University: International Institute of Information Technology ...a lack of a good quality machine translation service. There are hardly any resources for most Indian languages and the work Apertium does manages to counter thi
    9 KB (1,391 words) - 16:41, 31 March 2020
  • ...at will improve your knowledge of Apertium and help you get into the world of open-source development. ...equest further information. All tasks are 2 hours maximum estimated amount of time that would be spent on the task by an experienced developer, however:
    68 KB (10,323 words) - 15:37, 25 October 2014
  • This page gives a brief overview to the kind of data and resources that can be useful in building a new language pair for Apertium, and how to * <code>apertium-en-af.af.dix.xml</code>: a list of Afrikaans words and their variants;
    13 KB (2,112 words) - 12:11, 26 May 2023
  • ...rphological analyser was the hardest task in the project and required most of the time. That being said, I am very pleased with the results we got. We used the very little grammar resources we had<ref>J. Aquilina (1994), Teach Yourself Maltese. [http://books.google
    7 KB (1,063 words) - 11:13, 26 August 2011
  • ==Resources== ...cts/aramorph/ AraMorph - Perl] - An Arabic morphological analyzer and part-of-speech tagger written in Perl (originally by Tim Buckwalter, see http://www
    3 KB (437 words) - 10:23, 21 November 2021
  • ...Swedish (<code>swe</code>). The languages are related with varying levels of mutual intelligibility. This group would make a nice group for Apertium sys {{see-also|List of dictionaries}}
    5 KB (608 words) - 11:55, 9 November 2022
  • ...pair in Apertium. It gets a bit technical – if you just want to notify us of some errors or pass along a word list, please see [[Contributing]]. If you ...ta for the part-of-speech tagger, which is in charge of the disambiguation of the source language text.
    50 KB (7,915 words) - 00:04, 10 March 2019
  • This page intends to give a step-by-step walk-through of how to create a new translator in the [[Matxin]] platform. ...ourceforge.net Matxin homepage]. This page will only focus on the creation of a new language pair, and will avoid theoretical and methodological issues.
    26 KB (4,167 words) - 13:05, 11 May 2016
  • ...tyle directory structure manually and use scripts to checkout the contents of individual parts. If you have not used git before, or are uncomfortable with using git, these resources can help you:
    8 KB (1,173 words) - 20:16, 10 March 2018
  • ...working on language maintenance and shift. I'm very interested in creating resources for minoritised languages. *Because there is a lot of good work done and being done in it.
    16 KB (2,285 words) - 06:46, 12 April 2019
  • # linguistic data for a growing number of languages and language pairs # Be appropriate: Demonstrate you have a knowledge of Apertium, how it works and the problem it has that you'd like to solve.
    10 KB (1,500 words) - 16:23, 18 February 2016
  • '''ReTraTos''' is a toolbox to build linguistic resources useful for machine translation (MT): bilingual dictionaries and transfer ru * An aligned [[corpus]] of the two languages. For any pair of european languages, the JRC-Acquis corpus is recommended.
    8 KB (1,273 words) - 09:32, 3 May 2024
  • ...languages are related with varying levels of mutual intelligibility. Many of these languages are included in Apertium already. ...languages. These can then be paired for X→Y translation with the addition of a CG for language X and transfer rules / dictionary for the pair X→Y. Bel
    18 KB (2,312 words) - 18:25, 18 September 2016
  • ...C proposals/Allow some code under github.com/apertium]]) met with a number of objections and eventually expired. This proposal attempts to address those :The opportunity of any of two changes could also be examined separately.
    22 KB (3,325 words) - 14:06, 12 March 2018
  • ...nslation in an Ajax application, in an IM client, in a Translation Service of a large IT service-oriented and geographically distributed infrastructure ( ...ce interface) to implement real-time translation (both in input and output) of instant messages.]]
    24 KB (3,572 words) - 07:37, 8 March 2018
  • ...ted in working on Apertium, but not know where to start. We have lots of [[documentation]], but sometimes what you really want to do is sit down and have a chat and If you decide to contact one of our mentors, the best way would be either on [[IRC]] (best), the [[Contact|
    9 KB (1,164 words) - 15:01, 1 April 2021
  • <li>The mission of the Apertium project is to collaboratively develop free/open-source machine <li>To favour the interchange and reuse of existing linguistic data.</li>
    9 KB (1,356 words) - 18:34, 3 March 2018
  • <li>The mission of the Apertium project is to collaboratively develop free/open-source machine <li>To favour the interchange and reuse of existing linguistic data.</li>
    8 KB (1,214 words) - 22:30, 3 August 2013
  • <li>The mission of the Apertium project is to collaboratively develop free/open-source machine <li>To favour the interchange and reuse of existing linguistic data.</li>
    8 KB (1,215 words) - 18:14, 3 March 2018
  • then ReTraTos/Giza++ on the KDE4 corpus of .po files, and some by correct (these were of course reported "upstream").
    12 KB (1,886 words) - 12:20, 20 June 2019
  • ...systems for marginalised languages. The general idea is to use the output of various MT systems to produce one "better" translation. The "baseline" woul * Make maximum usage of available resources for marginalised languages. Parallel corpora, user-feedback, other translat
    5 KB (802 words) - 07:04, 10 May 2012
  • Apertium is a piece of software, a program, for translating one language to another. The central program of Apertium is what is called an 'engine'. It only talks in computer code, or
    17 KB (2,835 words) - 16:16, 24 January 2017
  • ...ss that is shared by all three methods, and then continue to describe each of the individual methods separately. ...training for, and at least up until pretransfer in the opposite direction. Of course, you also need a parallel corpus for this method (see [[Running the
    14 KB (2,181 words) - 19:01, 17 August 2018
  • There are many ways to contribute to Apertium, from sending us lists of words or phrases you find that are incorrectly translated, to getting invol ...ation rule-based machine translation]'' toolchain and ecosystem, with many of our tools based on [https://en.wikipedia.org/wiki/Finite-state_transducer f
    7 KB (1,139 words) - 06:27, 27 May 2021
  • There are many ways to contribute to Apertium, from sending us lists of words or phrases you find that are incorrectly translated, to getting invol ...rge.net/lists/listinfo/apertium-stuff apertium-stuff], which is where most of the discussion goes on. Also, come and idle on the [[IRC|IRC channel]] <cod
    3 KB (549 words) - 09:17, 26 May 2021
  • ...u can use the Makefile scripts provided in the [[#Makefiles|last section]] of this page. Take your corpus and make a tagged version of it:
    12 KB (1,634 words) - 18:26, 26 September 2016
  • ...tp://www.dlsi.ua.es/~mlf/docum/caseli08p.pdf From free shallow monolingual resources to machine translation systems: easing the task]", in ''Mixing Approaches T ...raças V. Nunes, Mikel L. Forcada. (2008) "Automatic induction of bilingual resources from aligned parallel corpora: application to shallow-transfer machine tran
    8 KB (1,301 words) - 09:43, 6 October 2014
  • ...ting broken commits, and excellent merge/rebase capabilities. The built-in documentation is very good, and (unlike svn) the command line git command comes in glorio ...urrent method of allowing people to commit directly, but retain the option of using pull requests for those who don't plan to contribute regularly. Sourc
    7 KB (1,057 words) - 03:39, 23 February 2015
  • ...Apertium machine translation system from scratch. You can check the [[list of language pairs]] that have already been started. ...ssume any knowledge of linguistics, or machine translation above the level of being able to distinguish nouns from verbs (and prepositions etc.) You shou
    19 KB (3,164 words) - 20:58, 2 April 2021
  • ...e "..". Autotools is looking for an auxiliary file in the parent directory of this one. This will happen if you check out one project folder inside anoth First find the location of this file (it should be in <code>$(PREFIX)/lib/pkgconfig</code>) and then r
    20 KB (3,153 words) - 08:13, 24 May 2019
  • ...n total, it is worth training the tagger. To do this, you'll need a couple of things, a decent sized corpus, either tagged or untagged, and a <code>.tsx< ...de>tagger.dtd</code>, although it is probably easier to take a look at one of the pre-written ones in other language pairs.
    7 KB (1,058 words) - 07:37, 4 July 2016
  • ...rs that are likely to end sentences in single quotation marks as arguments of the <code>sent_end_chars()</code> function above (second line). Depending o * The documentation for the two aforementioned booleans can be found [http://nltk.org/_modules/
    14 KB (2,232 words) - 12:51, 26 September 2018
  • ...is a HTML5/[https://d3js.org/ D3.js] tool that depicts all Apertium [[list of language pairs|language pairs]] in an interactive graph initially developed ...rthwash/Apertium-Global-PairViewer Global-Pairviewer] is an implementation of Pairviewer with a three-dimensional globe. See [https://wikis.swarthmore.ed
    5 KB (702 words) - 01:34, 9 December 2018
  • ...n for Machine Translation in the Americas: Workshop on Technologies for MT of Low Resource Languages (LoResMT @ AMTA 2018), Boston, MA. ...shington's presentation to educators in Kyrgyzstan at the Open Educational Resources summer camp (=retreat) in Kyrgyzstan: [http://open.edu.kg/KY/oer-summer-cam
    4 KB (435 words) - 16:26, 1 July 2018
  • :* Create an alternative form to edit dix files with GUI resources. ...lop, initially, monolingual dictionaries but keeping the particular format of each file.
    29 KB (4,382 words) - 07:53, 6 October 2019
  • This page aims to give an overview of the ''quality'' of various translators available in the Apertium platform. ...tory – see [[Evaluation]] for some discussion of strengths and limitations of WER/BLEU.
    9 KB (1,233 words) - 09:10, 21 November 2021
  • ...all programs in the pipeline for every single request would lead to a lot of lag. ...ion, see [[Apertium services]]. This page gives details on how to use some of the same techniques yourself.
    13 KB (2,039 words) - 11:56, 3 June 2022
  • * Google Summer of Code (2009, first edition, v0.6.0) Number of words in reference: 3750
    23 KB (3,704 words) - 11:56, 16 December 2020
  • ...temporary files to emulate it: <code>ls|more</code> becomes the equivalent of <code>ls>tmp;more<tmp</code></ref>. A solution can be using <code>bas * Specify the URL of the repository: http://apertium.svn.sourceforge.net/svnroot/apertium and cl
    12 KB (1,883 words) - 22:06, 7 March 2018
  • ...yser/generator ([[Romanian#Apertium-ron|apertium-ron]]) and as a component of several pairs which translate to/from Romanian. * Niculescu, Alexandru. ''Outline history of the Romanian language''. Editura Științifică și Enciclopedică, 1981.
    7 KB (889 words) - 09:53, 28 November 2018
  • We all do similar things when making dictionaries. Make a load of scripts that we hack to do a specific job, then throw them away at the end. ;When we're making a new dictionary, what resources do we have and use?
    3 KB (565 words) - 11:46, 24 March 2012
  • ...languages. GCI gives us a chance to get in touch with the next generation of speakers, and to show them how they can help their languages develop and gi ...t GCIs and some of whom are past GSOC students. We've tried to get a range of mentors in different time zones, so our time zone spreads from UTC-6 to UTC
    3 KB (516 words) - 21:03, 29 October 2016
  • ...languages. GCI gives us a chance to get in touch with the next generation of speakers, and to show them how they can help their languages develop and gi ...t GCIs and some of whom are past GSOC students. We've tried to get a range of mentors in different time zones, so our time zone spreads from UTC-6 to UTC
    3 KB (443 words) - 11:20, 11 September 2018
  • ...languages. GCI gives us a chance to get in touch with the next generation of speakers, and to show them how they can help their languages develop and gi ...t GCIs and some of whom are past GSOC students. We've tried to get a range of mentors in different time zones, so our time zone spreads from UTC-6 to UTC
    2 KB (421 words) - 15:37, 10 October 2017
  • '''hunmorph''' is a set of programs for making morphological analysers and generators written largely ...ocamorph binary in <code>wrappers/ocamorph</code>. Now go back to the root of your CVS tree.
    4 KB (672 words) - 15:12, 8 July 2012
  • ...), producing a text corpus, which is useful for training unsupervised part-of-speech taggers, n-gram language models, etc. It was modified by a number of people, including by BenStobaugh during Google Code-In 2013, and can be clo
    2 KB (360 words) - 18:55, 30 January 2023
  • ::The English → Urdu translation system linked [[Specific resources per language#Urdu|here]] seems to use LFG and Earley-based parsing. * In case there is more than one parse of a sentence, there should be a way to select the most likely.
    2 KB (236 words) - 21:21, 2 October 2013
  • Lists of corpora under free licences (public domain, CC-BY-SA, GPL, etc.). ...sh; http://www.open-tran.eu &mdash; single point of access to translations of open-source software in many languages (downloadable as SQLite databases)
    5 KB (746 words) - 20:36, 25 January 2020
  • ...rallel corpora can be a difficult process and for some language pairs such resources might not exist. However, we can use a language model for the target langua The full documentation can be viewed [http://sourceforge.net/apps/mediawiki/irstlm/index.php?title
    3 KB (364 words) - 23:25, 23 August 2012
  • | <s>Find resources for improving the bilingual dictionary. Work on expanding the bil-Dictionar * ''Checkpoint: Measure progress of the project, and discuss the feasibility of working on Rus -> Ces. Final check on previously composed transfer rules fr
    4 KB (521 words) - 05:48, 5 June 2017
  • ...gawiki.org/downloads/omegawiki-lexical.sql.gz download] the latest version of lexical data from the OmegaWiki database (see also [http://www.omegawiki.or [[Category:Resources]]
    2 KB (202 words) - 00:55, 24 January 2018
  • ...ribute this file". Furthermore, the dictionary appears to include elements of copyrighted dictionaries under free licences (such as Wiktionary), as well [[Category:Resources]]
    599 bytes (87 words) - 16:41, 26 September 2016
  • * [[Specific resources per language]] * [[List of language pairs]]
    939 bytes (138 words) - 03:55, 9 March 2018
  • [[Using linguistic resources|In English]] ...tilisation dans Apertium, et il faudra un plus ou moins grand ''*** amount of revision *** (à traduire)'' d'abord. Vous n'avez pas besoin de parler trè
    15 KB (2,534 words) - 12:10, 6 October 2014
  • There are also dumps of the articles translated with the Content Translation tool, which uses Apert [[Category:Resources]]
    3 KB (436 words) - 05:40, 10 April 2019