Search results

  • ====When running configure script for language pair data==== ====Workaround when language pairs need updated configure.ac's====
    20 KB (3,153 words) - 08:13, 24 May 2019
  • DATA=/home/philip/Apertium/gsoc2013/monolingual/data ...atterns-frac-maxent.py $DATA/setimes.sh-mk.freq $DATA/setimes.sh-mk.ambig $DATA/setimes.sh-mk.annotated > events 2>ngrams
    3 KB (520 words) - 21:25, 14 February 2014
  • ...to be translated. For example, HTML tags must not be translated into another language, but only the text of the Web page. ...e same software is used for every language pair. It is the format of the data to be translated that determines which deformatter is used.
    58 KB (8,365 words) - 20:16, 26 June 2018
  • Owing to the different syntactic structure of the phrases in each language, some Although the details of the modules and the linguistic data is presented in
    58 KB (8,964 words) - 11:11, 14 May 2016
  • ...Iberian peninsula, but is now being used to translate between more distant language pairs. ...ngineering ([http://www.prompsit.com http://www.prompsit.com]). Linguistic data are being developed by Transducens, the Seminario
    26 KB (3,122 words) - 06:25, 27 May 2021
  • ...ngsnes (ed.) Bauta: Janne Bondi Johannessen in memoriam, Oslo Studies in Language 11(2), 2020. 489–501. (ISSN 1890-9639 / ISBN 978-82-91398-12-9) ...system/files/swj1419.pdf The apertium bilingual dictionaries on the web of data]. Semantic Web, 9(2), 231-240.
    33 KB (4,418 words) - 11:52, 29 December 2021
  • ...tion of each module with more precision. They may also introduce technical language which linguists and/or computer coders would use. The technical description References to 'xxx' and 'yyy' refer to a language code, for example 'en-es'; 'English' to 'Spanish'.
    29 KB (4,687 words) - 16:28, 5 June 2020
  • ...of any language in Russia in areas smaller than the Federal Subjects. The data is in Russian and comes from the official 2010 Russian Census website. Here are the steps to access the data:
    2 KB (296 words) - 21:12, 13 January 2018
  • ...//d3js.org/ D3.js] tool that depicts all Apertium [[list of language pairs|language pairs]] in an interactive graph initially developed sometime before the [[G === Updating language data by scraping ===
    5 KB (702 words) - 01:34, 9 December 2018
  • === Language pairs === .../github.com/apertium/apertium-urd-hin?files=1 apertium-urd-hin] Linguistic data for the Apertium Urdu-Hindi machine translator
    6 KB (806 words) - 00:45, 7 December 2018
  • '''Apertium New Language Pair HOWTO''' ...rtium machine translation system from scratch. You can check the [[list of language pairs]] that have already been started.
    36 KB (5,933 words) - 16:14, 22 February 2021
  • The number of language pairs in development for Apertium is increasing, and so is the complexity o language pairs. With better tools, more people will be able to develop language pairs.
    29 KB (4,382 words) - 07:53, 6 October 2019
  • ...he implementation of the algorithms must be free/open-source, but also the data themselves. Nowadays, there are many machine translation packages of this t ...morphologically rich languages, which even with large corpora suffer from data sparseness.
    6 KB (905 words) - 17:26, 18 October 2010
  • ...-supervised.make this one] from en-eo. You will need modify it to fit your language pair. This usually means editing the first few lines. ===Tagger data directory===
    3 KB (537 words) - 13:44, 18 June 2014
  • |Language You will need to install NLTK and NLTK data. Unfortunately, they both only support Python versions 2.6-2.7. If you are
    14 KB (2,232 words) - 12:51, 26 September 2018
  • ...uide on how to use a development version of Apertium to make a change in a language pair. ...ou should try this to make sure things work before you move on to whatever language pair you plan on working on.
    10 KB (1,626 words) - 17:46, 13 January 2020
  • ...http://wiki.apertium.org/wiki/Mandarin_Chinese#In_Apertium some linguistic data in Apertium]. ...fers to the most commonly spoken form of Chinese that is the sole official language of China and Taiwan. It is also known as Putonghua or Standard Chinese ([[W
    16 KB (2,148 words) - 03:28, 16 December 2015
  • ...mpire, as did all Romance languages. There are currently 4 released French language pairs ...the sixth most spoken language in the world and is the second most studied language worldwide.
    15 KB (2,081 words) - 07:14, 12 August 2020
  • ...MT based on corpora: adding new languages is very easy. To create a new language pair, in fact, it is not necessary to include corpora with millions of word ...airs can be added by creating dictionaries and rules containing linguistic data in XML format.
    15 KB (2,339 words) - 00:41, 4 June 2018
  • ...ind that are incorrectly translated, to getting involved in creating a new language pair or programming on tools or user interfaces. Here are some question fre Our language agnostic tools are native and written in [https://en.wikipedia.org/wiki/C++
    7 KB (1,139 words) - 06:27, 27 May 2021
  • ...are basically for Anel, Aizhan and Assem who have started to develop this language pair... And Aida too... === Download apertium, lttoolbox and eng-kaz data from SVN ===
    20 KB (2,856 words) - 06:26, 27 May 2021
  • ...ll these language pairs. This means that the data can be re-used by other language projects (e.g. in developing spelling or grammar checkers, thesauri, etc). This project was accepted as part of our "adopt a language pair" idea
    12 KB (1,917 words) - 15:54, 12 September 2009
  • *'''langpair''': language pair to use for translation curl -G --data "langpair=eng|spa&q=run" http://localhost:2737/dictionaryLookup
    5 KB (712 words) - 21:27, 16 August 2016
  • ...appear at the beginning of a sentence. The unique thing about the Persian language, though, is that it uses prepositions, which is quite uncommon in many SOV l ...designed a Two-sided morphology analyst of nouns and adjectives in Persian language, using Xerox Finite State Technology as giving input word (adjective or nou
    16 KB (2,597 words) - 20:58, 12 January 2013
  • * Apertium language pairs .../engine of Apertium installed (including the requirement lttoolbox, but no language pairs yet).
    9 KB (1,367 words) - 09:17, 26 May 2021
  • ...of the main five data files in any language pair (see also: [[Apertium New Language Pair HOWTO]]). ....dix'' where ''apertium-A-B'' is the name of the [[List of language pairs| language pair]]. For example file ''apertium-af-nl.af-nl.dix'' is the bilingual dict
    7 KB (1,244 words) - 16:41, 17 March 2018
  • ...getting new contributors to Apertium and to helping spread our passion for language technology. ...of other things, live in our '''[[subversion|svn repo]]'''. The language data is found in the following places:
    7 KB (1,091 words) - 19:54, 12 April 2021
  • ...olving the antecedent of the anaphors in text becomes essential in several language pairs. ...ge it to the correct anaphor''' using a macro in the transfer rules of the language pair. (t1x)
    20 KB (3,107 words) - 21:13, 24 June 2022
  • ...nders, especially for Indian Languages because we still do not have enough data ...oreign languages. I am especially interested in MT systems where the source language is English and the target languages are Indian Languages. It is impossible
    6 KB (923 words) - 17:57, 3 April 2010
  • First, make a directory called <code><lang>-tagger-data</code>. Put your corpus into there with a name like <code><lang>.crp.txt</c ...cifies how to generate the probability file. You can grab one from another language package. For <code>apertium-en-af</code> I took the Makefile from <code>ape
    7 KB (1,177 words) - 08:34, 8 October 2014
  • '''Track:''' Data Science Dynamic Language Interpreter implementation
    8 KB (1,094 words) - 13:10, 14 April 2019
  • ==Install language module== A language module supporting spelling may be installed, either from our repository, or
    3 KB (387 words) - 12:21, 26 September 2016
  • ...ion of machine translation. The tasks consist of sentences in the original language, reference translation with keywords omitted and the machine translation of ...various { gap } in order to discover phenomena and patterns in the natural language.
    9 KB (1,368 words) - 09:04, 23 April 2015
  • ...duce translations which are less fluent, but more preserving of the source language meaning. ...er and number between a determiner and head noun will remain in the target language output.
    12 KB (1,464 words) - 12:00, 31 January 2012
  • ...duce translations which are less fluent, but more preserving of the source language meaning. ...er and number between a determiner and head noun will remain in the target language output.
    11 KB (1,519 words) - 06:51, 11 May 2013
  • ...duce translations which are less fluent, but more preserving of the source language meaning. ...er and number between a determiner and head noun will remain in the target language output.
    11 KB (1,519 words) - 18:27, 16 October 2015
  • More convincing if you have a language pair on the computer somewhere :) ...this should work for both packaged and compiled Apertium. Without language data you can't see a translation, but you can see the help. Try,
    2 KB (368 words) - 06:02, 24 April 2017
  • ...probably try this to make sure things work before you move on to whatever language pair you plan on working on. Note that some existing language pairs have external dependencies, like HFST or Constraint Grammar. The [[In
    10 KB (1,715 words) - 12:29, 28 May 2018
  • ...tended to show how you can make an "indirect" contribution, by documenting language resources, helping us to build bilingual test sets, translating, promoting, ...first language, and translate them to the other. A translation in a third language may be useful in enlisting help, but is not required.
    9 KB (1,494 words) - 05:58, 18 March 2015
  • ...ed translation, morphological analysis, natural language processing, human language technologies ...Spanish–Catalan) but which has been expanded to deal with more divergent language pairs (such as English-Catalan and even Basque→English). The platform pro
    10 KB (1,500 words) - 16:23, 18 February 2016
  • ...probably just search for, tick off and install Apertium and your favorite language pairs in Synaptic. There's a friendly [https://help.ubuntu.com/community/Sy Step 2: '''Download apertium, lttoolbox and language pairs from SVN.'''
    3 KB (475 words) - 16:28, 27 April 2017
  • '''apertium-get''' is a little script to fetch and compile language data, with monolingual dependencies, from Github. ...d and compiled by just going to the directory where you want your language data to be, and running
    2 KB (317 words) - 20:45, 23 March 2019
  • ==== Data preparation ==== There were three attempts to extract postediting operations for each language pair: with threshold = 0.8 and -m, -M = (1, 3).
    7 KB (1,033 words) - 15:27, 15 August 2018
  • <li>- 4: preprocessing : dictionary data needs some changes to be used in a graph, this step prepares it for further ...recommends what languages will be the most efficient to enrich particular language pair</li>
    19 KB (2,541 words) - 15:44, 12 August 2018
  • ...d was exposed to different languages. This led to me being fascinated with language translation and I wanted to contribute to help in making communication easi I am going to work on “ Adopt an unreleased language pair: Hindi - Telugu”. I want to get the pair released in both the direct
    9 KB (1,391 words) - 16:41, 31 March 2020
  • == Language data packages == If you've installed tools with install-nightly.sh, you can install language data with
    4 KB (665 words) - 11:57, 18 November 2022
  • ...um project is a project which works on open-source machine translation and language technology. We try and focus our efforts on lesser-resourced and marginalis ...versitat d'Alacant] (Alacant, Spain) and [http://www.prompsit.com Prompsit Language Engineering].
    10 KB (1,543 words) - 19:50, 12 April 2021
  • ...f language pairs that may be used to infer new entries for existing or new language pairs using graphs. ...a graph and relevant information is stated about them. The cloud of linked data is intended to be navigated by software agents primarily. In the case of Ap
    3 KB (452 words) - 19:50, 24 March 2020
  • ...oject goal is to create a machine translation package for Sicilian-Spanish language pair on the base of Apertium’s machine translation system. This project i ...he Sicilian dictionary was the abundance of spelling forms in the Sicilian language. For instance, one Sicilian verb with the meaning 'to join' can have the fo
    9 KB (1,370 words) - 13:58, 23 August 2016
  • ...language particularly suitable for various reasons. First, because it is a language in process of standardization, so both the linguistic resources (written do ...he near future, it will be possible to operate in the translation of other language pairs as Sardinian-Catalan and Sardinian-Spanish.
    7 KB (1,110 words) - 11:34, 23 August 2016
  • ...declarative language. A good intro would be to look through [[Apertium New Language Pair HOWTO]], see also [[Contributing to an existing pair]]. If the pair ha #* If there is no translation, translate it into the languages of your language pair first.
    6 KB (1,024 words) - 15:22, 20 April 2021
  • ...rs independent free-software developers. There are currently 40 published language pairs within the project (including a number of "firsts" — for example Sp natural language processing, machine translation, grammar, python, c++, linguistics, languag
    7 KB (1,111 words) - 10:10, 15 November 2015
  • ==Install language module== * To install Kazakh language module, first get it
    4 KB (492 words) - 02:54, 10 March 2018
  • You can replace cy-en by different language pair. For the list of language pairs go [http://wiki.apertium.org/wiki/List_of_language_pairs#Trunk_.28rel === Install language-pair data ===
    5 KB (808 words) - 02:48, 9 March 2018
  • 1. All needed data for North Sami, Kurmanji, Breton, Kazakh and English was prepared: there ar ...Also the testpack for two language pairs was built: it contains all needed data for sme-nob and kmr-eng, the labeller and installation script.
    5 KB (764 words) - 01:40, 8 March 2018
  • #* If you can't understand the language the website is written in, ask for help in IRC or use a translator and look ...er when calling <code>Writer()</code>. For example if we want to write the data every 30 seconds call <code>Writer(30)</code>.</li>
    14 KB (2,389 words) - 05:20, 29 March 2019
  • ...family of some three dozen related languages descended from a Proto-Uralic language and spoken by more than 25 million people throughout Europe and Northern As ...ter plan involves generating independent finite-state transducers for each language, and then making individual dictionaries and transfer rules for every pair.
    22 KB (2,520 words) - 23:09, 22 December 2014
  • ...e Summer of Code 2018. It also includes information on the upgrade of four language pairs which was carried out during the same period. For a more detailed wor ...tem and develop it to bring it to release quality. In addition, four other language pairs have been upgraded to the monolingual package system to ease future d
    7 KB (1,071 words) - 10:48, 14 August 2018
  • ...l be available. For various reasons, the author has successfully developed language pairs using public repository versions of Apertium core. ...tes and Apertium tools. You also get, for optional install; release-level language pairs, service providers, constraint grammar code, and more. All under pack
    6 KB (1,006 words) - 18:26, 27 April 2021
  • ...m project develops a free/open-source platform for machine translation and language technology. We try to focus our efforts on lesser-resourced and marginalise ...ped around the world, largely in universities and companies (e.g. Prompsit Language Engineering), but also independent free-software developers play a huge rol
    13 KB (2,013 words) - 12:21, 20 June 2019
  • ...m project develops a free/open-source platform for machine translation and language technology. We try and focus our efforts on lesser-resourced and marginalis ...ped around the world, largely in universities and companies (e.g. Prompsit Language Engineering), but also independent free-software developers play a huge rol
    11 KB (1,802 words) - 19:51, 12 April 2021
  • ===Download and compile data=== ...</code> and <code>apertium-is-en</code>. You can find others at: [[list of language pairs]] and [[list of dictionaries]].
    4 KB (647 words) - 07:45, 8 October 2014
  • ...dictionary for the pair X→Y. Below is listed development progress for each language's transducers and dictionary pairs. !rowspan=2| Language
    18 KB (2,312 words) - 18:25, 18 September 2016
  • ...) constitute a group of related languages and a branch of the Afro-Asiatic language family. Spoken by more than 470 million people throughout North Africa and ...ter plan involves generating independent finite-state transducers for each language, and then making individual dictionaries and transfer rules for every pair.
    20 KB (2,336 words) - 18:10, 14 April 2015
  • == Improving language pairs by mining MediaWiki Content Translation postedits == ...and bidix entries to improve the performance of an Apertium language pair. Data is available from Wikimedia content translation through an [API https://www
    3 KB (383 words) - 19:56, 24 March 2020
  • ...language, as Apertium offers the only machine translation system for this language pair. The idea is to make Occitan output easier to postedit and French outp ...guage data], [https://github.com/apertium/apertium-fra the French language data], and [https://github.com/apertium/apertium-oci-fra the Apertium Occitan-F
    2 KB (213 words) - 19:48, 24 March 2020
  • === Altai Language Resources === Crúbadán language data for Southern Altai. Kevin Scannell. 2015. The Crúbadán Project. oai:cruba
    2 KB (217 words) - 06:57, 5 December 2017
  • ...in some cases data or tools from Freeling could be useful to apertium, and data from apertium could be useful to Freeling. Also, to install the data, I had to change the lines in freeling/data/Makefile.am that looked like
    5 KB (720 words) - 02:20, 10 March 2018
  • ...Everything in Apertium is free/open source: engine, data for more than 29 language pairs and tools to translate at a speed of more than 20,000 words per secon === Useful data ===
    1 KB (175 words) - 14:19, 25 July 2012
  • (in this example, I use eng as language resp. eng-deu as pair) the file ./eng-tagger-data/eng.dic for some reason is empty (has a file size of 0).
    1 KB (165 words) - 14:16, 28 August 2016
  • ...ter plan involves generating independent finite-state transducers for each language, and then making individual dictionaries and transfer rules for every pair. ...ictionary for the pair X→Y. Below is listed development progress for each language's transducers and dictionary pairs.
    35 KB (3,577 words) - 15:24, 1 October 2021
  • ...ter plan involves generating independent finite-state transducers for each language, and then making individual dictionaries and transfer rules for every pair. ...dictionary for the pair X→Y. Below is listed development progress for each language's transducers and dictionary pairs.
    22 KB (2,532 words) - 11:36, 30 July 2018
  • ...e>[http://www.ethnologue.com/subgroups/dravidian dra]</code>) constitute a language family of about 70 languages spoken primarily in South Asia. The four most ...ter plan involves generating independent finite-state transducers for each language, and then making individual dictionaries and transfer rules for every pair.
    19 KB (2,201 words) - 09:21, 9 December 2019
  • ...y aimed at related-language pairs but expanded to deal with more divergent language pairs (such as English-Catalan). The platform provides * a language-independent machine translation engine
    776 bytes (114 words) - 19:07, 12 September 2018
  • ...on-months (four people, 18 months) to develop (both engine, and linguistic data). It was widely used, with thousands of requests per day. ...sh State to rewrite the code as open-source, and to convert the linguistic data. After one person year, the first version of the Spanish--Catalan translato
    12 KB (1,679 words) - 12:00, 31 January 2012
  • ...m project develops a free/open-source platform for machine translation and language technology. We try to focus our efforts on lesser-resourced and marginalise ...ped around the world, largely in universities and companies (e.g. Prompsit Language Engineering), but independent free-software developers also play a huge rol
    11 KB (1,680 words) - 12:22, 20 June 2019
  • ...on-months (four people, 18 months) to develop (both engine, and linguistic data). It was widely used, with thousands of requests per day. ...sh State to rewrite the code as open-source, and to convert the linguistic data. After one person year, the first version of the Spanish--Catalan translato
    12 KB (1,683 words) - 08:42, 10 May 2013
  • ...on-months (four people, 18 months) to develop (both engine, and linguistic data). It was widely used, with thousands of requests per day. ...sh State to rewrite the code as open-source, and to convert the linguistic data. After one person year, the first version of the Spanish--Catalan translato
    12 KB (1,683 words) - 11:00, 30 October 2015
  • ...ll the unigram models from “A set of open-source tools for Turkish natural language processing.”<ref name="trmorph-tools">http://coltekin.net/cagri/papers/tr ...tuff.”<ref name="prerequisites">[[Installation#If you want to add language data / do more advanced stuff]]</ref>
    20 KB (3,229 words) - 20:06, 12 March 2018
  • ...ter plan involves generating independent finite-state transducers for each language, and then making individual dictionaries and transfer rules for every pair. ...dictionary for the pair X→Y. Below is listed development progress for each language's transducers and dictionary pairs.
    10 KB (1,263 words) - 06:04, 23 December 2014
  • '''Language pair packages''' are standalone JARs that can be run independently as well Since JAR files are nothing but renamed ZIP files, you can easily edit language pair packages to fit your needs. Note that the packages are ready to be use
    11 KB (1,497 words) - 08:23, 7 April 2020
  • ...ogue.com/subgroups/germanic gem]) constitute a branch of the Indo-European language family spoken primarily in Europe, Anglo-America and Australasia. The commo ...ter plan involves generating independent finite-state transducers for each language, and then making individual dictionaries and transfer rules for every pair.
    32 KB (3,684 words) - 06:16, 28 December 2018
  • ...s one of the official languages of India, and has around 33 million native language speakers globally. .../ktpress.org.in/pdf/evolution_of_oriya_language.pdf The Evolution of Oriya Language and Script], ''Utkal University, Cuttack,''
    13 KB (1,770 words) - 06:56, 3 December 2017
  • Make a program which tests Apertium data files for suspicious or unrecommended constructs (likely to be bugs). Some ...x]] (dix) dictionary data, perhaps also transfer rules. The [[Apertium New Language Pair HOWTO]] should introduce most of the terminology and background you ne
    5 KB (789 words) - 10:36, 31 May 2016
  • ...cant] (Alacant, Spain); the other one is [http://www.prompsit.com Prompsit Language Engineering]. These two organizations are currently responsible for most of ...systems to translate less-closely related languages. We have 10 published language pairs, and three more currently in development.
    8 KB (1,255 words) - 19:50, 12 April 2021
  • ...the mnemonic (starting on the first column) must be kept unchanged from one language to another, while the string farther to the right is translated. By default, as for lttoolbox, apertium, and the language pairs, the installation is done in <code>/usr/local/bin</code> and <code>/u
    5 KB (789 words) - 12:16, 15 June 2018
  • ...r/>words !! data-sort-type="number"|WER !! data-sort-type="number"|PWER !! data-sort-type="number"|BLEU !! Reference / Notes ...forms that get some analysis, may give an indication of the maturity of a language pair.
    9 KB (1,233 words) - 09:10, 21 November 2021
  • ...Javanese language]]) is an [[Wikipedia:Austronesian languages|Austronesian language]] from Indonesia, spoken by the Javanese people from the central and easter Its language code is '''jv''' and '''jav'''.
    7 KB (881 words) - 13:11, 12 December 2018
  • ...e language pairs (which haven't been started or currently have very little data in Apertium) and write a usable version which provides intelligible output * If there is some data for the language pair in the Apertium Github server, check it out and install it.
    2 KB (238 words) - 13:45, 24 February 2023
  • ...guage pairs <code>aa-bb</code> and <code>bb-cc</code> it will create a new language pair for <code>aa-cc</code>. * '''sl-tl''': source language (sl) and target language (tl).
    5 KB (633 words) - 13:29, 6 October 2017
  • ...eof, and following that the development of a prototype pair for a minority language of Russia. Russia has a long history of work in machine translation, but ve ...h oil, as Tatarstan and Sakha) students with good knowledge of a minorised language seldom have a computer and/or access to the internet. That is the case at l
    18 KB (2,991 words) - 22:24, 3 August 2013
  • * Individual repos for each pair, language module, and tool (preserving all commit history). ...ch|talk]]) 13:04, 7 February 2018 (CET) To install apertium and one or two language pairs, you (just) have to follow few wiki pages and then, you get the only
    22 KB (3,325 words) - 14:06, 12 March 2018
  • ...D0%BE%D1%81%D1%81%D0%B8%D0%B8 Šupaškar Apertium Workshop]. Russian part of language pair was created using [[lttoolbox]], and all files, needed for Russian, we === Some data ===
    3 KB (299 words) - 06:39, 30 January 2012
  • ...tps://apertium.github.io/apertium-on-github/source-browser.html. It houses language pairs which haven't completely matured and are under work. ==Specific resources per language==
    10 KB (1,336 words) - 20:40, 11 December 2019
  • {{see-also|Incubator|Specific resources per language}} ...Pair HOWTO|making a language pair]], feel free to make a new page for the language in question and paste it there. Stuff like basic dictionaries, paradigms, r
    1 KB (164 words) - 05:20, 4 December 2019
  • for every sentence s in the source language corpus: for every sentence in the source language corpus:
    6 KB (838 words) - 17:47, 25 July 2012
  • Apertium has some naming conventions for the various files used in language data: Files compiled when you do "make" in a language pair:
    890 bytes (126 words) - 10:10, 14 March 2017
  • ;Get some data! Now try it on your own data.
    5 KB (822 words) - 19:43, 9 March 2020
  • == Data sources == * Often a word can be disambiguated using its translation in another language, for example the triple (estació, gare, station) defines a building meanin
    5 KB (949 words) - 15:27, 15 June 2020
  • ...t plan on working on the core C++ packages (but only want to work on / use language pairs), you can install all prerequisites with yum/zypper, using [[User:Tin For a list of available language pairs and other packages, see https://build.opensuse.org/project/show/home:
    1 KB (231 words) - 10:03, 12 January 2022
