Search results
- ...statistical parser, which in turn can serve different purposes of natural language processing. For creating a good treebank, manual annotation and/or disambig ...interface allows to work with CoNLL-U and CG3 formats, and to convert the data between the formats. It also allows to either upload or paste corpora in pl
  6 KB (930 words) - 15:59, 29 August 2017
- ...d of existing trained models. Successful tries are saved into new training data.<ref>https://static.googleusercontent.com/media/research.google.com/en//pub ...butions can also be found [https://github.com/tesseract-ocr/tesseract/wiki/Data-Files-Contributions here].
  2 KB (305 words) - 14:36, 28 October 2018
- ...rning engineer. My role was developing sentiment analysis model for Arabic language. ...urses, I had to use python/ R and Tableau to perform analysis on different data-sets.
  8 KB (1,258 words) - 15:30, 27 April 2020
- ..., transfer rules, scripting, corpora. The objective is to make an Apertium language pair state-of-the-art, or close to state-of-the-art in terms of translation ...ge pair of your choice in Apertium and install it. (see [[Install language data by compiling]])
  2 KB (383 words) - 19:46, 2 March 2023
- | 64 || Apertium-tolk should give proper warning when no linguistic data is installed || 2008-03-31 || Wynand Winte ...rg/cgi-bin/bugzilla/index.cgi here]. Please feel free to report your bug in any language you are comfortable with.
  12 KB (1,254 words) - 22:08, 7 March 2018
- | clip || - || N/A || part → value || Obtains the part in the only language there is (inter/post-chunk) and pushes the value onto the stack ...|| - || link-to || part, pos → value || Obtains the 'part' in source language in position 'pos' and pushes the 'value' onto the stack. An optional operan
  14 KB (2,020 words) - 13:58, 7 October 2014
- ...ion is a very complex problem that depends on almost all fields of natural language processing. As such, it is a very "enabling" field, and can benefit from th ...ings of the 9th International Workshop on Finite State Methods and Natural Language Processing, pages 39--47.</ref>. However, the library currently used to par
  10 KB (1,561 words) - 15:22, 28 May 2013
- While training can be done directly in the language directory, it is a better idea to train the tagger with copies of the files ...e the training directory (replace <code>lang</code> with the corresponding language code).
  4 KB (651 words) - 13:36, 23 August 2017
- {{Language Kashmiri is an Indo-Aryan language spoken in the Kashmir Valley and regions around it that were historically a
  6 KB (811 words) - 10:42, 2 July 2018
- == Proposal: Bringing 4 language pairs up to release quality == ...stvoc and lexical selection that will result in a valid text in the target language.
  4 KB (614 words) - 13:00, 7 April 2019
- ** Select a language ** Use the Apertium morphological analyser to analyse the test data
  1 KB (213 words) - 21:13, 18 March 2019
- ...s, data, and other system resources with applications, software tools, and data of the Unix-like environment. Therefore it is possible to launch Windows ap Now you're ready to download and build language pairs and use them under Cygwin's shell.
  12 KB (1,883 words) - 22:06, 7 March 2018
- ...is it possible to achieve pretty good results having very small amount of data (like in case of Breton) ...ad of the original syntax module in kmr-eng pipeline. The testpack for two language pairs was built. All code was cleaned up, some docstrings were written. Als
  6 KB (833 words) - 12:56, 22 August 2017
- * es-tagger-data directory: contains the data needed for the Spanish tagger (corpus, etc.) * ca-tagger-data directory: contains the data needed for the Catalan tagger (corpus, etc.)
  54 KB (8,480 words) - 18:55, 10 April 2017
- If you want to work on Apertium language pairs or tools, some knowledge of the Unix shell / command-line scripting w ...hell/ shell scripting] and [https://hacker-tools.github.io/data-wrangling/ Data wrangling] are relevant and succinct
  746 bytes (101 words) - 09:20, 8 February 2019
- ** We can haz. Data is now checked in on Victorio at /langtech/trunk/words/dicts/algu, with a r ...ns Finnish and Northern Sámi. Ryan can contact them if it seems like their data would be of use.
  16 KB (2,457 words) - 08:19, 12 April 2017
- .../presentation/d/1LBcBs3KdzfS7vl6Sxe0UtOMLpWNMM6ciGS_YPCnxTr0 Reading-bound data as inline secondary tags]", Tino Didriksen *** "Reading-bound data is best transported as inline secondary tags, proven both by practical expe
  3 KB (509 words) - 15:49, 2 July 2020
- ...our language data directory (replacing "apertium-foo" for your monolingual data dir):
  725 bytes (111 words) - 09:24, 2 March 2016
- tsv-file: past-tense-tests.tsv # read the test data from a tab-separated list ...as a test that can pass or fail) or in interactive mode (which updates the data to reflect the state of the translator).
  9 KB (1,402 words) - 16:40, 2 March 2021
- By default, as for lttoolbox, apertium, and the language pairs, the installation is done in <code>/usr/local/bin</code> and <code>/u ...ium</code> command, there is the '''<code>-f</code>''' option to translate data produced in this format without having to call "by hand" a deformatter and
  5 KB (780 words) - 11:48, 15 June 2018