Search results

  • ...statistical parser, which in turn can serve different purposes of natural language processing. For creating a good treebank, manual annotation and/or disambig ...interface lets you work with the CoNLL-U and CG3 formats and convert data between them. It also lets you either upload or paste corpora in pl
    6 KB (930 words) - 15:59, 29 August 2017
  • ...d of existing trained models. Successful tries are saved into new training data.<ref>https://static.googleusercontent.com/media/research.google.com/en//pub ...butions can also be found [https://github.com/tesseract-ocr/tesseract/wiki/Data-Files-Contributions here].
    2 KB (305 words) - 14:36, 28 October 2018
  • ...rning engineer. My role was developing a sentiment analysis model for the Arabic language. ...urses, I had to use Python/R and Tableau to perform analysis on different datasets.
    8 KB (1,258 words) - 15:30, 27 April 2020
  • ..., transfer rules, scripting, corpora. The objective is to make an Apertium language pair state-of-the-art, or close to state-of-the-art in terms of translation ...ge pair of your choice in Apertium and install it. (see [[Install language data by compiling]])
    2 KB (383 words) - 19:46, 2 March 2023
  • | 64 || Apertium-tolk should give proper warning when no linguistic data is installed || 2008-03-31 || Wynand Winte ...rg/cgi-bin/bugzilla/index.cgi here]. Please feel free to report your bug in any language you are comfortable with.
    12 KB (1,254 words) - 22:08, 7 March 2018
  • | clip || - || N/A || part → value || Obtains the part in the only language there is (inter/post-chunk) and pushes the value onto the stack ...|| - || link-to || part, pos → value || Obtains the 'part' in source language in position 'pos' and pushes the 'value' onto the stack. An optional operan
    14 KB (2,020 words) - 13:58, 7 October 2014
  • ...ion is a very complex problem that depends on almost all fields of natural language processing. As such, it is a very "enabling" field, and can benefit from th ...ings of the 9th International Workshop on Finite State Methods and Natural Language Processing, pages 39--47.</ref>. However, the library currently used to par
    10 KB (1,561 words) - 15:22, 28 May 2013
  • While training can be done directly in the language directory, it is a better idea to train the tagger with copies of the files ...e the training directory (replace <code>lang</code> with the corresponding language code).
    4 KB (651 words) - 13:36, 23 August 2017
  • {{Language Kashmiri is an Indo-Aryan language spoken in the Kashmir Valley and regions around it that were historically a
    6 KB (811 words) - 10:42, 2 July 2018
  • == Proposal: Bringing 4 language pairs up to release quality == ...stvoc and lexical selection that will result in a valid text in the target language.
    4 KB (614 words) - 13:00, 7 April 2019
  • ** Select a language ** Use the Apertium morphological analyser to analyse the test data
    1 KB (213 words) - 21:13, 18 March 2019
  • ...s, data, and other system resources with applications, software tools, and data of the Unix-like environment. Therefore it is possible to launch Windows ap Now you're ready to download and build language pairs and use them under Cygwin's shell.
    12 KB (1,883 words) - 22:06, 7 March 2018
  • ...is it possible to achieve pretty good results with a very small amount of data (as in the case of Breton) ...ad of the original syntax module in kmr-eng pipeline. The testpack for two language pairs was built. All code was cleaned up, some docstrings were written. Als
    6 KB (833 words) - 12:56, 22 August 2017
  • * es-tagger-data directory: contains the data needed for the Spanish tagger (corpus, etc.) * ca-tagger-data directory: contains the data needed for the Catalan tagger (corpus, etc.)
    54 KB (8,480 words) - 18:55, 10 April 2017
  • If you want to work on Apertium language pairs or tools, some knowledge of the Unix shell / command-line scripting w ...hell/ shell scripting] and [https://hacker-tools.github.io/data-wrangling/ Data wrangling] are relevant and succinct
    746 bytes (101 words) - 09:20, 8 February 2019
  • ** We can haz. Data is now checked in on Victorio at /langtech/trunk/words/dicts/algu, with a r ...ns Finnish and Northern Sámi. Ryan can contact them if it seems like their data would be of use.
    16 KB (2,457 words) - 08:19, 12 April 2017
  • .../presentation/d/1LBcBs3KdzfS7vl6Sxe0UtOMLpWNMM6ciGS_YPCnxTr0 Reading-bound data as inline secondary tags]", Tino Didriksen *** "Reading-bound data is best transported as inline secondary tags, proven both by practical expe
    3 KB (509 words) - 15:49, 2 July 2020
  • ...our language data directory (replacing "apertium-foo" for your monolingual data dir):
    725 bytes (111 words) - 09:24, 2 March 2016
  • tsv-file: past-tense-tests.tsv # read the test data from a tab-separated list ...as a test that can pass or fail) or in interactive mode (which updates the data to reflect the state of the translator).
    9 KB (1,402 words) - 16:40, 2 March 2021
  • By default, as with lttoolbox, apertium, and the language pairs, the installation is done in <code>/usr/local/bin</code> and <code>/u ...ium</code> command, there is the '''<code>-f</code>''' option to translate data produced in this format without having to call "by hand" a deformatter and
    5 KB (780 words) - 11:48, 15 June 2018
