Search results

  • Once a transducer has ~80% coverage on a range of medium-large corpora we can say it is "working". Over 90% and it can be considered to be "produc
    12 KB (1,308 words) - 19:27, 27 August 2017
  • ====Parallel corpora==== ...rules of all kinds I’m going to use pol-rus corpora. There are a number of corpora available: a pol-rus section on [http://www.ruscorpora.ru/search-para-pl.htm
    6 KB (969 words) - 01:16, 27 March 2016
  • $ cat ~/corpora/languages/belarusian/wikipedia/bel.crp.txt | apertium -d . bel-rus-morph |
    19 KB (1,389 words) - 08:52, 21 August 2017
  • ...tionary of contemporary Polish" as a source of Polish hand-tagged training corpora (around 500 000 words). It needs though to be converted to the format used * get monolingual and multilingual aligned corpora for further analysis (possibly from JRC Acquis)
    11 KB (1,672 words) - 20:56, 9 April 2010
  • ...of this script it will be easier to automatically generate good rules from corpora and add them. * Complete the following steps in the script (Parallel corpora):
    8 KB (1,205 words) - 10:37, 3 April 2017
  • Once a transducer has ~80% coverage on a range of medium-large corpora we can say it is "working". Over 90% and it can be considered to be "produc
    13 KB (1,494 words) - 07:17, 14 December 2014
  • * Collect parallel kaz-eng corpora! By new plan, we focused on adding vocabulary from 4 corpora.
    20 KB (2,856 words) - 06:26, 27 May 2021
  • * [http://corpus.leeds.ac.uk/query-zh.html A Collection of Chinese Corpora and Frequency Lists.] ===Corpora===
    16 KB (2,148 words) - 03:28, 16 December 2015
  • === Learning transfer rules from small corpora === === Learn morphology from small corpora ===
    6 KB (972 words) - 18:06, 23 December 2022
  • ==Corpora== :I think there are corpora tagged for German, so it shouldn't be hard to train any tagger. I'll have t
    85 KB (13,901 words) - 20:42, 19 June 2009
  • ....za/Faculties/ART/Xhosa/Pages/Research-.aspx "Cross linguistics upon Xhosa Corpora Research"] == Monolingual/Parallel Corpora ==
    4 KB (566 words) - 05:57, 18 April 2020
  • * Collected Kazakh-Uzbek dictionary and parallel corpora; * Prepared 200 sentences of parallel corpora.
    10 KB (1,179 words) - 11:51, 31 August 2021
  • ...ed to make this work. The apertium tagger set needs to be trained for Hindi corpora. The transfer rules also need to be improved a lot and support for multi w * get monolingual and multilingual aligned corpora for further analysis
    12 KB (1,877 words) - 06:42, 30 April 2013
  • To identify other problems, first I'll translate several corpora using the translator. Then I'll post-edit these, and identify which rules I * Prepare corpora which will be used in the coding period
    10 KB (1,432 words) - 10:24, 15 May 2014
  • {{deprecated2|Learning rules from parallel and non-parallel corpora}} * a parallel corpus (see [[Corpora]])
    15 KB (2,206 words) - 13:58, 7 October 2014
  • ===Advantages of using parallel corpora in dictionary creation=== *High-quality dictionaries are based on corpora. This linguistic data decreases the role of human intuition during lexicogr
    7 KB (1,010 words) - 17:50, 3 May 2013
  • ...ce that they will be tackled with the new neural approach, as the parallel corpora available are too small. Nice examples are Occitan, Sardinian or Breton. We ...available for a language are neural or statistical black boxes trained on corpora which are not publicly available, then their language communities are disem
    15 KB (2,462 words) - 16:57, 31 January 2019
  • ==Getting corpora== WORDLIST=/home/spectre/corpora/afrikaans-meester-utf8.txt
    16 KB (2,566 words) - 21:36, 15 March 2020
  • ...he sake of variety of topic and possible dialect influence obtaining other corpora will be helpful. *Parallel corpora: Direct translations from Kurmanji to English are sparse enough that it may
    5 KB (809 words) - 06:57, 3 May 2016
  • {{#var | corpora | @= | {{{corpora}}} }} {{#explode | "," | {{#var | corpora}} | {{#var | corporaArr}} }}
    5 KB (583 words) - 03:20, 9 January 2013
