Search results

Apertium-nog/stats
== Corpora ==

1 KB (176 words) - 06:05, 16 December 2014
Apertium-ukr/stats
== Corpora ==

1 KB (158 words) - 05:29, 22 August 2017
Aromanian
===Corpora===

8 KB (1,048 words) - 05:32, 1 December 2017
Kashmiri
== Other Corpora ==

6 KB (811 words) - 10:42, 2 July 2018
Apertium-uzb/stats
== Corpora ==

3 KB (324 words) - 21:41, 15 December 2019
Apertium-sqi/stats
== Corpora ==

1 KB (154 words) - 06:16, 16 December 2014
Apertium-fin/stats
== Corpora ==

328 bytes (35 words) - 05:49, 16 December 2014
English and Catalan/Workplan
* Tagger training preparation (tagged corpora unification)

5 KB (506 words) - 14:56, 28 August 2017
Apertium-kum/stats
== Corpora ==

2 KB (235 words) - 05:58, 16 December 2014
Apertium Turkic
...Pirinen, Jonathan Washington (2015). "Finite-state morphologies and text corpora as resources for improving morphological descriptions". [https://sites.goog

13 KB (1,710 words) - 20:32, 30 August 2018
Unsupervised tagger training
# the best taggers use hand-tagged corpora to train with (we use untagged corpora -- for English)

7 KB (1,177 words) - 08:34, 8 October 2014
Traductions en français
* [[Corpora]]

13 KB (1,601 words) - 23:31, 23 July 2021
Lextor
to be related. Both corpora must be pre-processed before the training. This pre-processing, consisting in analysing the corpora and

11 KB (1,814 words) - 03:22, 9 March 2019
Helsinki Apertium Workshop/Session 8
The final coverage of the system was around 90%, e.g. over a set of corpora 10 unknown words out of 100 on average. The word-error rate was around 17%, ... When linguistic resources, for example corpora, dictionaries, grammars, morphological analysers, lists of lemmata etc. are ...

12 KB (1,683 words) - 08:42, 10 May 2013
Tartu Apertium Course/Session 8
The final coverage of the system was around 90%, e.g. over a set of corpora 10 unknown words out of 100 on average. The word-error rate was around 17%, ... When linguistic resources, for example corpora, dictionaries, grammars, morphological analysers, lists of lemmata etc. are ...

12 KB (1,683 words) - 11:00, 30 October 2015
Automatically generating compound bidix entries
...toscore.txt | ~/source/apertium/trunk/apertium-lex-learner/irstlm-ranker ~/corpora/català/en.blm > unk.trans.scored.txt

9 KB (1,470 words) - 11:28, 24 March 2012
Курсы машинного перевода для языков России/Session 8
The final coverage of the system was around 90%, e.g. over a set of corpora 10 unknown words out of 100 on average. The word-error rate was around 17%, ... When linguistic resources, for example corpora, dictionaries, grammars, morphological analysers, lists of lemmata etc. are ...

12 KB (1,679 words) - 12:00, 31 January 2012
Turkic-Turkic translator
* '''Coverage''' is the naïve coverage over one or more free corpora.

6 KB (591 words) - 22:50, 30 October 2017
Quick and dirty guide addendum: other important things
...steps to create each one. But you'll want to build it up, testing against corpora, etc. You want to be able to have as many correct analyses as possible, an ...if your pair will support translation in both directions). The corpus or corpora ideally should represent a range of content—i.e., it shouldn't be just sp

10 KB (1,615 words) - 07:43, 20 December 2015
Apertium cat-srd and ita-srd/GSoC 2017
...dinian to Italian. We started a manual morphological disambiguation of the corpora that will help the translator to recognize the correct morphology of each w We have treated two corpora: one journalistic and more dialectal, and other taken directly from literar

9 KB (1,306 words) - 15:56, 2 September 2017
