Search results

Manx
==Parallel corpora==

766 bytes (87 words) - 08:07, 20 January 2009
Crimean Tatar
== Electronic Corpora ==

335 bytes (29 words) - 16:39, 24 May 2017
Apertium-tat/stats
== Corpora ==

3 KB (414 words) - 21:40, 15 December 2019
Apertium-rus/stats
== Corpora ==

1 KB (158 words) - 05:28, 22 August 2017
Hausa
'''CORPORA''' ...sketchengine.eu/user-guide/user-manual/corpora/by-language/hausa-boko-text-corpora/

1 KB (179 words) - 18:40, 26 October 2018
Apertium-kjh/stats
== Corpora ==

511 bytes (54 words) - 20:57, 15 February 2015
Apertium-mlt/stats
== Corpora ==

1 KB (135 words) - 06:05, 16 December 2014
Apertium-dan/stats
== Corpora ==

1 KB (135 words) - 06:59, 28 June 2016
Apertium-uig/stats
== Corpora ==

2 KB (213 words) - 06:51, 6 July 2018
Apertium-yid/stats
==Corpora==

2 KB (212 words) - 04:26, 2 January 2019
Unsupervised tagger training
# the best taggers use hand-tagged corpora to train with (we use untagged corpora -- for English)

7 KB (1,177 words) - 08:34, 8 October 2014
Traductions en français
* [[Corpora]]

13 KB (1,601 words) - 23:31, 23 July 2021
Automatically generating compound bidix entries
...toscore.txt | ~/source/apertium/trunk/apertium-lex-learner/irstlm-ranker ~/corpora/català/en.blm > unk.trans.scored.txt

9 KB (1,470 words) - 11:28, 24 March 2012
Tartu Apertium Course/Session 8
The final coverage of the system was around 90%, e.g. over a set of corpora 10 unknown words out of 100 on average. The word-error rate was around 17%, ... When linguistic resources, for example corpora, dictionaries, grammars, morphological analysers, lists of lemmata etc. are

12 KB (1,683 words) - 11:00, 30 October 2015
Lextor
to be related. Both corpora must be pre-processed before the training. This pre-processing, consisting in analysing the corpora and

11 KB (1,814 words) - 03:22, 9 March 2019
Helsinki Apertium Workshop/Session 8
The final coverage of the system was around 90%, e.g. over a set of corpora 10 unknown words out of 100 on average. The word-error rate was around 17%, ... When linguistic resources, for example corpora, dictionaries, grammars, morphological analysers, lists of lemmata etc. are

12 KB (1,683 words) - 08:42, 10 May 2013
Курсы машинного перевода для языков России/Session 8
The final coverage of the system was around 90%, e.g. over a set of corpora 10 unknown words out of 100 on average. The word-error rate was around 17%, ... When linguistic resources, for example corpora, dictionaries, grammars, morphological analysers, lists of lemmata etc. are

12 KB (1,679 words) - 12:00, 31 January 2012
Turkic-Turkic translator
* '''Coverage''' is the naïve coverage over one or more free corpora.

6 KB (591 words) - 22:50, 30 October 2017
Quick and dirty guide addendum: other important things
...steps to create each one. But you'll want to build it up, testing against corpora, etc. You want to be able to have as many correct analyses as possible, an ...if your pair will support translation in both directions). The corpus or corpora ideally should represent a range of content—i.e., it shouldn't be just sp

10 KB (1,615 words) - 07:43, 20 December 2015
Apertium cat-srd and ita-srd/GSoC 2017
...dinian to Italian. We started a manual morphological disambiguation of the corpora that will help the translator to recognize the correct morphology of each w ... We have treated two corpora: one journalistic and more dialectal, and other taken directly from literar

9 KB (1,306 words) - 15:56, 2 September 2017
