Search results

  • ...t that they could increase the coverage significantly, because the testing corpora are either news or WP).
    8 KB (1,205 words) - 21:50, 19 July 2012
  • ...rds is a semi-standard convention (it occurs at least occasionally in all the corpora). We should figure out where this is happening and see if it's something w
    28 KB (769 words) - 11:34, 13 April 2013
  • {{see-also|Corpora}}
    11 KB (1,750 words) - 13:24, 10 December 2010
  • .... (2008) "Automatic induction of bilingual resources from aligned parallel corpora: application to shallow-transfer machine translation". ''Machine Translatio
    8 KB (1,301 words) - 09:43, 6 October 2014
  • WORDLIST=/home/spectre/corpora/afrikaans-meester-utf8.txt
    11 KB (1,852 words) - 07:04, 8 October 2014
  • ...ns: apertium-kaz-tat has at least 15000 top stems, 95% coverage on all the corpora we have, and no more than 15% Word-Error-Rate on any randomly selected text
    4 KB (603 words) - 21:20, 31 August 2015
  • * Optimised for small corpora (under 100k parallel sentences)
    869 bytes (111 words) - 15:06, 29 June 2020
  • ...li08j.pdf Automatic induction of bilingual resources from aligned parallel corpora: application to shallow-transfer machine translation]". ''Machine Translati
    8 KB (1,273 words) - 09:32, 3 May 2024
  • The corpora used for this task can be found here: http://www.statmt.org/europarl/v7/sl-
    6 KB (625 words) - 16:54, 1 July 2013
  • Before you start you first need a [[Corpora|corpus]]. Look in apertium-eo-en/corpa/enwiki.crp.txt.bz2 (run bunzip2 -c e
    6 KB (966 words) - 20:16, 23 July 2021
  • ...learning to construct such n-level transducers, working with some learning corpora, and mostly using the OSTIA state-merging algorithm.
    6 KB (842 words) - 06:41, 20 October 2014
  • -x, --xml Output corpora in XML format
    9 KB (1,003 words) - 11:02, 30 August 2011
  • Before you start, you first need a [[Corpora|corpus]]. Look in apertium-eo-en/corpa/enwiki.crp.txt.bz2 ? Run
    7 KB (1,057 words) - 11:52, 7 October 2014
  • * Efficiency: Make it scale up to corpora of millions of words. This might involve doing (a) pre-analysis of the corp
    3 KB (549 words) - 02:11, 10 March 2018
  • ...to make a translation guesser using the existing bidix and two monolingual corpora in a similar way.
    4 KB (558 words) - 13:07, 26 June 2020
  • ...ingual dictionaries: At the beginning we started using Chinese and Spanish corpora in order to obtain lots of Chinese-Spanish word pairs. Using the Stanford S
    7 KB (830 words) - 21:33, 30 September 2013
  • ...story] (or [https://github.com/taruen/apertiumpp/tree/master/data4apertium/corpora/jam from here] ) as possible — Minimum one sentence. ...stvoc]] clean, and has a coverage of around 80% or more on a range of free corpora.
    6 KB (1,024 words) - 15:22, 20 April 2021
  • * [[Corpora]]
    1 KB (164 words) - 05:20, 4 December 2019
  • ...l trained on prepared datasets which were made from parsed syntax-labelled corpora (mostly UD-treebanks). The classifier analyzes the given sequence of morpho
    5 KB (764 words) - 01:40, 8 March 2018
  • |0|| || collecting Tatar and Bashkir corpora, scraping a parallel corpus, making a frequency dictionary
    2 KB (228 words) - 10:55, 9 May 2018
