Search results
- ...t that they could increase the coverage significantly, because the testing corpora are either news or WP). (8 KB, 1,205 words; 21:50, 19 July 2012)
- ...rds is a semi-standard convention (it's occurring at least some in all the corpora). We should figure out where this is happening and see if it's something w... (28 KB, 769 words; 11:34, 13 April 2013)
- {{see-also|Corpora}} (11 KB, 1,750 words; 13:24, 10 December 2010)
- .... (2008) "Automatic induction of bilingual resources from aligned parallel corpora: application to shallow-transfer machine translation". ''Machine Translation'' (8 KB, 1,301 words; 09:43, 6 October 2014)
- WORDLIST=/home/spectre/corpora/afrikaans-meester-utf8.txt (11 KB, 1,852 words; 07:04, 8 October 2014)
- ...ns: apertium-kaz-tat has at least 15000 top stems, 95% coverage on all the corpora we have, and no more than 15% Word-Error-Rate on any randomly selected text... (4 KB, 603 words; 21:20, 31 August 2015)
- * Optimised for small corpora (under 100k parallel sentences) (869 bytes, 111 words; 15:06, 29 June 2020)
- ...li08j.pdf Automatic induction of bilingual resources from aligned parallel corpora: application to shallow-transfer machine translation]". ''Machine Translation'' (8 KB, 1,273 words; 09:32, 3 May 2024)
- The corpora used for this task can be found here: http://www.statmt.org/europarl/v7/sl-... (6 KB, 625 words; 16:54, 1 July 2013)
- Before you start, you first need a [[Corpora|corpus]]. Look in apertium-eo-en/corpa/enwiki.crp.txt.bz2 (run bunzip2 -c e... (6 KB, 966 words; 20:16, 23 July 2021)
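The snippet above is cut off mid-command, but it refers to unpacking a bzip2-compressed corpus. As a minimal sketch (the file names below are illustrative stand-ins, not the actual enwiki corpus file), `bunzip2 -c` streams the decompressed text to stdout while leaving the `.bz2` archive in place:

```shell
# Illustrative round-trip; sample.crp.txt stands in for a real corpus file.
printf 'a tiny sample corpus line\n' > sample.crp.txt
bzip2 -k sample.crp.txt                  # -k keeps the original, writes sample.crp.txt.bz2
bunzip2 -c sample.crp.txt.bz2 > restored.crp.txt   # -c decompresses to stdout
cmp sample.crp.txt restored.crp.txt && echo "round-trip OK"
```

Redirecting stdout (rather than decompressing in place) is useful when the `.bz2` corpus should stay untouched in the repository.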
- ...learning to construct such n-level transducers, working with some learning corpora, and mostly using the OSTIA state-merging algorithm. (6 KB, 842 words; 06:41, 20 October 2014)
- -x, --xml Output corpora in XML format (9 KB, 1,003 words; 11:02, 30 August 2011)
- Before you begin, you first need a [[Corpora|corpus]]. Look in apertium-eo-en/corpa/enwiki.crp.txt.bz2? Run... (7 KB, 1,057 words; 11:52, 7 October 2014)
- * Efficiency: Make it scale up to corpora of millions of words. This might involve doing (a) pre-analysis of the corp... (3 KB, 549 words; 02:11, 10 March 2018)
- ...to make a translation guesser using the existing bidix and two monolingual corpora in a similar way. (4 KB, 558 words; 13:07, 26 June 2020)
- ...ingual dictionaries: At the beginning we started using Chinese and Spanish corpora in order to obtain lots of Chinese-Spanish word pairs. Using the Stanford S... (7 KB, 830 words; 21:33, 30 September 2013)
- ...story] (or [https://github.com/taruen/apertiumpp/tree/master/data4apertium/corpora/jam from here]) as possible; minimum one sentence. ...stvoc]] clean, and has a coverage of around 80% or more on a range of free corpora. (6 KB, 1,024 words; 15:22, 20 April 2021)
- * [[Corpora]] (1 KB, 164 words; 05:20, 4 December 2019)
- ...l trained on prepared datasets made from parsed, syntax-labelled corpora (mostly UD treebanks). The classifier analyzes the given sequence of morpho... (5 KB, 764 words; 01:40, 8 March 2018)
- collecting Tatar and Bashkir corpora, scraping a parallel corpus, making a frequency dictionary (2 KB, 228 words; 10:55, 9 May 2018)