Search results

Unigram tagger
and the unknown analysis string <code>a<c></code> a score of ...er scores to unknown analysis strings with frequent <math>a</math> than to unknown analysis strings with infrequent <math>a</math> .

20 KB (3,229 words) - 20:06, 12 March 2018
Crimean Tatar and Turkish/Work plan
Number of tokenised words unknown to analyser: 63730 — 43.1 % of tokens had * unknown to bidix: 112 — 0.1 % of tokens had @

4 KB (496 words) - 18:27, 19 June 2017
Task ideas for Google Code-in
...%BBB% and run it through Apertium's %AAA%-%BBB% translator to identify 50 unknown forms. Add the stems of these forms to the analyser in an appropriate way

32 KB (4,862 words) - 06:23, 5 December 2019
User:Amanmehta/Application
***About 20% of unknown words are intransitive verbs ***About 15% of unknown words are pronouns and their lexicals

11 KB (1,617 words) - 11:06, 29 April 2017
Measuring coverage of HFST transducer
echo "TOP UNKNOWN WORDS:" UNKNOWN=`cat /tmp/$LG.parade.txt | grep '\*' | wc -l`

864 bytes (139 words) - 02:14, 6 September 2019
Incorporating guessing into Apertium
...etc. By the time you finish you should have a reasonable model of missing unknown words. <match case="Aa" unknown="true"><add-reading tags="np.ant"/></match>

4 KB (558 words) - 13:07, 26 June 2020
Contributing
==Adding/fixing unknown words== If you have some words that are unknown in a certain language pair, you can help out by simply writing a list of wo

3 KB (549 words) - 09:17, 26 May 2021
User:Jimregan/English chunk rules
*RULE: NUM-NOM1-OLD NOM3 unknown -> NOM3 de NUM NOM1(pl) unknown (9-year-old xxxxxx - de 9 anys xxxxxx) - when it comes before an unknown word, not *RULE: DET NUM-NOM1-OLD NOM3 unknown -> DET NOM3 de NUM NOM1(pl) unknown (the 9-year-old xxxxxx - de 9 anys xxxxx)

13 KB (2,422 words) - 14:03, 17 August 2009
User:Aida/Application
# adding unknown words {{tag|postadv}} {{tag|ij}} {{tag|adv}} # adding unknown words {{tag|num}} {{tag|post}} {{tag|prn}} {{tag|det}}

7 KB (1,078 words) - 05:24, 13 May 2014
User:Raveesh/Application
...ve the translation. Another aim would be to remove all kinds of unanalysed/unknown symbols (@,#,*) from the output. I've already got quite familiar with **Adding unknown tokens from the text to the bilingual dictionary.

10 KB (1,482 words) - 22:05, 21 May 2014
Alphabet
...words as opposed to "blank" chars. Its main effect is on tokenisation of ''unknown'' words, since non-alphabet characters may still be part of a ''known'' wor ...a word not in the dictionary, but composed of alphabetic chars, we get an unknown-word analysis:

2 KB (400 words) - 08:52, 28 April 2014
Как использовать lttoolbox, чтобы разработать новый морфологический анализатор
hsb.dix:25: element s: validity error : IDREF attribute n references an unknown ID "nom" hsb.dix:33: element s: validity error : IDREF attribute n references an unknown ID "nom"

25 KB (2,260 words) - 18:36, 12 January 2012
Compounds
Both [[lttoolbox]] and [[HFST]] have methods for dynamically analysing unknown compound words into their constituent parts. See below for how it's done in ..., and only do compounding if the other methods would give an unknown word. Unknown words are made up of strings of characters from <alphabet>, separated

16 KB (2,689 words) - 09:07, 6 April 2021
Morphology of Turkmen
=== Unknown ===

4 KB (682 words) - 11:14, 16 April 2012
User:Eden/GSoC2019Report
** Number of tokenised words unknown to analyser: 147,573 — '''25.0%''' of tokens had * ** Tokenised words unknown to bidix:0 — '''0.0%''' of tokens had @

5 KB (718 words) - 15:50, 26 August 2019
Xml grep
==I get "Unknown option --xpath"==

5 KB (863 words) - 09:04, 10 October 2017
Apertium-apy
*'''-f --missing-freqs:''' path to sqlite3 database of words that were unknown (requires <code>sudo apt-get install sqlite3</code>) *'''markUnknown=no''' (optional): include this to remove "*" in front of unknown words

38 KB (5,246 words) - 19:54, 1 August 2024
User:Nikant/GsocApplication
...was around 57%. I also analyzed the corpus to get a list of high-frequency unknown Hindi words as part of the coding challenge. These are the results I obtained for the top 20 high-frequency unknown words in my corpus:

12 KB (1,877 words) - 06:42, 30 April 2013
Evaluation
Note: The reference translation MUST have no unknown-word marks, even for systems that do not mark unknown words with a star.

6 KB (982 words) - 10:23, 3 September 2024
User:David Nemeskey/GSOC progress 2013
...tions in the FSA, and only one (failed) look-up in the sigma trie for each unknown tag, as opposed to one for each of its letters. ...Write a <code>common_detmin_fsa()</code> that, if it sees a symbol unknown to the FSA in state, it treats it as an <code>IDENTITY_SYMBOL</code> if it

34 KB (5,431 words) - 16:27, 29 October 2013
