Search results

  • Number of unknown words (marked with a star) in test: 284 Percentage of unknown words: 91.91 %
    5 KB (515 words) - 14:34, 1 September 2019
  • ...d into misunderstanding the content, instead of observing that there is an unknown word. ...rmous increase in transfer complexity – all tags have to be presumed to be unknown, and developer time is wasted on bug-hunting and workarounds instead of imp
    4 KB (679 words) - 16:06, 3 May 2020
  • Number of unknown words (marked with a star) in test: 653 Percentage of unknown words: 17.48 %
    23 KB (3,704 words) - 11:56, 16 December 2020
  • ==Matching unknown words in Apertium== lttoolbox prepends a star to unknown words, so you can match unknown words using a simple regexp matching that star:
    7 KB (1,116 words) - 20:57, 2 April 2021
  • ...the OR operation, the rules would try to match precisely a sequence of one unknown word followed by one known one. ====Matching an unknown word====
    19 KB (2,820 words) - 15:26, 11 April 2023
  • Number of unknown words (marked with a star) in test: 203 Percentage of unknown words: 8,24 %
    81 KB (13,134 words) - 16:48, 30 September 2011
  • | <code>GD</code> || Gender to be determined || || <!-- unknown --> ...ity to be determined || if the sub-category is (currently) unknown || <!-- unknown -->
    38 KB (4,492 words) - 15:36, 9 May 2024
  • ...we had approx 13,000 entries. Approx half of the training sentences had an unknown word. With this we got very poor tagger performance. Then we added 7,000 pr ...My dix is not big enough, and approx half of the training sentences has an unknown word. Can't I just grep these sentences away, and then train on the rest?
    7 KB (1,177 words) - 08:34, 8 October 2014
  • and the unknown analysis string <code>a&lt;c&gt;</code> a score of ...er scores to unknown analysis strings with frequent <math>a</math> than to unknown analysis strings with infrequent <math>a</math> .
    20 KB (3,229 words) - 20:06, 12 March 2018
  • ...words as opposed to "blank" chars. Its main effect is on tokenisation of ''unknown'' words, since non-alphabet characters may still be part of a ''known'' wor ...a word not in the dictionary, but composed of alphabetic chars, we get an unknown-word analysis:
    2 KB (400 words) - 08:52, 28 April 2014
  • Number of tokenised words unknown to analyser: 63730 — 43.1 % of tokens had * unknown to bidix: 112 — 0.1 % of tokens had @
    4 KB (496 words) - 18:27, 19 June 2017
  • ...%BBB% and run it through Apertium's %AAA%-%BBB% translator to identify 50 unknown forms. Add the stems of these forms to the analyser in an appropriate way
    32 KB (4,862 words) - 06:23, 5 December 2019
  • echo "TOP UNKNOWN WORDS:" UNKNOWN=`cat /tmp/$LG.parade.txt | grep '\*' | wc -l`
    864 bytes (139 words) - 02:14, 6 September 2019
  • ...etc. By the time you finish you should have a reasonable model of missing unknown words. <match case="Aa" unknown="true"><add-reading tags="np.ant"/></match>
    4 KB (558 words) - 13:07, 26 June 2020
  • ==Adding/fixing unknown words== If you have some words that are unknown in a certain language pair, you can help out by simply writing a list of wo
    3 KB (549 words) - 09:17, 26 May 2021
  • Number of unknown words (marked with a star) in test: 117 Percentage of unknown words: 3,87 %
    6 KB (845 words) - 20:08, 3 October 2011
  • Both [[lttoolbox]] and [[HFST]] have methods for dynamically analysing unknown compound words into their constituent parts. See below for how it's done in ..., and only do compounding if the other methods would give an unknown word. Unknown words are made up of strings of characters from &lt;alphabet&gt;, separated
    16 KB (2,689 words) - 09:07, 6 April 2021
  • Number of unknown words (marked with a star) in test: 117 Percentage of unknown words: 3,87 %
    98 KB (16,331 words) - 20:28, 30 September 2011
  • Note: Reference translation MUST have no unknown-word marks, even if systems that do not mark unknown words with a star.
    6 KB (981 words) - 09:13, 21 November 2021
  • hsb.dix:25: element s: validity error : IDREF attribute n references an unknown ID "nom" hsb.dix:33: element s: validity error : IDREF attribute n references an unknown ID "nom"
    19 KB (3,440 words) - 12:10, 26 September 2016
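
Several of the results above describe unknown words as being marked with a star and report a count and a percentage of such words. A minimal sketch of that kind of count, assuming whitespace-separated translator output in which an unknown token begins with `*`; the function name `unknown_stats` and the sample sentence are hypothetical and not taken from any of the listed pages:

```python
import re

# Assumption: an unknown word is a whitespace-delimited token that starts
# with '*' and has at least one more character, e.g. "*gato".
UNKNOWN_TOKEN = re.compile(r"(?<!\S)\*\S+")


def unknown_stats(text):
    """Return (unknown_count, total_tokens, percentage_unknown) for text."""
    total = len(text.split())
    unknown = len(UNKNOWN_TOKEN.findall(text))
    # Avoid division by zero on empty input.
    percentage = 100.0 * unknown / total if total else 0.0
    return unknown, total, percentage


if __name__ == "__main__":
    count, total, pct = unknown_stats("the *gato sat on the *mat")
    print(f"Number of unknown words (marked with a star) in test: {count}")
    print(f"Percentage of unknown words: {pct:.2f} %")
```

The lookbehind `(?<!\S)` restricts matches to stars at the start of a token, so a star inside a word is not counted; real pipelines may tokenise differently, so the totals here are only comparable under the stated assumption.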
