Difference between revisions of "User:Kiara"

From Apertium

Revision as of 12:38, 10 July 2016

Kiara's page


Suggestion task:

Notes

1. How to work with APY from the command line: http://wiki.apertium.org/wiki/APY#Usage

2. How to launch Suggestions from the command line: https://github.com/goavki/apertium-html-tools/pull/35


Suggestion docs:

This is for the APY page:


Use ./servlet.py /usr/local/share/apertium/ --wiki-username=WikiUsername --wiki-password=WikiPassword -rs=YourRecaptchaSecret to run APY in Google reCAPTCHA mode.

  • -b --bypass-token: generates a testing token that bypasses reCAPTCHA

URL: /suggest
Function: Generates a suggestion on the target wiki page using a testing token.
Parameters:
  • context: the sentence containing the word
  • word: the word the suggestion is for
  • newWord: the suggested replacement
  • langpair: language pair to use for translation
  • g-recaptcha-response: testing token generated when starting APY (note that only the testing token can be used with curl)
Output: Returns the status. If it is "Success", the suggestion has been posted on the target wiki page.

Note that the correct wiki page URL is required (wiki_util.py).


Production use of Google reCAPTCHA requires registration (https://developers.google.com/recaptcha/).

Note that the correct keys are required both when starting APY and in the html-tools config file.

curl --data 'context=otro+mundo&word=*mundo&newWord=MUNDO&langpair=esp|eng&g-recaptcha-response=testingToken' http://localhost:2737/suggest
{"responseStatus": 200, "responseData": {"status": "Success"}, "responseDetails": null}
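
The same request can be built from Python. This is a minimal sketch using only the standard library. The helper names build_suggest_request and parse_suggest_status are hypothetical, and the base URL assumes APY is listening on localhost:2737 as in the curl example.

```python
import json
import urllib.parse
import urllib.request


def build_suggest_request(context, word, new_word, langpair, token,
                          base_url="http://localhost:2737"):
    """Build a POST request for /suggest (hypothetical helper, not part of APY)."""
    fields = {
        "context": context,
        "word": word,
        "newWord": new_word,
        "langpair": langpair,
        "g-recaptcha-response": token,
    }
    # urlencode percent-encodes characters such as '*' and '|'.
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(base_url + "/suggest", data=body)


def parse_suggest_status(response_text):
    """Return the "status" field of a /suggest JSON response, or None."""
    payload = json.loads(response_text)
    data = payload.get("responseData") or {}
    return data.get("status")


if __name__ == "__main__":
    request = build_suggest_request(
        "otro mundo", "*mundo", "MUNDO", "esp|eng", "testingToken")
    print(request.full_url)
    print(request.data.decode("utf-8"))
    # To send it, APY must be running with the bypass token enabled:
    #   with urllib.request.urlopen(request) as response:
    #       print(parse_suggest_status(response.read().decode("utf-8")))
```

Building the request separately from sending it makes it possible to inspect the encoded body without a running server.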

This is for the html-tools page:

  • ENABLED: turns suggestion mode on or off (True/False)
  • RECAPTCHA_SITE_KEY: reCAPTCHA site key, which can be obtained by registering at https://developers.google.com/recaptcha/
  • CONTEXT_WRAP: the number of context words taken from the left
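
To illustrate what CONTEXT_WRAP might control, here is a small sketch. It assumes the option is simply a count of words taken from the left of the word being corrected. The function name left_context is hypothetical and this is not the html-tools implementation.

```python
def left_context(words, index, context_wrap):
    """Return up to `context_wrap` words that precede words[index]."""
    # Clamp at 0 so a word near the start of the sentence gets fewer words.
    start = max(0, index - context_wrap)
    return words[start:index]


if __name__ == "__main__":
    sentence = ["un", "otro", "mundo"]
    print(left_context(sentence, 2, 1))
```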


Speller backlog:

1. Localize 'Any ideas?' (fixed and question)

2. Punctuation (fixed)

3. Documentation

4. Button glitch (fixed)

5. Hovering over a misspelled word highlights it in black, with a second underline (fixed)

6. After a word has been updated, it stays red, even though the underline disappears (fixed)

7. An error message for missed -translate mode (fixed and question)


Language detection

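| Apertium code | langdetect code | Language |
| af | af | Afrikaans |
| ara | ar | Arabic |
| an | N/A + | Aragonese |
| ast | N/A | Asturian |
| bg | bg | Bulgarian |
| | bn | Bengali |
| br | N/A | Breton |
| ca | ca | Catalan |
| | cs | Czech |
| cy | cy | Welsh |
| dan | da | Danish |
| | de | German |
| | el | Greek |
| en | en | English |
| eo | N/A | Esperanto |
| es | es | Spanish |
| | et | Estonian |
| eu | N/A | Basque |
| | fa | Persian |
| | fi | Finnish |
| fra | fr | French |
| gl | N/A | Galician |
| | gu | Gujarati |
| | he | Hebrew |
| hin | hi | Hindi |
| | hr | Croatian |
| | hu | Hungarian |
| id | id | Indonesian |
| is | N/A | Icelandic |
| it | it | Italian |
| | ja | Japanese |
| kaz | N/A (kk) | Kazakh |
| | kn | Kannada |
| | ko | Korean |
| | lt | Lithuanian |
| | lv | Latvian |
| mk | mk | Macedonian |
| | ml | Malayalam |
| | mr | Marathi (Marāṭhī) |
| ms | N/A | Malaysian |
| mt | N/A | Maltese |
| nob | N/A (nb) | Bokmål |
| | ne | Nepali |
| nl | nl | Dutch |
| nno | N/A (nn) | Norwegian Nynorsk |
| nor | no | Norwegian |
| oc | N/A + | Occitan |
| | pa | Panjabi |
| | pl | Polish |
| pt | pt | Portuguese |
| ro | ro | Romanian |
| | ru | Russian |
| hbs | N/A (sh) | Serbo-Croatian |
| sme | N/A (se) + | Northern Sami |
| | sk | Slovak |
| slv | sl | Slovenian |
| | so | Somali |
| | sq | Albanian |
| swe (sv) | sv | Swedish |
| | sw | Swahili |
| | ta | Tamil |
| | te | Telugu |
| | th | Thai |
| | tl | Tagalog |
| | tr | Turkish |
| tat | N/A (tt) | Tatar |
| | uk | Ukrainian |
| urd | ur | Urdu |
| | vi | Vietnamese |
| N/A | zh-cn | Chinese (Simplified and using Mainland Chinese terms) |
| N/A | zh-tw | Chinese (Traditional and using Taiwanese terms) |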
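
For the rows above where both codes are listed, the correspondence can be expressed as a lookup table. This is only a sketch: the names LANGDETECT_TO_APERTIUM and to_apertium_code are hypothetical, and rows with an empty or N/A cell are deliberately left out rather than guessed.

```python
# Only rows of the table that list both an Apertium and a langdetect code.
LANGDETECT_TO_APERTIUM = {
    "af": "af", "ar": "ara", "bg": "bg", "ca": "ca", "cy": "cy",
    "da": "dan", "en": "en", "es": "es", "fr": "fra", "hi": "hin",
    "id": "id", "it": "it", "mk": "mk", "nl": "nl", "no": "nor",
    "pt": "pt", "ro": "ro", "sl": "slv", "sv": "swe", "ur": "urd",
}


def to_apertium_code(langdetect_code):
    """Return the Apertium code for a langdetect code, or None if unmapped."""
    return LANGDETECT_TO_APERTIUM.get(langdetect_code)


if __name__ == "__main__":
    for code in ("fr", "sv", "zh-cn"):
        print(code, to_apertium_code(code))
```

Returning None for unmapped codes lets the caller decide how to handle languages that have no Apertium code in the table.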

How to train a new language model:

1. Install the langdetect library and follow the instructions at https://github.com/Mimino666/langdetect#how-to-add-new-language

2. Locate the folder where langdetect is installed

3. Copy the new language model to the profiles folder

4. Initialize the new model:

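from langdetect import detector_factory

# Load all profiles, including the newly copied one
detector_factory.init_factory()

# The new language code should appear in this list
print(detector_factory._factory.langlist)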
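
Steps 2 and 3 can also be scripted. The sketch below uses only the standard library. The helper names package_dir and copy_profile are hypothetical, and it assumes the profiles live in a "profiles" subfolder of the installed langdetect package.

```python
import importlib.util
import os
import shutil


def package_dir(package_name):
    """Return the directory of an installed package, or None if not found."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(spec.origin)


def copy_profile(profile_path, profiles_dir):
    """Copy a trained profile file into profiles_dir and return the new path."""
    if not os.path.isdir(profiles_dir):
        raise FileNotFoundError("profiles folder not found: " + profiles_dir)
    return shutil.copy(profile_path, profiles_dir)


if __name__ == "__main__":
    base = package_dir("langdetect")
    if base is None:
        print("langdetect is not installed")
    else:
        # Assumed location of the profiles folder inside the package.
        print(os.path.join(base, "profiles"))
```

Once the profile has been copied, running the initialization snippet above should list the new language code.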