
Frequently Asked Questions

[[Questions fréquentes|En français]]
 
 
There are many ways to contribute to Apertium, from sending us lists of words or phrases that you find are incorrectly translated, to getting involved in creating a new language pair or programming tools and user interfaces. Here are some questions frequently asked by users.
 
 
==How do I start off?==
 
 
Regardless of the kind of contribution you want to make, there are two things to start with. First, subscribe to the mailing list [https://lists.sourceforge.net/lists/listinfo/apertium-stuff apertium-stuff], which is where most of the discussion goes on. Second, come and idle on the [[IRC|IRC channel]] <code>#apertium</code> on <code>irc.freenode.net</code>.
 
 
To decide what you want to contribute, take a look at ''[[Development]]'' and ''[[Projects]]'' for some ideas we've had around programming and extending the engine, and have a look at the ''[[Incubator]]'' if you're interested in linguistic issues. If you can't find anything that interests you, just send an email or ask someone on IRC and they'll be happy to help.
 
 
==How do I add or fix words?==
 
If you have some words that are unknown in a certain language pair, you can help out by simply writing a list of words and their translations, e.g.
 
<pre>
house; noun; casa; noun f
dog; noun; perro; noun m
</pre>
 
 
into a file, and sending that to the [[mailing list]]. Most likely you want the list called "apertium-stuff": [https://lists.sourceforge.net/lists/listinfo/apertium-stuff subscribe here], then attach the file and send it to apertium-stuff@lists.sourceforge.net.
 
 
You can also send a spreadsheet file if you prefer.
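
Before sending a word list, it can help to check that every line has the four semicolon-separated fields shown in the example above. The following is only a minimal sketch in Python: the names <code>WordPair</code> and <code>parse_word_list</code> are made up for illustration and are not part of any Apertium tool, and the four-field layout is assumed from the example rather than being a fixed specification.

```python
from typing import List, NamedTuple


class WordPair(NamedTuple):
    """One line of the word list (hypothetical structure for illustration)."""
    source: str
    source_tags: str
    target: str
    target_tags: str


def parse_word_list(text: str) -> List[WordPair]:
    """Parse lines of the form 'house; noun; casa; noun f'."""
    pairs = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are ignored
        fields = [field.strip() for field in line.split(";")]
        if len(fields) != 4 or not all(fields):
            raise ValueError(
                f"line {line_number}: expected 4 non-empty fields, got {line!r}"
            )
        pairs.append(WordPair(*fields))
    return pairs


sample = """
house; noun; casa; noun f
dog; noun; perro; noun m
"""
pairs = parse_word_list(sample)
for pair in pairs:
    print(pair.source, "->", pair.target)
```

A line with a missing field raises a <code>ValueError</code> that names the offending line, so mistakes can be fixed before the file is attached to an email.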
 
 
==How can I contribute my knowledge?==
 
The [[Indirect contribution guide]] has some tips on how to contribute your knowledge of a language to create resources that we use in Apertium, such as
 
* Writing contrastive analyses
 
* Cataloguing resources
 
* Hand-translating text
 
* Converting dictionaries
 
* Contributing to related projects
 
 
==How do I get more involved?==
 
The first thing you should do if you want to get more involved is to introduce yourself on the [[mailing list]] and hang out on our [[IRC]] channel. There is also a [[list of Apertium mentors]].
 
 
Next, you should [[Installation|install apertium, lttoolbox and some language pair]] to play around with.
 
 
If you want to create or contribute to a language pair, go through the [[New language pair HOWTO]]. This is required reading for anyone who wants to get involved with developing Apertium language pairs. Also, take a look at [[Contributing to an existing pair]], meant for those who want to contribute to existing language pairs. You can improve the quality of the translation for an existing pair by correcting errors in the dictionaries. You will find some hints on the page [[Finding_errors_in_dictionaries]].
 
 
Next up, the [https://www.abumatran.eu/wp-content/uploads/2014/12/abumatran-apertium-workshop-data-guide.pdf Apertium EU Workshop site] is a comprehensive guide to rule-based machine translation with Apertium (originally made for a four-day course on Apertium for people with little background in machine translation); print this out and read it on the bus, train or boat.
 
 
If you're a student, [[Google Summer of Code]] (or [https://codein.withgoogle.com/ Google Code-In] for high-school students) is a good way to get involved with Apertium, and the ideas page there has lots of project tips if you're more interested in programming than linguistics or language pairs. If you are working on the task of requesting a wiki account and adopting a page, contact a mentor to request an account so that you can edit the wiki.
 
 
==Why are you using XML and not a database?==
 
Isn't XML a really inefficient format for storing dictionaries? With all these spaces and tags, the files are complicated to read. Would it not be better to have all the information in a database, like Postgres or MySQL? Or even in ordinary text files?

We use XML for the following reasons:
 
 
* Each data item is explicitly marked with a descriptive tag whose name has a clear meaning.
 
* Document structure can be easily validated using DTDs or schemas.
 
* Several technologies exist for XML (conversion to and from XML, interoperability).
 
* XML is quite easy to process with text processing tools like sed, cut and awk.
 
* You can read more practical and theoretical details about the format we use to store the dictionaries here: ''[[Morphological dictionary]]''.
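
As an illustration of the interoperability point above, the sketch below reads a tiny dictionary fragment with Python's standard <code>xml.etree.ElementTree</code> module. The fragment is a simplified, hand-written example modelled on a bilingual dictionary and is an assumption made for illustration; real dictionaries contain more element types (see ''[[Morphological dictionary]]''). The helper names <code>side</code> and <code>read_entries</code> are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written fragment (an assumption for illustration only).
DIX_FRAGMENT = """
<dictionary>
  <section id="main" type="standard">
    <e><p><l>house<s n="n"/></l><r>casa<s n="n"/><s n="f"/></r></p></e>
    <e><p><l>dog<s n="n"/></l><r>perro<s n="n"/><s n="m"/></r></p></e>
  </section>
</dictionary>
"""


def side(element):
    """Return (lemma, [symbol names]) for an <l> or <r> element."""
    lemma = (element.text or "").strip()
    tags = [symbol.get("n") for symbol in element.findall("s")]
    return lemma, tags


def read_entries(xml_text):
    """Collect the left and right side of every <p> pair in the document."""
    root = ET.fromstring(xml_text.strip())
    entries = []
    for pair in root.iter("p"):
        entries.append((side(pair.find("l")), side(pair.find("r"))))
    return entries


entries = read_entries(DIX_FRAGMENT)
for (left, left_tags), (right, right_tags) in entries:
    print(left, left_tags, "->", right, right_tags)
```

Because every item is explicitly tagged, the script needs no custom parser; it only takes the text before the first symbol as the lemma, so entries with more complex content would need extra handling.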
 
 
==Does Apertium support separable verbs?==
 
Several languages, for example most of the Germanic languages (with the exception of English) and Hungarian, have a phenomenon called "separable verbs", also called "attached prepositions" or by other names. This is when the verb's infinitive has a part that is detached and displaced when the verb is conjugated. For example, in Afrikaans the verb "to announce" is "aankondig". The part "aan" is separated when the verb is conjugated, so for example:
 
 
* Sterrekundiges '''kondig''' [die ontdekking] '''aan'''.
 
* Astronomers '''announce''' [the discovery].
 
 
The stem "kondig" does not mean anything by itself, only in conjunction with the particle "aan"; however, this is not always the case. The past participle is formed by inserting "ge" between the particle and the stem, for example:
 
 
* Sterrekundiges '''het''' [die ontdekking] '''aangekondig'''.
 
* Astronomers '''have announced''' [the discovery].
 
 
Essentially, no: for the moment we do not support separable verbs. The problem for Apertium occurs when the stem that is left behind does not mean anything on its own. For the moment it is impossible to analyse a word in two parts when they are separated by something as open-ended as a noun phrase (NP). There are a number of hacks that can be tried to work around this deficiency, but none of them work properly. If you would like more information on this, or have ideas on how to deal with it, please see our [[Separable verbs]] page.
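
The difficulty can be illustrated with a toy sketch in Python. This is not how Apertium's analyser is implemented; the lexicon, the function name <code>analyse</code> and the use of <code>*</code> to flag unknown tokens are simplifications assumed for illustration. Each token is looked up on its own, so the two halves of the separated verb are never recognised as one word.

```python
# Toy lexicon (illustrative only): the full verb is listed, but the stem
# "kondig" is not, because it has no meaning on its own.
LEXICON = {
    "sterrekundiges": "astronomers",
    "aankondig": "announce",
    "die": "the",
    "ontdekking": "discovery",
}


def analyse(sentence):
    """Look up each token independently; unknown tokens are prefixed with '*'."""
    tokens = sentence.lower().rstrip(".").split()
    return [LEXICON.get(token, "*" + token) for token in tokens]


result = analyse("Sterrekundiges kondig die ontdekking aan.")
print(result)
```

In this sketch both "kondig" and "aan" come out as unknown, even though "aankondig" is in the lexicon, because the noun phrase between them prevents a word-by-word lookup from joining the two parts.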
 
 
[[Category:Documentation in English]]
 

Latest revision as of 09:15, 4 December 2019
