Frequently Asked Questions
From Apertium
Why do you use XML and not a database?
Isn't XML a really inefficient format for storing dictionaries? With all that whitespace and all those tags, the files are complicated to read. Wouldn't it be better to keep all the information in a database such as Postgres or MySQL, or even in flat text files?
- Answer
- Each data item is explicitly labelled with a descriptive, named tag with a clear meaning attached
- The structure of documents may easily be validated against DTDs or schemas
- Many technologies exist for XML (converting from and to XML, interoperability).
- XML is quite easy to process with text-processing tools like sed, cut and awk.
For a practical and theoretical overview of our format for storing dictionaries, see Morphological dictionaries.
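The points above about labelled tags and available tooling can be illustrated with a minimal sketch. The dictionary fragment below is a simplified, hypothetical example (the entries and paradigm names are assumptions made for illustration, not taken from a real language pair), and the helper `list_entries` is likewise invented here. It only uses Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified dictionary fragment. Real dictionaries contain
# more sections and attributes than shown here.
DIX = """\
<dictionary>
  <section id="main" type="standard">
    <e lm="house"><i>house</i><par n="house__n"/></e>
    <e lm="announce"><i>announc</i><par n="liv/e__vblex"/></e>
  </section>
</dictionary>
"""

def list_entries(xml_text):
    """Return (lemma, paradigm name) pairs for every <e> entry."""
    root = ET.fromstring(xml_text)
    pairs = []
    for entry in root.iter("e"):
        par = entry.find("par")
        # An entry without a <par> child yields None as its paradigm.
        pairs.append((entry.get("lm"), par.get("n") if par is not None else None))
    return pairs

entries = list_entries(DIX)
for lemma, paradigm in entries:
    print(lemma, paradigm)
```

Because every item is labelled, the script does not need to know anything about column positions or separators, only the tag and attribute names.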
Does Apertium support separable verbs?
Many languages, for example Hungarian and most Germanic languages (with the exception of English), have a phenomenon called "separable verbs", also referred to as "attached prepositions" among other names. In these verbs, part of the infinitive detaches and moves elsewhere in the clause when the verb is conjugated. For example, in Afrikaans the verb for "to announce" is "aankondig". The "aan" part separates when the verb is conjugated, so for example:
- Astronomers announce [the discovery].
- Sterrekundiges kondig [die ontdekking] aan.
However, the past tense would be:
- Astronomers have announced [the discovery].
- Sterrekundiges het [die ontdekking] aangekondig.
On its own, "kondig" does not mean anything.
- Answer
Essentially no: at the moment we do not support separable verbs. The problem for Apertium arises when the unseparated part does mean something on its own. It is currently impossible to analyse a word in two parts when they are separated by something as nebulous as a noun phrase (NP). There are a number of hacks that can be tried to work around this deficiency, but none of them works properly. If you would like more information on this, or have ideas about how it might be fixed or dealt with, please see our page on Separable verbs.
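To make the difficulty concrete, here is a toy sketch of word-by-word analysis applied to the Afrikaans examples above. It is not Apertium's implementation: the lexicon, the tag names, and the `analyse` function are all assumptions invented for this illustration. Unknown words are marked with a leading asterisk.

```python
# Toy lexicon mapping surface forms to analyses. All entries and tag
# names are hypothetical and exist only for this illustration.
LEXICON = {
    "sterrekundiges": "sterrekundige<n><pl>",
    "het": "het<vaux>",
    "die": "die<det>",
    "ontdekking": "ontdekking<n><sg>",
    "aan": "aan<pr>",
    "aankondig": "aankondig<vblex>",
    "aangekondig": "aankondig<vblex><pp>",
}

def analyse(sentence):
    """Analyse each token independently; unknown tokens get a '*' prefix."""
    tokens = sentence.rstrip(".").lower().split()
    return [LEXICON.get(token, "*" + token) for token in tokens]

# Separated form: "kondig" and "aan" are looked up as unrelated tokens.
separated = analyse("Sterrekundiges kondig die ontdekking aan.")
# Unseparated past form: the verb is a single token and is recognised.
joined = analyse("Sterrekundiges het die ontdekking aangekondig.")

print(separated)
print(joined)
```

In this sketch the separated sentence leaves "kondig" unknown and "aan" analysed as an ordinary preposition, because nothing links the two tokens across the intervening noun phrase. The unseparated past form is found directly.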
How can I contribute to this project?
See Contributing.

