Dialectal or standard variation
Some languages have varieties that differ in lexis and grammar, yet it is still desirable to treat them as one side of a language pair, either because their orthography and lexis are largely the same or for historical reasons. Examples:
- Portuguese, Brazilian Portuguese
- Occitan, Aranese
- Serbo-Croatian (Bosnia, Croatia, Serbia)
The varieties are so similar that duplicating the work in separate systems is wasteful. Two approaches have been taken so far, both relying on an intermediate dictionary and transfer format which an XSL stylesheet then converts into the "real" .dix files.
The first is filter.xsl, which is used for the apertium-es-pt pair. The second is aversion.xsl, which is used with the apertium-oc-ca pair. Neither of these is really appropriate for marking variants, though, so we could do with something more sophisticated.
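To make the intermediate-format idea concrete, here is a minimal sketch of the filtering step. It is written in Python with the standard library rather than XSL, and it is not the logic of filter.xsl or aversion.xsl. The `variant` attribute, the `filter_variant` function and the sample entries are all assumptions made for illustration; only the `<e><p><l>…</l><r>…</r></p></e>` entry layout follows the usual .dix structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical intermediate dictionary: entries without a "variant"
# attribute are shared, the others belong to one variety only.
# The attribute name "variant" is an assumption, not part of any real format.
SOURCE = """<dictionary>
  <section id="main" type="standard">
    <e><p><l>casa</l><r>casa</r></p></e>
    <e variant="pt"><p><l>autocarro</l><r>autobús</r></p></e>
    <e variant="br"><p><l>ônibus</l><r>autobús</r></p></e>
  </section>
</dictionary>"""


def filter_variant(xml_text, variant, attr="variant"):
    """Return a tree keeping shared entries and those marked for `variant`."""
    root = ET.fromstring(xml_text)
    # Snapshot the element list so removals do not disturb the traversal.
    for parent in list(root.iter()):
        for entry in list(parent):
            if entry.tag != "e":
                continue
            marked = entry.get(attr)
            if marked is None:
                continue  # shared entry, keep as-is
            if marked != variant:
                parent.remove(entry)  # belongs to another variety
            else:
                del entry.attrib[attr]  # strip the marker from the output
    return root


if __name__ == "__main__":
    tree = filter_variant(SOURCE, "br")
    print(ET.tostring(tree, encoding="unicode"))
```

Running the function once per variety would give one "real" dictionary each from a single source file, which is the saving that both stylesheets aim for.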
Unification of metadix and parametrized dictionaries on variants in monodix and transfer rules