Dialectal or standard variation

Some languages differ in lexis and grammar, yet it is still desirable to treat them as one side of a language pair, either because they share a largely similar orthography and lexis or for historical reasons.

For example:

  • Portuguese, Brazilian Portuguese
  • Occitan, Aranese
  • Serbo-Croatian (Bosnia, Croatia, Serbia)

The languages are so similar that duplicating the work in many separate systems is wasteful. Two approaches have been taken so far; both rely on an intermediate dictionary and transfer format, which an XSL stylesheet then converts into the "real" .dix files.

The first is filter.xsl, which is used for the apertium-es-pt pair. The second is aversion.xsl, which is used with the apertium-oc-ca pair. Neither of these is really appropriate for marking variants, though, so something more sophisticated is needed.
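
To make the intermediate-dictionary idea concrete, here is a minimal sketch in Python rather than XSL. It is not the logic of filter.xsl or aversion.xsl: the `variant` attribute, the sample entries, and the function names are all hypothetical, chosen only to illustrate how one shared source can be filtered into a per-variant dictionary.

```python
import xml.etree.ElementTree as ET

# Hypothetical intermediate dictionary: entries without a "variant"
# attribute are shared; tagged entries belong to one variant only.
# The attribute name "variant" is an assumption made for this sketch.
INTERMEDIATE = """<dictionary>
  <section id="main" type="standard">
    <e><p><l>casa</l><r>casa</r></p></e>
    <e variant="pt"><p><l>autocarro</l><r>autobús</r></p></e>
    <e variant="br"><p><l>ônibus</l><r>autobús</r></p></e>
  </section>
</dictionary>"""


def filter_variant(xml_text, variant):
    """Return a dictionary tree containing only the shared entries
    and the entries tagged for the requested variant."""
    root = ET.fromstring(xml_text)
    for section in root.iter("section"):
        # Copy the list so entries can be removed while looping.
        for entry in list(section.findall("e")):
            tag = entry.get("variant")
            if tag is None:
                continue  # shared entry: keep unchanged
            if tag != variant:
                section.remove(entry)  # belongs to another variant
            else:
                del entry.attrib["variant"]  # strip the helper attribute
    return root


def left_sides(root):
    """Collect the text of every <l> element, in document order."""
    return [l.text for l in root.iter("l")]


if __name__ == "__main__":
    br = filter_variant(INTERMEDIATE, "br")
    print(ET.tostring(br, encoding="unicode"))
```

Calling `filter_variant(INTERMEDIATE, "pt")` instead would keep the `autocarro` entry and drop `ônibus`. The helper attribute is stripped from the output so that the result looks like an ordinary dictionary file.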

See also

Unification of metadix and parametrized dictionaries on variants in monodix and transfer rules