Difference between revisions of "Dialectal or standard variation"


Revision as of 10:16, 7 October 2007

Some language varieties differ in lexis and grammar, yet it is still desirable to treat them as one side of a language pair, either because they share a largely similar orthography and lexis or for historical reasons.

For example:

  • Portuguese, Brazilian Portuguese
  • Occitan, Aranese
  • Serbo-Croatian (Bosnia, Croatia, Serbia)

These varieties are so similar that duplicating the work across many separate systems would be wasteful. Two approaches have been taken so far. Both rely on an intermediate dictionary and transfer format, which an XSL stylesheet then converts into the "real" .dix files.

The first is filter.xsl, which is used for the apertium-es-pt pair. The second is aversion.xsl, which is used with the apertium-oc-ca pair. Neither is really suited to marking variants, though, so we could do with something more sophisticated.
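
The general idea of deriving a per-variant dictionary from one intermediate file can be sketched as follows. This is only an illustration in Python, not a description of how filter.xsl or aversion.xsl actually work: the "variant" attribute, the sample entries, and the function names are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical intermediate dictionary. The "variant" attribute is an
# assumption for illustration; it is not taken from filter.xsl or aversion.xsl.
INTERMEDIATE = """\
<dictionary>
  <section id="main" type="standard">
    <e><p><l>casa</l><r>casa</r></p></e>
    <e variant="pt"><p><l>tren</l><r>comboio</r></p></e>
    <e variant="br"><p><l>tren</l><r>trem</r></p></e>
  </section>
</dictionary>
"""


def select_variant(xml_text, variant):
    """Keep unmarked entries and entries marked for `variant`; drop the rest."""
    root = ET.fromstring(xml_text)
    for section in root.iter("section"):
        # Copy the list first, because entries are removed while looping.
        for entry in list(section.findall("e")):
            tag = entry.get("variant")
            if tag is None:
                continue  # shared entry, valid for every variant
            if tag != variant:
                section.remove(entry)  # belongs to another variant
            else:
                del entry.attrib["variant"]  # strip the helper attribute
    return root


def right_sides(root):
    """Return the text of every <r> element, in document order."""
    return [r.text for r in root.iter("r")]


if __name__ == "__main__":
    brazilian = select_variant(INTERMEDIATE, "br")
    print(ET.tostring(brazilian, encoding="unicode"))
```

Calling select_variant once per variant yields one ordinary dictionary per variant from the same source, which is the role the stylesheets play in the two pairs mentioned above.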