Dialectal or standard variation

From Apertium

Revision as of 18:26, 3 September 2011

Some language varieties differ in lexis and grammar, but it is still desirable to treat them as one side of a language pair, either because they share a largely similar orthography and lexis or for historical reasons.

For example:

  • Portuguese, Brazilian Portuguese
  • Occitan, Aranese
  • Serbo-Croatian (Bosnia, Croatia, Serbia)

These varieties are so similar that duplicating the work across many separate systems is wasteful. Two approaches have been taken so far. Both rely on an intermediate dictionary and transfer format, which an XSL stylesheet then converts into the "real" .dix files.

The first is filter.xsl, which is used for the apertium-es-pt pair. The second is aversion.xsl, which is used with the apertium-oc-ca pair. Neither is really suited to marking variants, though, so something more sophisticated is needed.
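
The general idea of generating a per-variant dictionary from a shared intermediate file can be illustrated with a minimal sketch. It is written in Python rather than XSL, and it does not reproduce what filter.xsl or aversion.xsl actually do: the "variant" attribute, the sample entries and the function name select_variant are all assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical intermediate dictionary. The "variant" attribute is an
# assumption made for this sketch; it is not taken from filter.xsl or
# aversion.xsl.
INTERMEDIATE = """\
<dictionary>
  <section id="main" type="standard">
    <e variant="pt_PT"><p><l>facto</l><r>facto</r></p></e>
    <e variant="pt_BR"><p><l>fato</l><r>fato</r></p></e>
    <e><p><l>casa</l><r>casa</r></p></e>
  </section>
</dictionary>
"""


def select_variant(xml_text, variant):
    """Keep unmarked entries plus those marked for `variant`.

    Entries marked for another variant are dropped, and the marker is
    stripped from the kept entries so the output looks like a plain
    dictionary file.
    """
    root = ET.fromstring(xml_text)
    for section in root.iter("section"):
        # Iterate over a copy, since entries are removed while looping.
        for entry in list(section.findall("e")):
            marked = entry.get("variant")
            if marked is None:
                continue  # shared entry: keep unchanged
            if marked != variant:
                section.remove(entry)  # belongs to another variant
            else:
                del entry.attrib["variant"]  # keep, but drop the marker
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(select_variant(INTERMEDIATE, "pt_BR"))
```

Running the same function once per variant would give one dictionary per variety from a single source file, which is the kind of duplication the intermediate format is meant to avoid.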

See also

Unification of metadix and parametrized dictionaries on variants in monodix and transfer rules