Difference between revisions of "Catalan"

Revision as of 14:14, 5 October 2019

Catalan (Wikipedia:Catalan language) is a Romance language. It is available in Apertium as a standalone analyser/generator (apertium-cat) and as a component of several pairs which translate to/from Catalan.

Language pairs

See also: List of language pairs

In trunk:

Pair name           Languages                  Last update
apertium-arg-cat    Aragonese <-> Catalan      17 Aug 2016
apertium-ca-it      Catalan <-> Italian        12 Oct 2014
apertium-en-ca      English <-> Catalan        28 Mar 2016
apertium-eo-ca      Esperanto <-- Catalan      13 Dec 2015
apertium-fra-cat    French <-> Catalan         18 Apr 2017
apertium-oc-ca      Occitan <-> Catalan        13 Dec 2015
apertium-pt-ca      Portuguese <-> Catalan     13 Dec 2015
apertium-spa-cat    Spanish <-> Catalan        01 Apr 2017

In staging:

Pair name           Languages                  Last update
apertium-cat-glg    Catalan <-> Galician       18 Nov 2016
apertium-cat-srd    Catalan <-> Sardinian      03 Apr 2017

In nursery:

Pair name           Languages                  Last update
apertium-ca-ro      Catalan <-> Romanian       30 Sep 2015

In incubator:

Pair name           Languages                  Last update
apertium-cat-cos    Catalan <-> Corsican       05 Nov 2013
apertium-cat-ina    Catalan <-> Interlingua    07 Jan 2016
apertium-eng-cat    English <-> Catalan        24 Jan 2016
apertium-por-cat    Portuguese <-> Catalan     22 Jan 2016

Apertium-cat

Current status

Last update: 28 Aug 2017

Dix entries: 56,588

Dix paradigms: 607

Coverage: 94.04% (Wikipedia)

Dictionary guidelines

The current Catalan dictionary is large (more than 55,000 entries), so keeping it tidy is essential for future development:

  • Keep entries sorted alphabetically.
  • Keep entries grouped by type and tags (do not mix different types of proper nouns together).
  • Check the file with apertium-dixtools (to update the number of entries and remove duplicates).
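
As a rough illustration of the first and last guidelines, the sketch below scans a small dictionary fragment for out-of-order lemmas and duplicated entries. It is not a replacement for apertium-dixtools. It assumes that entries are `<e lm="...">` elements carrying a `<par n="..."/>` child, and the sample fragment, the function names and the accent-stripping sort key are all made up for this example; the real dictionary may follow a different collation.

```python
import unicodedata
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical fragment, deliberately unsorted and containing one duplicate.
SAMPLE_DIX = """<dictionary>
  <section id="main" type="standard">
    <e lm="Iran"><i>Iran</i><par n="Iran__np"/></e>
    <e lm="Àfrica"><i>Àfrica</i><par n="Àfrica__np"/></e>
    <e lm="Iran"><i>Iran</i><par n="Iran__np"/></e>
  </section>
</dictionary>"""


def sort_key(lemma):
    # Assumption: ignore diacritics and case when comparing lemmas.
    decomposed = unicodedata.normalize("NFD", lemma)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


def out_of_order(root):
    # Return (previous, current) lemma pairs that break alphabetical order.
    problems = []
    for section in root.iter("section"):
        lemmas = [e.get("lm", "") for e in section.findall("e")]
        for prev, cur in zip(lemmas, lemmas[1:]):
            if sort_key(prev) > sort_key(cur):
                problems.append((prev, cur))
    return problems


def find_duplicates(root):
    # An entry counts as a duplicate when lemma and paradigm name both repeat.
    def key(e):
        par = e.find("par")
        return (e.get("lm", ""), par.get("n", "") if par is not None else "")

    counts = Counter(key(e) for e in root.iter("e"))
    return sorted(k for k, n in counts.items() if n > 1)


root = ET.fromstring(SAMPLE_DIX)
print("Out of order:", out_of_order(root))
print("Duplicates:", find_duplicates(root))
```

On the sample fragment this reports the pair ("Iran", "Àfrica") as out of order and the entry ("Iran", "Iran__np") as duplicated.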

Proper nouns

Catalan proper nouns (names, toponyms, acronyms, etc.) should all have gender and number. These tags were removed at one point, but they should now be specified using the following paradigms:

  • Toponyms <np><top><m><sg>: Iran__np
  • Toponyms <np><top><f><sg>: Àfrica__np
  • Toponyms <np><top><m><pl>: Estats_Units__np
  • Toponyms <np><top><f><pl>: Balears__np
  • Anthroponyms <np><ant><m><sg>: Marc__np
  • Anthroponyms <np><ant><f><sg>: Maria__np
  • Family names <np><cog><mf><sp>: Saussure__np
  • Others <np><al><m><sg>: Linux__np
  • Others <np><al><f><sg>: Wikipedia__np
  • Others <np><al><m><pl>: Jocs_Olímpics__np
  • Others <np><al><f><pl>: Falles__np
  • Others <np><al><mf><sp>: Honda__np
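
The list above can be read as a lookup from a tag combination to a paradigm name. The sketch below encodes that lookup and builds a dictionary entry from it. The paradigm names and tags are taken from the list; the helper `make_entry`, the example lemmas and the exact `<e>`/`<i>`/`<par>` layout (with `<b/>` standing for a space) are assumptions based on the usual monolingual dictionary shape, so check them against the actual apertium-cat file before relying on them.

```python
from xml.sax.saxutils import escape, quoteattr

# (type, gender, number) -> paradigm name, copied from the list above.
PARADIGMS = {
    ("top", "m", "sg"): "Iran__np",
    ("top", "f", "sg"): "Àfrica__np",
    ("top", "m", "pl"): "Estats_Units__np",
    ("top", "f", "pl"): "Balears__np",
    ("ant", "m", "sg"): "Marc__np",
    ("ant", "f", "sg"): "Maria__np",
    ("cog", "mf", "sp"): "Saussure__np",
    ("al", "m", "sg"): "Linux__np",
    ("al", "f", "sg"): "Wikipedia__np",
    ("al", "m", "pl"): "Jocs_Olímpics__np",
    ("al", "f", "pl"): "Falles__np",
    ("al", "mf", "sp"): "Honda__np",
}


def make_entry(lemma, kind, gender, number):
    # Hypothetical helper: pick the paradigm for the tags and emit one <e> line.
    try:
        paradigm = PARADIGMS[(kind, gender, number)]
    except KeyError:
        raise ValueError(
            f"no paradigm listed for <np><{kind}><{gender}><{number}>"
        ) from None
    # Assumption: spaces inside a multiword lemma are written as <b/>.
    body = "<b/>".join(escape(word) for word in lemma.split(" "))
    return f"<e lm={quoteattr(lemma)}><i>{body}</i><par n={quoteattr(paradigm)}/></e>"


# Illustrative lemmas, not taken from the dictionary.
print(make_entry("Girona", "top", "f", "sg"))
print(make_entry("Illes Medes", "top", "f", "pl"))
```

A tag combination that is missing from the list, such as a plural anthroponym, raises a `ValueError` instead of silently producing an entry without gender and number.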

Future work

  • Add support for proper noun articles (en/na).
  • Restore gender and number to proper nouns that still do not have them.
  • Tweak entries related to proper nouns with translations (kings, queens, etc.).
    • This is already partially done in several language pairs, using lexical selection and a specific macro.


For further documentation about Catalan in Apertium, see Category:Catalan.