Romance languages

Revision as of 01:42, 24 April 2014 by Sushain

The Romance languages (see Wikipedia: Romance languages) include Catalan (ca), Occitan (oc), Asturian (ast), Spanish (es), French (fr), Galician (gl), Portuguese (pt), Romanian (ro) and Italian (it). They are related, with varying degrees of mutual intelligibility. Many of them are already included in Apertium.

Romance languages that are not yet covered in Apertium include Aromanian, Arpitan, Corsican, Friulan, Ladino, Leonese, Lombard, Mirandese, Neapolitan, Piedmontese, Romansh, Sicilian, Venetian and Walloon.


The ultimate goal is to have multi-purpose transducers for a variety of Romance languages. These can then be paired for X→Y translation by adding a constraint grammar (CG) for language X and transfer rules and a bilingual dictionary for the pair X→Y. Development progress for each language's transducers and dictionary pairs is listed below.
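The pairing idea above can be sketched as a small checklist: given the monolingual resources for X and Y and the resources specific to the direction X→Y, list what is still missing. This is only an illustrative model. The class names, field names and the placeholder codes `xx` and `yy` are assumptions made for this sketch and do not correspond to any Apertium tool or to the real status of any language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Language:
    """Monolingual resources for one language (hypothetical model)."""
    code: str
    has_transducer: bool = False  # reusable morphological transducer
    has_cg: bool = False          # constraint grammar, needed on the source side


@dataclass(frozen=True)
class PairResources:
    """Resources specific to one translation direction X -> Y."""
    has_bidix: bool = False
    has_transfer_rules: bool = False


def missing_for_direction(src: Language, tgt: Language, pair: PairResources) -> list[str]:
    """Return the components still needed before src -> tgt can be assembled."""
    missing = []
    if not src.has_transducer:
        missing.append(f"transducer for {src.code}")
    if not tgt.has_transducer:
        missing.append(f"transducer for {tgt.code}")
    if not src.has_cg:
        missing.append(f"CG for {src.code}")
    if not pair.has_bidix:
        missing.append(f"bilingual dictionary {src.code}-{tgt.code}")
    if not pair.has_transfer_rules:
        missing.append(f"transfer rules {src.code}->{tgt.code}")
    return missing


# Made-up status values, for illustration only.
xx = Language("xx", has_transducer=True, has_cg=False)
yy = Language("yy", has_transducer=True)
print(missing_for_direction(xx, yy, PairResources(has_bidix=True)))
```

Because the transducers are modelled per language rather than per pair, the same `Language` record can be reused for every pair that language takes part in, which is the point of making them multi-purpose.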

Table of existing pairs

Text in italics denotes a language pair in the incubator. Regular text denotes a pair under development in nursery. Text in bold denotes a stable, well-working pair in trunk, and text in bold italics denotes a pair in staging. Bidix stems, as counted with dixcounter, are displayed below.

arg ast cat cos spa fra glg ita oci por ron rup srd
arg - 'es-an
ast - 'es-ast
cat - cat-cos
cos cat-cos
spa 'es-an
- 'fr-es
fra 'fr-ca
- fr-it
glg 'es-gl
- 'pt-gl
ita 'ca-it
- it-pt
oci 'oc-ca
por 'pt-ca
- sc-pt
ron ca-ro
- ron-rup
rup ron-rup
srd cat-srd
bre br-es
ces es-cs
cym cy-es
deu es-de
eng 'en-ca
epo 'eo-ca
eus 'eu-es
guc guc-spa
ina es-ia
lat la-es
mlt mlt-spa
nld fr-nl
quz quz-spa
qve spa-qve
slv slv-spa
sme sme-spa
ssp es-ssp
tet tet-por
zho zho-spa

Many of these are documented in Publications.
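As a rough illustration of what a bidix stem count measures, the sketch below counts the `<e>` entries inside the `<section>` elements of a bilingual dictionary in Apertium's XML dictionary format. It is not the dixcounter tool and does not claim to reproduce its exact counting rules. The function name and the two sample entries are made up for this example.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up bilingual dictionary in the .dix XML layout.
SAMPLE_BIDIX = """\
<dictionary>
  <alphabet/>
  <sdefs>
    <sdef n="n"/>
  </sdefs>
  <section id="main" type="standard">
    <e><p><l>casa<s n="n"/></l><r>casa<s n="n"/></r></p></e>
    <e><p><l>gos<s n="n"/></l><r>perro<s n="n"/></r></p></e>
  </section>
</dictionary>
"""


def count_section_entries(dix_xml: str) -> int:
    """Count <e> elements that are direct children of any <section>."""
    root = ET.fromstring(dix_xml)
    return sum(len(section.findall("e")) for section in root.iter("section"))


print(count_section_entries(SAMPLE_BIDIX))
```

For a real dictionary file, the same function could be applied to the file's contents after reading it from disk. Entries defined inside paradigm definitions are deliberately not counted here, which is a simplifying assumption.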


Article 1 of the Universal Declaration of Human Rights:

All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.

Language Text
Italian Tutti gli esseri umani nascono liberi ed eguali in dignità e diritti. Essi sono dotati di ragione e di coscienza e devono agire gli uni verso gli altri in spirito di fratellanza.
Venetian Tuti i essari Umani nasse liberi e uguaƚi in teƚa dignità e diriti. I xe dotai de raxón e de cosiensa e i gà da agire cò spirito de fraternità lun l’altro.
French Tous les êtres humains naissent libres et égaux en dignité et en droits. Ils sont doués de raison et de conscience et doivent agir les uns envers les autres dans un esprit de fraternité.
Picard Tos lès-omes vinèt å monde lîbes èt égåls po çou qu'èst d' leû dignité èt d' leûs dreûts. Leû re°zon èt leû consyince elzî fe°t on d'vwér di s'kidûre inte di zèle come dès frès
Walloon Tos lès-omes vinèt-st-å monde lîbes, èt so-l'minme pîd po çou qu'ènn'èst d'leu dignité èt d'leus dreûts. I n'sont nin foû rêzon èt-z-ont-i leû consyince po zèls, çou qu'èlzès deût miner a s'kidûre onk' po l'ôte tot come dès frés.
Friulian Ducj i oms a nassin libars e compagns come dignitât e derits. A an sintiment e cussience e bisugne che si tratin un culaltri come fradis.
Romansch Tuots umans naschan libers ed eguals in dignità e drets. Els sun dotats cun intellet e conscienza e dessan agir tanter per in uin spiert da fraternità.
Catalan-Valencian-Balear Tots els éssers humans neixen lliures i iguals en dignitat i en drets. Són dotats de raó i de consciència, i han de comportar-se fraternalment els uns amb els altres.
Asturian Tolos seres humanos nacen llibres y iguales en dignidá y drechos y, pola mor de la razón y la conciencia de so, han comportase hermaniblemente los unos colos otros.
Ladino Todos los umanos nasen libres i iguales en dinyidad i derechos i, komo estan ekipados de razon i konsensia, deven komportarsen kon ermandad los unos kon los otros.
Spanish Todos los seres humanos nacen libres e iguales en dignidad y derechos y, dotados como están de razón y conciencia, deben comportarse fraternalmente los unos con los otros.
Galician Tódolos seres humanos nacen libres e iguais en dignidade e dereitos e, dotados como están de razón e conciencia, díbense comportar fraternalmente uns cos outros.
Portuguese Todos os seres humanos nascem livres e iguais em dignidade e em direitos. Dotados de razão e de consciência, devem agir uns para com os outros em espírito de fraternidade.
Corsican Nascinu tutti l’omi libari è pari di dignità è di diritti. Pussedinu a raghjoni è a cuscenza è li tocca ad agiscia trà elli di modu fraternu.
Sardinian, Logudorese Totu sos èsseres umanos naschint lìberos e eguales in dinnidade e in deretos. Issos tenent sa resone e sa cussèntzia e depent operare s'unu cun s'àteru cun ispìritu de fraternidade.


This table summarizes the vulnerability of various Romance languages. Vulnerability data is derived from Ethnologue and from the Atlas of the World's Languages in Danger (© UNESCO).

Language ISO 639-3 Location Speakers Status (Ethnologue) Status (UNESCO)
Zarphatic zrp France 0 10 (Extinct) -
Shuadit sdt France 0 10 (Extinct) -
Emilian egl Italy 0 9 (Dormant) -
Romagnol rgn Italy 0 9 (Dormant) -
Minderico drc Portugal 500 8b (Nearly extinct) -
Judeo-Italian itk Italy 250 8a (Moribund) -
Arpitan frp France & Italy 137,000 8a (Moribund) 2 (Definitely endangered)
Romanian, Istro ruo Croatia 560 7 (Shifting) 3 (Severely endangered)
Istriot ist Croatia 1,000 7 (Shifting) 3 (Severely endangered)
Romanian, Megleno ruq Greece, Macedonia 5,000 7 (Shifting) 3 (Severely endangered)
French, Cajun frc United States 25,600 7 (Shifting) -
Extremaduran ext Spain 201,500 7 (Shifting) -
Aragonese arg Spain 10,000 6b (Threatened) 2 (Definitely endangered)
Ladin lld Italy 20,000 6b (Threatened) 2 (Definitely endangered)
Sardinian, Gallurese sdn Italy 100,000 6b (Threatened) 2 (Definitely endangered)
Sardinian, Sassarese sdc Italy 100,000 6b (Threatened) 2 (Definitely endangered)
Asturian ast Spain 110,000 6b (Threatened) -
Aromanian rup Albania, Bulgaria, Greece, Macedonia, Serbia 123,300 6b (Threatened) 2 (Definitely endangered)
Sardinian, Logudorese src Italy 500,000 6b (Threatened) 2 (Definitely endangered)
Walloon wln Belgium, France, Luxembourg 600,000 6b (Threatened) 2 (Definitely endangered)
Spanish, Loreto-Ucayali spq Peru 2,800 6a (Vigorous) -
Fala fax Spain 10,500 6a (Vigorous) -
Sardinian, Campidanese sro Italy 500,000 6a (Vigorous) 2 (Definitely endangered)
Corsican cos France, Italy 31,000 5 (Developing) 2 (Definitely endangered)
Picard pcd Belgium, France 200,000 5 (Developing) 3 (Severely endangered)
Friulian fur Italy 300,000 5 (Developing) 2 (Definitely endangered)
Ligurian lij France, Italy, Monaco 505,100 5 (Developing) 2 (Definitely endangered)
Piemontese pms Italy 1,600,000 5 (Developing) 2 (Definitely endangered)
Lombard lmo Italy 3,903,000 5 (Developing) 2 (Definitely endangered)
Sicilian scn Italy 4,700,000 5 (Developing) 1 (Vulnerable)
Napoletano-Calabrese nap Italy 5,700,000 5 (Developing) 1 (Vulnerable)
Romansch roh Switzerland 35,139 4 (Educational) 2 (Definitely endangered)
Ladino lad Israel & Albania, Algeria, Bosnia and Herzegovina, Bulgaria, Croatia, Greece, Macedonia, Morocco, Romania, Turkey, Serbia 112,130 4 (Educational) 3 (Severely endangered)
Occitan oci France, Italy 2,048,310 4 (Educational) 2 (Definitely endangered)
Venetian vec Croatia, Italy, Slovenia 3,852,500 4 (Educational) 1 (Vulnerable)
Mirandese mwl Portugal 15,000 2 (Provincial) -
Galician glg Spain 3,185,000 2 (Provincial) -
Catalan cat Spain & Italy 7,220,420 2 (Provincial) 2 (Definitely endangered)
Romanian ron Romania 23,623,890 1 (National) -
Italian ita Italy 61,068,677 1 (National) -
French fra France 68,458,600 1 (National) 3 (Severely endangered)
Portuguese por Portugal 202,468,100 1 (National) -
Spanish spa Spain 405,638,110 1 (National) -
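The rows above are space-separated, so names and locations that contain spaces make naive splitting unreliable. A minimal sketch of one way to parse them is shown here, anchoring on the three-letter ISO 639-3 code and on the status columns. The regular expression, the function name and the threshold used in the filter are assumptions chosen for this example. The four sample rows are copied from the table.

```python
import re

# name, ISO 639-3 code, location, speakers, Ethnologue level and label, UNESCO status
ROW = re.compile(
    r"^(?P<name>.+?) (?P<iso>[a-z]{3}) (?P<location>.+?) "
    r"(?P<speakers>[\d,]+) (?P<level>\d+[ab]?) \((?P<label>[^)]+)\) "
    r"(?P<unesco>-|\d \([^)]+\))$"
)

ROWS = [
    "Arpitan frp France & Italy 137,000 8a (Moribund) 2 (Definitely endangered)",
    "Romanian, Istro ruo Croatia 560 7 (Shifting) 3 (Severely endangered)",
    "Asturian ast Spain 110,000 6b (Threatened) -",
    "Sicilian scn Italy 4,700,000 5 (Developing) 1 (Vulnerable)",
]


def parse_row(line: str) -> dict:
    """Split one table row into named fields; speakers becomes an int."""
    match = ROW.match(line)
    if match is None:
        raise ValueError(f"unrecognised row: {line!r}")
    record = match.groupdict()
    record["speakers"] = int(record["speakers"].replace(",", ""))
    return record


records = [parse_row(row) for row in ROWS]

# Languages at Ethnologue level 6 or higher (6a, 6b, 7, 8a, ...).
at_risk = [r["iso"] for r in records if int(r["level"].rstrip("ab")) >= 6]
print(at_risk)
```

Once parsed, the records can be sorted or filtered by speaker count or status, for example to shortlist languages for new pairs.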

Other language pairs

Pairs including a non-Romance language


Funding possibilities


See also