Grfro3d/proposal apertium cat-srd and ita-srd

Contact Information

Name: Gianfranco Fronteddu

Location: Casteddu, Sardigna

E-mail: gfro3d@gmail.com

IRC: gianfranco

SourceForge: gfro3d

Telegram: gianfro4moros

Skype: gianfranco.fronteddu88

Why is it you are interested in machine translation?

I am a translation student and have been fascinated by computational linguistics throughout my university studies. We approached this field through the courses "Theory and techniques of translating" and "Applied linguistics" at the University of Cagliari. As a translation aid, machine translation (MT) has two main uses. The first is "assimilation": using MT to understand the general meaning of a text in a foreign language. The second is "dissemination", in which MT is an intermediate step in producing a document in the target language (TL) that will then be published. To facilitate this process, it is usual to adopt controlled languages, that is, to restrict sentences to structures that are not too complex.

Another important aspect is free/open-source software, which can be used for any purpose, examined, and adapted to create new applications. Open-source software can therefore be redistributed and improved. Open-source rule-based machine translation (RBMT) is very useful for a language because it leads to the creation of linguistic data such as morphological dictionaries, bilingual dictionaries, grammars, and structural transfer rule files. An RBMT system consists of an engine (coding and decoding), linguistic data, and support tools that convert the data and make them compatible with the engine. Most RBMT systems are proprietary and were created for commercial purposes, whereas open-source RBMT makes it possible both to use the MT engine and to modify the code and change the rules. Finally, the main advantage of creating an RBMT system is the growth in linguistic resources: the information collected to develop a machine translation system is easily reusable in other projects and related technologies. In this way, MT can become a real support, especially for minoritised languages in danger of extinction.

Why is it that you are interested in Apertium?

The fact that Apertium is an open-source project means that anyone can contribute to its development. This raises an interesting point about the involvement of minoritised language communities. As a speaker of a minoritised language, Sardinian, I would like to contribute so that my language becomes part of the language combinations offered by this tool.

Sardinian is a Romance language, derived from Latin, spoken by approximately one million people on the island of Sardinia. According to Ethnologue, the Sardinian language is unfortunately in danger of extinction; see also the UNESCO Atlas of the World's Languages in Danger (http://www.unesco.org/languages-atlas/index.php). Linguistic fragmentation and the differences between the various dialects have led to a gradual abandonment of Sardinian in favour of the national language, Italian. It survives as the primary language only in some areas of Sardinia, for example the central ones.

The Limba Sarda Comuna (LSC) has been proposed as the standard form for all varieties of Sardinian. It is an evolved version of the Limba Sarda Unificada (LSU), which was in turn the result of an experts' committee called by the Sardinian government in 2001. In 2006, the Sardinian government adopted the LSC as a co-official language for the publication of official documents. The LSC is also the form chosen by several publishing houses, journals and online sites.

Other Romance languages are also spoken in Sardinia: Tabarchino Ligurian (in the islands of San Pé and Sant'Antióccu), Algherese Catalan (in the city of L'Alguer), Sassarese (in the city of Sassari) and Gallurese Corsican (in Gaddùra). Sardinian and the other minoritised languages of Sardinia are recognised by regional law n. 26 of 15 October 1997 [1] and by the Italian constitution (Article 6: "La Repubblica tutela con apposite norme le minoranze linguistiche", "The Republic safeguards linguistic minorities by means of appropriate measures"). Law n. 482 of 15 December 1999, "Rules on protection of historical linguistic minorities" (http://www.camera.it/parlam/leggi/99482l.html), makes it possible for regional governments to use local languages at school.

Catalan is spoken in the Sardinian city of Alghero by about 33,000 speakers, 8,600 active and 25,000 passive. According to a study by the Generalitat de Catalunya, Catalan is understood by 60% of the population of Alghero but spoken by only 20%. The dominant presence of Catalan in Alghero dates back to the 14th century, when the Catalan-Aragonese expelled the Sardinian population. Catalan later assumed a position of prestige in Sardinia. In 1952, Rafael Sari founded the Centre d'Estudis Algueresos for the dissemination and teaching of the Catalan language standards in Sardinia. Among the important people who directed the institute are Rafael Catardi and Antoni Simon Mossa. The "Escola de Alguerés Pascual Scanu" was founded by Josep Sanna and offers courses in Catalan language and literature. Among the most important magazines is L'Alguer, published only in Catalan.

I am interested in Apertium because it is an open-source platform and because it is well suited to closely related Romance languages. The cat-srd and ita-srd pairs meet this requirement perfectly. These languages have influenced one another: Sardinian still shows the influence of Catalan in many respects, a result of their coexistence during the Catalan-Aragonese occupation of Sardinia. The same goes for the language pair co-eng and, in this case, the similarity is greater. All these languages are part of the Sardinian linguistic heritage.

This project would give speakers a valuable tool to improve their skills in the standard language and to build new bridges between Sardinia and other linguistic and socio-political realities.