Siciliano y castellano/Informe final


Revision as of 14:57, 22 August 2016

Commits

The list of all commits: https://apertium.projectjj.com/gsoc2016/uliana-sentsova.html

Monolingual Sicilian dictionary:

Bilingual Sicilian-Spanish dictionary: https://svn.code.sf.net/p/apertium/svn/incubator/apertium-scn-spa/


Description

1. Sicilian language TODO

2. Project goals

TODO

corpus, coverage, testvoc, pending tests

Constraint grammar


Constraint Grammar rules allow us to distinguish words with different grammatical tags and words with different lexical meanings based on the grammatical and lexical context. CG rules work both for disambiguation within one part of speech and between words of different categories.

The following cases were handled with CG rules in the Sicilian package.

  • Disambiguation within one part of speech. Verb forms within a single paradigm coincide fairly often in Sicilian. For instance, all Sicilian verbs have identical first, second and third person forms in the Present Subjunctive. Regular verbs of the second conjugation have the same form for the first and second person of the Present Indicative. For verbs of the first conjugation, the third person singular of the Present Indicative usually coincides with the second person plural of the Imperative.
  • Disambiguation between words of different categories. Since "-a", "-i" and "-u" are standard endings for Sicilian nouns, adjectives and verb forms, there are many more ambiguous word forms in Sicilian than one might expect. For example, the masculine noun "munnu" coincides with the Present Indicative form "munnu" of the verb "munnari".

Conversion as a means of word formation is also frequent.


A very specific case is the coincidence of prepositions and relative pronouns. See the pending tests (http://wiki.apertium.org/wiki/Siciliano_y_castellano/Pending_tests#Frases_relativas) for the set of CG rules.


A good example of lexical ambiguity is the Sicilian noun "cristianu", which not only signifies a person of Christian faith but can also denote a human being in general.
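
The category disambiguation described above can be illustrated with a small Python sketch. It is not the CG-3 rule used in the Sicilian package; it only mimics the effect of a SELECT-style rule on the "munnu" example. The tag names, the article "lu" and the function name select_noun_after_determiner are assumptions made for illustration.

```python
from typing import List, Tuple

# A reading is (lemma, tags); a cohort is (surface form, possible readings).
Reading = Tuple[str, Tuple[str, ...]]
Cohort = Tuple[str, List[Reading]]


def select_noun_after_determiner(cohorts: List[Cohort]) -> List[Cohort]:
    """Toy analogue of a CG SELECT rule (hypothetical, not the project's rule).

    If a cohort is ambiguous and the previous cohort is unambiguously a
    determiner, keep only the noun readings.
    """
    result: List[Cohort] = []
    for surface, readings in cohorts:
        kept = readings
        if result and len(readings) > 1:
            prev_readings = result[-1][1]
            prev_is_det = bool(prev_readings) and all(
                "det" in tags for _, tags in prev_readings
            )
            nouns = [r for r in readings if "n" in r[1]]
            if prev_is_det and nouns:
                kept = nouns
        result.append((surface, kept))
    return result


# "munnu" is ambiguous between a noun and a verb form (see the text above).
# The article "lu" and all tags are illustrative assumptions.
sentence: List[Cohort] = [
    ("lu", [("lu", ("det", "def", "m", "sg"))]),
    ("munnu", [("munnu", ("n", "m", "sg")), ("munnari", ("vblex", "pri"))]),
]
disambiguated = select_noun_after_determiner(sentence)
```

After the call, the cohort for "munnu" keeps only its noun reading; without a preceding determiner both readings would survive.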

Transfer rules

Transfer rules handle syntactic differences between the two languages that cannot be translated word for word. There are 40 transfer rules in total.

  • Unlike in Spanish, the synthetic future is no longer in use in Sicilian, so it is replaced by periphrastic compound forms with common verbs such as "jiri", "vèniri" or "aviri".
  • The synthetic conditional forms of verbs are normally replaced by indicative or subjunctive forms.
  • There are also transfer rules that translate verb constructions with passive and modal meaning.
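
The first rule in the list above can be sketched in Python as follows. This is not the actual transfer rule from the package (those are written in Apertium's transfer format); it only shows the idea of rewriting a synthetic future into a periphrasis. The tag names (fti, pri, inf, pr), the linking particle "a", the sample lemma "cantari" and the function name future_to_periphrasis are assumptions for illustration.

```python
from typing import List, Tuple

LexicalUnit = Tuple[str, List[str]]  # (lemma, tags)

# Tags carried over from the main verb to the auxiliary (assumed tag names).
PERSON_NUMBER = {"p1", "p2", "p3", "sg", "pl"}


def future_to_periphrasis(
    lemma: str, tags: List[str], auxiliary: str = "aviri"
) -> List[LexicalUnit]:
    """Rewrite a future-tense verb as auxiliary + "a" + infinitive.

    Verbs that are not tagged as future are returned unchanged.
    The choice of "aviri" plus the particle "a" is an assumption; the
    report also mentions "jiri" and "vèniri" as possible auxiliaries.
    """
    if "fti" not in tags:
        return [(lemma, tags)]
    agreement = [t for t in tags if t in PERSON_NUMBER]
    return [
        (auxiliary, ["vblex", "pri", *agreement]),  # auxiliary takes person/number
        ("a", ["pr"]),                              # linking particle (assumed)
        (lemma, ["vblex", "inf"]),                  # main verb as infinitive
    ]


periphrasis = future_to_periphrasis("cantari", ["vblex", "fti", "p1", "sg"])
```

The auxiliary is a parameter because the report names three candidate verbs without saying which one each rule uses.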


Statistics

Coverage            Sicilian-Spanish (%)    Spanish-Sicilian (%)
Trimmed coverage    83.4%                   %

Coverage            Sicilian (%)            Spanish (%)
Raw coverage        85.5%                   91.6%
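
The report does not say how these figures were computed. A common naive measure is the share of tokens that the analyser recognises, with unknown words marked by a leading asterisk as in Apertium output. The sketch below follows that assumption; the function name naive_coverage and the whitespace tokenisation are illustrative only.

```python
def naive_coverage(analysed_text: str) -> float:
    """Return the percentage of whitespace-separated tokens not marked unknown.

    Assumes unknown words carry a leading "*", and that splitting on
    whitespace is an acceptable approximation of tokenisation.
    """
    tokens = analysed_text.split()
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if not token.startswith("*"))
    return 100.0 * known / len(tokens)


# Placeholder tokens: three known words and one unknown word.
sample_coverage = naive_coverage("aaa bbb *ccc ddd")
```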

Number of lemmas in the bilingual dictionary: 11,253.

Number of lemmas in the Sicilian dictionary:


Challenging issues

1. Abundance of spelling forms

2. Accent system

3. Pronouns

Examples of spelling variants: jardinu / iardinu / giardinu = ‘garden’; palora / parola / paràula / palàura = ‘word’.

A single verb can likewise appear in many spellings: cunjùnciri, cognùngiri, conjùngiri, cugnùnciri, cognùncici, coniùngiri, conjùnciri.
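
One simple way to cope with such variants is a lookup table that maps every known spelling to a single reference form. The sketch below is only an illustration using the examples above, not the approach taken in the project. Picking the first form of each group as the reference is an arbitrary assumption, and the names build_variant_index and normalise are hypothetical.

```python
from typing import Dict, List

# Variant groups taken from the examples in the text.
VARIANT_GROUPS: List[List[str]] = [
    ["jardinu", "iardinu", "giardinu"],
    ["palora", "parola", "paràula", "palàura"],
]


def build_variant_index(groups: List[List[str]]) -> Dict[str, str]:
    """Map every spelling in a group to the group's first form."""
    index: Dict[str, str] = {}
    for group in groups:
        reference = group[0]  # arbitrary choice of reference spelling
        for form in group:
            index[form] = reference
    return index


def normalise(word: str, index: Dict[str, str]) -> str:
    """Return the reference spelling, or the word itself if it is not listed."""
    return index.get(word.lower(), word)


variant_index = build_variant_index(VARIANT_GROUPS)
```

Words that are not in the table pass through unchanged, so the lookup can be applied to running text without affecting unrelated forms.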

TODO

Future work

Syntactic properties, more rules, automatic forms merge algorithm TODO

Resources

https://scn.wikipedia.org/wiki/P%C3%A0ggina_principali

https://scn.wiktionary.org/wiki/P%C3%A0ggina_principali

Bonner, Introduction to Sicilian Grammar

El nuovo dizionario siciliano-italiano