Latest revision as of 04:41, 9 March 2018
Maltese and Hebrew
clone with:
git clone https://github.com/apertium/apertium-mlt-heb.git
Todo list
- Make program to generate a full form list for a given Maltese verb stem (done)
- Add closed categories to the Maltese analyser (prepositions, conjunctions, pronouns, determiners, numerals) (done)
- Add closed categories to the bidix (done)
- Add closed categories to the Hebrew dictionary (prepositions, pronouns, determiners, numerals, conjunctions) (done)
- Generate Hebrew verb entries from hspell output in Apertium format (done)
- Add existing verbs to the bidix (done)
- Look at how attached/clitic pronouns can be treated (Spanish, Catalan, Italian have similar requirements) (done)
- Add high frequency verbs to the full form list / Maltese analyser (done)
- Add high frequency nouns, adjectives, adverbs to Maltese analyser (done)
- Align Maltese and Hebrew bibles
- Add attached articles to all noun entries -- not 100% sure how this works, see for ex. <e lm="הולדת"><par n="ה"/><i>הולדת</i><par n="עת__n_f"/></e> (done)
- Wiktionary script: 1) Fix URL encoding 2) Output English gloss as a comment (done)
- Change mt_verbs.py to generate negation circumfixes for all verbs: ma(--)x or m'(--)x generates <neg> (done)
- mt_verbs.py: Ability to set alternative forms with a direction restriction, e.g. sp['past.p1.sg'] = ('inkun', 'LR') (done)
- Possessive suffixes on Maltese nouns. p.100 "Teach yourself"
- This has been started, see:
Completed paradigms above this line
- Find out if all nouns can take these (in theory), or only certain classes of nouns.
- This has been started, see:
- Write a script that scans the corpus for adjective/noun pairs, looking for gender agreement errors.
- Many verbs with the j- prefix can also take an i- prefix or a ø- (null) prefix; this should be included in each of the class files (e.g. strong_double_mid_radical.py).
- Rule in verb prefixes for i- and j-
- Past participles have gender/number -- like adjectives
- Fix global generation of negative forms: bela'x should be belagħx (depends on root), etc.
- 619 noun forms with <GD>, 81 with the tag <ND> -- find the gender and number
- 38 adjective forms with <GD> -- find the gender.
- Hebrew overgeneration of nouns - caused by my wrong handling of plural noun forms. Figure out what the source of this is - is it hspell? (contact hspell people). Is it me? (regenerate from hspell and figure out whether to fix it)
Midterm sprint
Day | Date | Target | Achieved | Cov. WP | Cov. KPS | Notes |
---|---|---|---|---|---|---|
1 | 26th June | 250 | 154 | 53.95% | | First batch (top 120) of adjectives from Unhammer's adj_suspected.txt |
2 | 27th June | 250 | 411 | 55.19% | 52.15% | Batches 2,3,4 from same list (next 440 items) |
3 | 28th June | 250 | 210 | 55.38% | | Rest of adj_suspected-ku-ka-ci.txt, adj_suspected.txt |
4 | 29th June | 250 | 218 | 60.59% | | Nouns from Wiktionary script, some high frequency words (det, prep) |
5 | 30th June | 250 | 248 | 63.20% | 58.82% | Some nouns, adjectives, adverbs |
6 | 1st July | 250 | | 64.38% | 60.07% | |
7 | 2nd July | 250 | | 67.65% | 66.23% | |
8 | 3rd July | 250 | | 70.88% | 70.02% | |
9 | 4th July | 250 | | — | — | |
10 | 5th July | 250 | | 72.13% | 71.34% | |
11 | 6th July | 250 | | 72.62% | 72.71% | |
12 | 7th July | 250 | | 73.85% | 74.04% | |
13 | 8th July | 250 | | 74.32% | 74.50% | |
14 | 9th July | 250 | | 75.05% | 75.34% | |
15 | 10th July | 250 | | 76.76% | 77.69% | |
16 | 11th July | 250 | | — | — | |
17 | 12th July | 250 | | 77.50% | 78.40% | |
18 | 13th July | 250 | | 78.16% | 79.12% | |
19 | 14th July | 250 | | | | |
20 | 15th July | 250 | | | | |
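This page does not say how the coverage columns were computed. A common measure is naive coverage: the share of corpus tokens that get at least one analysis. The sketch below assumes analyser output in the Apertium stream format, where an unknown word appears as `^word/*word$`. The function name, the sample string and the tags in it are illustrative only.

```python
import re

# One lexical unit in the stream format: ^surface/analysis1/analysis2$
LEXICAL_UNIT = re.compile(r"\^([^/$]+)/([^$]*)\$")


def naive_coverage(analyser_output):
    """Return the fraction of tokens that received at least one analysis."""
    units = LEXICAL_UNIT.findall(analyser_output)
    if not units:
        return 0.0
    # Unknown words are marked with a leading asterisk in the analysis field.
    known = sum(1 for _surface, analyses in units if not analyses.startswith("*"))
    return known / len(units)


# Hypothetical three-token sample: two known words and one unknown word.
sample = "^il/il<det><def><sp>$ ^kelb/kelb<n><m><sg>$ ^xyzzy/*xyzzy$"
print(f"{naive_coverage(sample):.2%}")  # 66.67%
```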
Maltese verbs
Maltese has no infinitive. The stem (citation form) is the third person singular masculine of the perfect tense.
A second verb cannot appear as an infinitive either; instead, both verbs are conjugated: "I want to eat" = "I want I eat".
A verbal stem can consist of:
- Three consonants (radicals) with the medial radical between one of six vowel combinations. (Triliteral)
- kiteb
- Four consonants, some having two repeated biradical bases. (Quadriliteral)
- Two consonants, or a consonant and a semivowel
Verbs with 'għ' or 'j' as the third radical show only the first two radicals in the stem, which ends in 'a' (an open syllable).
- Verbs that have three non-semivocalic consonants are called sound or strong verbs.
- Verbs that have three radicals, with the last radical being 'għ' or 'j', are called defective ('għ') or weak ('j') verbs.
- Triliteral verbs with a long 'a' or 'ie' between the first and third radicals (the middle radical is 'w' or 'j') are called hollow verbs.
- Triliteral verbs where the second and third radicals are the same are called doubled or geminated verbs.
Examples:
Type | Example | Cons | Vowel config | Translation |
---|---|---|---|---|
Sound (Tri) | ħareġ | ħ·r·ġ | 2. a·e | he went out |
Sound (Quad) | ħarbex | ħ·rb·x | 2. a·e | he scribbled |
Defective | qata' | q·t·għ | 1. a·a | he cut |
Weak | mexa | m·x·j | 4. e·a | he walked |
Hollow | qal | q·w·l | 1. a·a | he said |
Hollow | sab | s·j·b | 1. a·a | he found |
Doubled | ħabb | ħ·b·b | 1. a·a | he loved |
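The classes in the table above can be read off the radicals in the Cons column. The following is a minimal sketch of that decision procedure, not the logic of mt_verbs.py. The function name is hypothetical, and roots that end in 'w' or otherwise fall outside the table are assumed to count as sound.

```python
SEMIVOWELS = {"w", "j"}


def classify_root(radicals):
    """Classify a verb root given as a tuple of radicals, e.g. ("q", "w", "l")."""
    if len(radicals) == 4:
        return "sound (quadriliteral)"
    if len(radicals) != 3:
        raise ValueError("expected three or four radicals")
    _first, second, third = radicals
    if third == "għ":
        return "defective"
    if third == "j":
        return "weak"
    if second in SEMIVOWELS:
        return "hollow"
    if second == third:
        return "doubled"
    return "sound (triliteral)"


print(classify_root(("q", "w", "l")))   # hollow
print(classify_root(("q", "t", "għ")))  # defective
```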
Tenses:
- Perfect: Action in the past
- seraq "he robbed"
- Imperfect: Action in the present/future
- jisraq "he steals" or "he will steal"
- Imperative: Order/command
- israq (sg), isirqu (pl) "steal!"
- Present participle: Only from intransitive verbs, and some verbs of motion. Has both verbal/adjectival function. Has m/f/pl
- nieżel (m.sg) "descending"
- nieżla (f.sg) "descending"
- neżlin (mf.pl) "descending"
- Past participle: Has both verbal/adjectival function. Has m/f/pl
- misruq (m.sg) "stolen"
- misruqa (f.sg) "stolen"
- misruqin (mf.pl) "stolen"
- Verbal noun
- serq "robbing", "theft"
Vowel patterns:
- KaTaB
- KaTeB
- KeTeB
- KiTeB
- KoToB
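In these patterns the capitals K, T and B stand for the first, second and third radical, and the lower-case vowels are the vowel configuration. A minimal sketch of filling a pattern with a root follows; the function name is hypothetical, and it handles only the plain triliteral stem form.

```python
def apply_pattern(radicals, pattern):
    """Fill the K, T and B slots of a pattern with the three radicals of a root."""
    if len(radicals) != 3:
        raise ValueError("expected exactly three radicals")
    slots = dict(zip("KTB", radicals))
    # Slot letters are replaced by radicals; vowels are copied unchanged.
    return "".join(slots.get(ch, ch) for ch in pattern)


print(apply_pattern(("k", "t", "b"), "KiTeB"))  # kiteb
print(apply_pattern(("ħ", "r", "ġ"), "KaTeB"))  # ħareġ
```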
Pronominal Suffixes
Pronouns, prepositions etc. can be dropped in favor of complex suffixes added to the verb.
For example:
iktbilha = ikteb + il + ha
write(imp,p2,sg) + to + her
See The Verb with Pronominal Suffixes for thorough documentation.
Resources on verbs
See also: Maltese
Numerals
We have all Hebrew cardinal numbers and number construction defined in he.dix (still pending tests) (relevant definitions).
For Maltese, we have the cardinal numbers defined ([1]) along with some basic construction rules, but complex numbers are not ready yet (smart paradigms required).
Maltese numeral construction examples:
- 1..9, 11..19 - named numbers.
- 21..99 - "one and thirty" for 31 (wieħed u tletin).
- 100 and up - "three hundred and three and seventy" for 373 (tliet mija u tlieta u sebgħin).
- Same for thousands - "thousand nine hundred and five and sixty" for 1965 (elf disa' mija u ħamsa u sittin).
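The word order in the examples above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the construction rules in the dictionaries. The lexicon holds only the words quoted above, the function name is hypothetical, and the joining rule is inferred from the examples alone: thousands and hundreds are juxtaposed, and the unit and the ten are each joined with "u".

```python
# Tiny lexicon: only the words quoted in the examples above.
UNITS = {1: "wieħed", 3: "tlieta", 5: "ħamsa"}
TENS = {30: "tletin", 60: "sittin", 70: "sebgħin"}
HUNDREDS = {300: "tliet mija", 900: "disa' mija"}
THOUSANDS = {1000: "elf"}


def maltese_numeral(n):
    """Compose a numeral in the order thousands, hundreds, unit, ten."""
    thousands = n // 1000 * 1000
    hundreds = n % 1000 // 100 * 100
    tens = n % 100 // 10 * 10
    units = n % 10
    if tens < 20 or units == 0:
        raise ValueError("only numbers ending in a unit-and-ten pair are sketched")
    try:
        head = []
        if thousands:
            head.append(THOUSANDS[thousands])
        if hundreds:
            head.append(HUNDREDS[hundreds])
        tail = [UNITS[units], TENS[tens]]
    except KeyError as missing:
        raise ValueError(f"no lexicon entry for {missing}") from None
    # Thousands and hundreds are juxtaposed; the rest is joined by "u" (and).
    pieces = ([" ".join(head)] if head else []) + tail
    return " u ".join(pieces)


print(maltese_numeral(31))    # wieħed u tletin
print(maltese_numeral(1965))  # elf disa' mija u ħamsa u sittin
```

Numbers whose parts are missing from the lexicon raise a ValueError instead of guessing a form.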
External resources
- w3dictionary, a wordnet?
- MyMemory.translated.net, search TMX'es
- Open-Tran.eu, search FOSS translations
- mt Wiktionary
- Old, scanned mt-en dictionary
See also
- Maltese morphological analyser web demo, using the resources of this language pair