Maltese and Hebrew
From Apertium
Check out with:
svn co https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-mt-he
Todo list
- Make program to generate a full form list for a given Maltese verb stem
- Add closed categories to the Maltese analyser (prepositions, conjunctions, pronouns, determiners, numerals)
- Add closed categories to the bidix
- Add closed categories to the Hebrew dictionary (prepositions, pronouns, determiners, numerals, conjunctions)
- Generate Hebrew verb entries from hspell output in Apertium format
- Add existing verbs to the bidix
- Look at how attached/clitic pronouns can be treated (Spanish, Catalan, Italian have similar requirements)
- Add high frequency verbs to the full form list / Maltese analyser
- Add high frequency nouns, adjectives, adverbs to Maltese analyser
- Align Maltese and Hebrew bibles
- Add attached articles to all noun entries -- not 100% sure how this works, see for ex. <e lm="הולדת"><par n="ה"/><i>הולדת</i><par n="עת__n_f"/></e>
- Wiktionary script: 1) Fix URL encoding 2) Output English gloss as a comment
- Change mt_verbs.py to generate negation circumfixes for all verbs: ma(--)x or m'(--)x generates <neg>
- mt_verbs.py: Ability to set alternative forms with a direction restriction, e.g. sp['past.p1.sg'] = ('inkun', 'LR')
- Possessive suffixes on Maltese nouns. p.100 "Teach yourself"
- This has been started, see: Completed paradigms above this line
- Find out if all nouns can take these (in theory), or only certain classes of nouns.
- Write a script that scans the corpus for adjective/noun pairs, looking for gender agreement errors.
- Many verbs with the j- prefix can also take an i- prefix or a ø- null prefix; this should be included in each of the class files (e.g. strong_double_mid_radical.py).
- Rule in verb prefixes for i- and j-
- Past participles have gender/number -- like adjectives
- Fix global generation of negative forms: bela'x should be belagħx (depends on root), etc.
- 619 noun forms with <GD>, 81 with the tag <ND> -- find the gender and number
- 38 adjective forms with <GD> -- find the gender.
- Hebrew overgeneration of nouns - caused by my wrong handling of plural noun forms. Figure out what the source of this is - is it hspell? (contact hspell people). Is it me? (regenerate from hspell and figure out whether to fix it)
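The direction-restricted alternative forms mentioned for mt_verbs.py in the list above could be turned into dictionary entries roughly as follows. This is only a sketch: the function name `dix_entry`, the lemma `kien`, the `vblex` part-of-speech tag, and the idea of splitting keys such as `past.p1.sg` into one `<s>` symbol each are assumptions made for illustration, not the actual behaviour of mt_verbs.py.

```python
from xml.sax.saxutils import escape


def dix_entry(lemma, key, value, pos="vblex"):
    """Render one form as an Apertium-style <e> element (illustrative only)."""
    # A value is either a plain form or a (form, direction) pair.
    form, direction = value if isinstance(value, tuple) else (value, None)
    # Assumption: each dot-separated part of the key becomes one <s> symbol.
    tags = "".join(f'<s n="{tag}"/>' for tag in [pos] + key.split("."))
    # "LR" / "RL" restrict the entry to one direction.
    restriction = f' r="{direction}"' if direction else ""
    return (f"<e{restriction}><p><l>{escape(form)}</l>"
            f"<r>{escape(lemma)}{tags}</r></p></e>")


# The example from the todo list; the lemma is a placeholder chosen here.
sp = {"past.p1.sg": ("inkun", "LR")}
for key, value in sp.items():
    print(dix_entry("kien", key, value))
```

Keeping the restriction next to the form means a form with no restriction stays a plain string, so existing paradigm tables would not need to change.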
Midterm sprint
| Day | Date | Target | Achieved | Cov. WP | Cov. KPS | Notes |
|---|---|---|---|---|---|---|
| 1 | 26th June | 250 | 154 | 53.95% | | First batch (top 120) of adjectives from Unhammer's adj_suspected.txt |
| 2 | 27th June | 250 | 411 | 55.19% | 52.15% | Batches 2,3,4 from same list (next 440 items) |
| 3 | 28th June | 250 | 210 | 55.38% | | Rest of adj_suspected-ku-ka-ci.txt, adj_suspected.txt |
| 4 | 29th June | 250 | 218 | 60.59% | | Nouns from Wiktionary script, some high frequency words (det, prep) |
| 5 | 30th June | 250 | 248 | 63.20% | 58.82% | Some nouns, adjectives, adverbs |
| 6 | 1st July | 250 | | 64.38% | 60.07% | |
| 7 | 2nd July | 250 | | 67.65% | 66.23% | |
| 8 | 3rd July | 250 | | 70.88% | 70.02% | |
| 9 | 4th July | 250 | | - | - | |
| 10 | 5th July | 250 | | 72.13% | 71.34% | |
| 11 | 6th July | 250 | | 72.62% | 72.71% | |
| 12 | 7th July | 250 | | 73.85% | 74.04% | |
| 13 | 8th July | 250 | | 74.32% | 74.50% | |
| 14 | 9th July | 250 | | 75.05% | 75.34% | |
| 15 | 10th July | 250 | | 76.76% | 77.69% | |
| 16 | 11th July | 250 | | - | - | |
| 17 | 12th July | 250 | | 77.50% | 78.40% | |
| 18 | 13th July | 250 | | 78.16% | 79.12% | |
| 19 | 14th July | 250 | | | | |
| 20 | 15th July | 250 | | | | |
Maltese verbs
Maltese has no infinitive. The stem (citation form) is the third person singular masculine of the perfect tense.
Because there is no infinitive, a second verb is conjugated as well: "I want to eat" is expressed as "I want I eat".
A verbal stem can consist of:
- Three consonants (radicals) with the medial radical between one of six vowel combinations. (Triliteral)
- kiteb
- Four consonants, some having two repeated biradical bases. (Quadriliteral)
- Two consonants, or a consonant and a semivowel
In verbs with 'għ' or 'j' as the third radical, only the first two radicals appear in the stem word, which ends in 'a' (open syllable).
- Verbs that have three non-semivocalic consonants are called sound or strong verbs.
- Verbs that have three radicals, with the last radical being 'għ' or 'j' are called defective or weak verbs.
- Triliteral verbs with long 'a' or 'ie' between 1st and 2nd radicals are called hollow verbs.
- Triliteral verbs where the second and third radicals are the same are called doubled or geminated verbs.
Examples:
| Type | Example | Cons | Vowel config | Translation |
|---|---|---|---|---|
| Sound (Tri) | ħareġ | ħ·r·ġ | 2. a·e | he went out |
| Sound (Quad) | ħarbex | ħ·rb·x | 2. a·e | he scribbled |
| Defective | qata' | q·t·għ | 1. a·a | he cut |
| Weak | mexa | m·x·j | 4. e·a | he walked |
| Hollow | qal | q·w·l | 1. a·a | he said |
| Hollow | sab | s·j·b | 1. a·a | he found |
| Doubled | ħabb | ħ·b·b | 1. a·a | he loved |
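The categories in the table can be read as a small decision procedure over the radicals. The sketch below is one possible reading, not an existing Apertium tool: the function name `classify` is made up, the labels follow the table (which separates "defective" for a final 'għ' from "weak" for a final 'j'), and treating a medial 'w' or 'j' as the sign of a hollow verb is an assumption taken from the two hollow rows.

```python
SEMIVOWELS = {"w", "j"}


def classify(radicals):
    """Guess the verb type from its radicals, following the example table."""
    radicals = tuple(radicals)
    if len(radicals) == 4:
        return "sound (quad)"
    if len(radicals) != 3:
        raise ValueError("expected three or four radicals")
    _first, second, third = radicals
    if third == "għ":
        return "defective"
    if third == "j":
        return "weak"
    if second in SEMIVOWELS:
        # Assumption: hollow verbs are those with a semivowel as medial radical.
        return "hollow"
    if second == third:
        return "doubled"
    return "sound (tri)"


for root in [("ħ", "r", "ġ"), ("q", "t", "għ"), ("m", "x", "j"), ("q", "w", "l")]:
    print("·".join(root), classify(root))
```

Radicals are passed as separate strings so that the digraph 'għ' counts as a single radical.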
Tenses:
- Perfect: Action in the past
- seraq "he robbed"
- Imperfect: Action in the present/future
- jisraq "he steals" or "he will steal"
- Imperative: Order/command
- israq (sg), isirqu (pl) "steal!"
- Present participle: Only from intransitive verbs, and some verbs of motion. Has both verbal/adjectival function. Has m/f/pl
- nieżel (m.sg) "descending"
- nieżla (f.sg) "descending"
- neżlin (mf.pl) "descending"
- Past participle: Has both verbal/adjectival function. Has m/f/pl
- misruq (m.sg) "stolen"
- misruqa (f.sg) "stolen"
- misruqin (mf.pl) "stolen"
- Verbal noun
- serq "robbing", "theft"
Vowel patterns:
- KaTaB
- KaTeB
- KeTeB
- KiTeB
- KoToB
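In these patterns the capitals K, T and B stand for the first, second and third radical, and the lowercase letters are the vowels. A minimal sketch of filling a pattern with a root, assuming a sound triliteral verb (the helper name `apply_pattern` is invented here, and weak, hollow and doubled verbs would need extra handling):

```python
def apply_pattern(pattern, radicals):
    """Fill the K, T, B slots of a vowel pattern with the three radicals."""
    if len(radicals) != 3:
        raise ValueError("expected exactly three radicals")
    slots = dict(zip("KTB", radicals))
    # Anything that is not a slot letter (the vowels) is copied unchanged.
    return "".join(slots.get(ch, ch) for ch in pattern)


print(apply_pattern("KiTeB", ("k", "t", "b")))
print(apply_pattern("KaTeB", ("ħ", "r", "ġ")))
```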
Pronominal Suffixes
Pronouns, prepositions, etc. can be dropped in favor of complex suffixes added to the verb.
For example:
iktbilha = ikteb + il + ha
write(imp,p2,sg) + to + her
See The Verb with Pronominal Suffixes for thorough documentation.
Resources on verbs
See also: Maltese
Numerals
We have all Hebrew cardinal numbers and number construction defined in he.dix (still pending tests) (relevant definitions).
For Maltese, we have the cardinal numbers defined, along with some basic construction rules, but complex numbers are not ready yet (smart paradigms required).
Maltese numeral construction examples:
- 1..9, 11..19 - named numbers.
- 21..99 - "one and thirty" for 31 (wieħed u tletin).
- 100 and up - "three hundred and three and seventy" for 373 (tliet mija u tlieta u sebgħin).
- Same for thousands - "thousand nine hundred and five and sixty" for 1965 (elf disa' mija u ħamsa u sittin).
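The construction order described above (units before tens, joined with "u") can be sketched in a few lines. This is a toy under stated assumptions, not the dictionary's smart paradigms: the lexicon contains only the word forms that occur in the examples, the function name `compose` is made up, and the joining rule (thousands and hundreds separated by a space, then "u" before the units and tens) is generalized from those three examples only.

```python
# Toy lexicon: only forms attested in the examples above, keyed by digit.
UNITS = {1: "wieħed", 3: "tlieta", 5: "ħamsa"}
TENS = {3: "tletin", 6: "sittin", 7: "sebgħin"}
HUNDREDS = {3: "tliet mija", 9: "disa' mija"}
THOUSANDS = {1: "elf"}


def _word(table, digit, label):
    try:
        return table[digit]
    except KeyError:
        raise ValueError(f"no {label} form for {digit} in this toy lexicon") from None


def compose(n):
    """Build a numeral phrase with the units placed before the tens."""
    if not 0 < n < 10000:
        raise ValueError("only 1..9999 are handled in this sketch")
    thousands, rest = divmod(n, 1000)
    hundreds, rest = divmod(rest, 100)
    tens, units = divmod(rest, 10)
    if tens == 1:
        raise ValueError("the teens are named numbers and are not composed here")
    high = []
    if thousands:
        high.append(_word(THOUSANDS, thousands, "thousands"))
    if hundreds:
        high.append(_word(HUNDREDS, hundreds, "hundreds"))
    low = []
    if units:
        low.append(_word(UNITS, units, "units"))
    if tens:
        low.append(_word(TENS, tens, "tens"))
    parts = [" ".join(high), " u ".join(low)]
    return " u ".join(p for p in parts if p)


for n in (31, 373, 1965):
    print(n, compose(n))
```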
External resources
- w3dictionary, a wordnet?
- MyMemory.translated.net, search TMX'es
- Open-Tran.eu, search FOSS translations
- mt Wiktionary
- Old, scanned mt-en dictionary
See also
- Maltese morphological analyser web demo, using the resources of this language pair

