Basque and Spanish

From Apertium
Revision as of 07:44, 19 June 2007 by Francis Tyers (talk | contribs)

The idea

Mireia Ginestí is recycling Matxin data to build an Apertium-based system that would allow Spanish speakers to read Basque newspapers.

Some of the morphological choices in Matxin will be revised.

This document keeps track of decisions and raises open questions.


For instance, case endings ("declension") will be treated as postpositions:

gizonentzat : gizon.n + + tzat.prep

In principle, the absolutive will not be marked:

gizonak : gizon.n +

Determiners and postpositions will be given mnemonic lemmas, one per case:

gizonei : gizon.n + + i.prep
Mirenekin : Miren.NP + kin.prep
katuarentzat : katu.n + + tzat.prep

Postpositions which can modify a noun phrase will be marked explicitly with the tag ko:

etxeetako : etxe.n + + ko.prep.ko
Mikelekin : Mikel.NP + kin.prep
Mikelekiko : Mikel.NP + kin.prep.ko
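
The notation used in the examples above (surface form, a colon, then "+"-separated units of the form lemma.tag.tag) can be read mechanically. The following is only a minimal sketch for checking examples on this page, not part of Apertium or Matxin: the names Morph, parse_analysis and modifies_noun_phrase are hypothetical, and treating an empty slot between two "+" signs as an unspecified unit is an assumption.

```python
from typing import List, NamedTuple, Optional, Tuple


class Morph(NamedTuple):
    """One unit of an analysis, e.g. 'kin.prep.ko' -> lemma 'kin', tags ('prep', 'ko')."""
    lemma: str
    tags: Tuple[str, ...]


def parse_analysis(line: str) -> Tuple[str, List[Optional[Morph]]]:
    """Split 'surface : a.tag + b.tag' into the surface form and its units.

    Assumption: an empty slot between '+' signs is kept as None
    (unspecified) rather than dropped.
    """
    surface, _, analysis = line.partition(":")
    morphs: List[Optional[Morph]] = []
    for chunk in analysis.split("+"):
        chunk = chunk.strip()
        if not chunk:
            morphs.append(None)  # empty slot, left unspecified
            continue
        lemma, *tags = chunk.split(".")
        morphs.append(Morph(lemma, tuple(tags)))
    return surface.strip(), morphs


def modifies_noun_phrase(morphs: List[Optional[Morph]]) -> bool:
    """True if the last specified unit carries the explicit 'ko' tag."""
    specified = [m for m in morphs if m is not None]
    return bool(specified) and "ko" in specified[-1].tags


surface, morphs = parse_analysis("Mikelekiko : Mikel.NP + kin.prep.ko")
print(surface, morphs, modifies_noun_phrase(morphs))
```

With this reading, Mikelekin and Mikelekiko differ only in the final ko tag, which is what the sketch's modifies_noun_phrase checks.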


A problem arises with "possessives" like 'nire', 'gure', 'zuen', 'haien', 'bere'. Should they be treated as preadjectives ('izenlagun') or as genitive constructions such as the following?

nire : + ren.gen.ko
haien : + ren.gen.ko