Difference between revisions of "Welsh to English"

From Apertium

Revision as of 10:02, 28 June 2008


Todo

  • Fix multiword verbs in the bilingual dictionary, and add any that are missing from the English dictionary to that dictionary
  • Remove items that are in the English dictionary but not in the Welsh or bilingual dictionaries
  • Fix verb conjugation in the Welsh analyser
  • Add restrictions in the bidix

Roadmap

apertium-cy-en 0.1

  • 8,000 of the highest frequency words in each dictionary.
  • Rules dealing with basic verb tenses (past, present, future)
  • Basic word re-ordering for simple phrases.

Aims and uses

  • For a non-native speaker to be able to discern the topic of a general news item.
  • To be able to identify who said what to whom.
  • To be able to tell whether a particular item is interesting enough to be translated properly.
  • Sentences of up to 5 words should be translated reasonably well in both directions.

apertium-cy-en 0.5

apertium-cy-en 1.0

Tagger

"i" as preposition

Ambiguity: ^i/i<pr>/prpers<prn><subj><p1><mf><sg>$ ^foderneiddio/moderneiddio<vblex><inf>/moderneiddio<vblex><prs><p3><sg>$


Welsh "i" (to) is getting translated as "[f]i" (I, me).

if Welsh "i" occurs immediately after a verb marked as 1p sing
output pronoun 1p sing
otherwise output preposition "to"
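
The rule above can be sketched in Python as a choice between the two analyses of the ambiguous lexical unit. This is only an illustration, not Apertium's tagger: the helper names (parse_unit, tags_of, choose_i) are made up for the example, and the first person singular verb analysis in the usage note is hypothetical, modelled on the tags in the ambiguity shown earlier.

```python
import re

def parse_unit(unit):
    """Split '^surface/analysis1/analysis2$' into (surface, [analyses])."""
    surface, *analyses = unit.strip().strip("^$").split("/")
    return surface, analyses

def tags_of(analysis):
    """Return the set of tags in an analysis string such as 'i<pr>'."""
    return set(re.findall(r"<([^>]+)>", analysis))

def choose_i(prev_analysis, i_unit):
    """Pick the pronoun reading of "i" after a 1st person singular verb,
    otherwise the preposition reading."""
    _, analyses = parse_unit(i_unit)
    prev = tags_of(prev_analysis) if prev_analysis else set()
    # Assumption: verb tags start with "vb" (e.g. vblex).
    is_verb = any(t.startswith("vb") for t in prev)
    wanted = "prn" if is_verb and {"p1", "sg"} <= prev else "pr"
    for analysis in analyses:
        if wanted in tags_of(analysis):
            return analysis
    return analyses[0]  # fall back to the first analysis

# The ambiguous unit from the example above.
I_UNIT = "^i/i<pr>/prpers<prn><subj><p1><mf><sg>$"
```

Calling choose_i("moderneiddio<vblex><prs><p1><sg>", I_UNIT) selects the pronoun analysis, while choose_i(None, I_UNIT) or a preceding third person verb selects the preposition analysis.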

"o'n" - disambiguate "he" and "from"

mae fo'n mynd -> he isgoing

Fine (apart from the missing space).

Contrast:

mae o'n mynd -> *is ofgoing (should be: he is going)

The elided form "o" is more common here than "fo". Following the 1.3.4 pattern above (the rule for "i"):

if Welsh "o" occurs immediately after a verb marked as 3p sing
output pronoun 3p sing
otherwise output preposition "of/from"

This is probably better than the earlier version I had here:

For Welsh pattern "verb + o"
output "verb + 3p sing pronoun"
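
The two versions of the "o" rule differ only when "o" follows a verb that is not third person singular. A minimal sketch of both, assuming each preceding word is represented as a list of tag names and that verb tags start with "vb" (both are assumptions for illustration, as are the function names):

```python
def is_verb(tags):
    """True if any tag looks like a verb tag (assumed to start with 'vb')."""
    return any(t.startswith("vb") for t in tags)

def o_reading_current(prev_tags):
    """Current rule: pronoun only after a 3rd person singular verb,
    otherwise preposition."""
    if is_verb(prev_tags) and {"p3", "sg"} <= set(prev_tags):
        return "prn"
    return "pr"

def o_reading_earlier(prev_tags):
    """Earlier rule: pronoun after any verb, otherwise preposition."""
    return "prn" if is_verb(prev_tags) else "pr"
```

After a third person singular verb both versions pick the pronoun. After an infinitive, for example, the earlier version would still pick the pronoun, while the current version falls back to the preposition "of/from".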

Transfer

# Welsh
: Literal
@ Gloss (English)

Welsh to English

Word order (VSO to SVO)

# Genir   pawb     yn rhydd ac  yn gydradd â    'i  gilydd      mewn urddas  a   hawliau.
: Be born everyone    free  and    equal   with      each other in   dignity and rights.

@ Everyone is born free and equal with each other in dignity and rights.

Noun Noun -> Noun of Noun

# Llywodraeth Cynulliad Cymru
: Government  Assembly  Wales ==> Government (of) Assembly (of)  Wales

@ Welsh Assembly Government
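
The literal "Noun of Noun" step can be sketched as follows. This is an illustration in Python, not an Apertium transfer rule; the (word, part-of-speech) pairs and the function name are assumptions made for the example.

```python
def join_nouns_with_of(tokens):
    """tokens: list of (word, pos) pairs.
    Insert 'of' between every pair of adjacent nouns."""
    out = []
    for i, (word, pos) in enumerate(tokens):
        if i > 0 and pos == "n" and tokens[i - 1][1] == "n":
            out.append(("of", "pr"))
        out.append((word, pos))
    return out

phrase = [("Government", "n"), ("Assembly", "n"), ("Wales", "n")]
literal = " ".join(word for word, _ in join_nouns_with_of(phrase))
```

This only reproduces the literal reading ("Government of Assembly of Wales"); the idiomatic gloss "Welsh Assembly Government" would need a multiword entry or a further rule.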

Noun Adjective -> Adjective Noun

# bachgen hapus
: boy     happy

@ happy boy

# geneth bert
: girl   pretty

@ pretty girl
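
A minimal sketch of the noun-adjective swap, written in Python for illustration rather than as an Apertium transfer rule. The (word, part-of-speech) token format and the function name are assumptions.

```python
def reorder_noun_adj(tokens):
    """tokens: list of (word, pos) pairs in Welsh order.
    Swap each noun that is directly followed by an adjective."""
    out = list(tokens)
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "n" and out[i + 1][1] == "adj":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair just swapped
        else:
            i += 1
    return out
```

For example, reorder_noun_adj([("boy", "n"), ("happy", "adj")]) puts "happy" before "boy". Nouns followed by more than one adjective are not handled by this sketch.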

Compound prepositions

<donnek> I've also thought of another wrinkle - compound prepositions
<spectie> i will probably need to write a rule
<donnek> eg ar ben (on top of)
<donnek> lit on head
<spectie> we can do a similar thing with those
<spectie> for example:
<donnek> becomes ar fy mhen (on my head, literally) = on top of me
<donnek> ar ei ben, ar ei phen, ar ein pennau
<spectie> are there many of them
<donnek> maybe we don't need to think about them now, but just to flag them for later
<spectie> if there are not many it might be worth making them multiwords
<donnek> how do multiwords work
<spectie> there are a few ways
<spectie> depending on if one of the words inside the multiword inflects or not
<donnek> that would be the case here
<spectie> for example "take care"
<spectie> "i take care of", "you take care of", "he takes care of"
<spectie> but "take care" is treated as one verb
<donnek> ok

Attributive and predicative adjectives

<spectie> its a problem with attributive/predicative
<donnek> it's say something (which is) nice
<spectie> but in english we don't distinguish between the two (at least in terms of morphology)
<spectie> yes
<spectie> in afrikaans they have a -e for attributive (e.g. feodale stelsel -- feudal system) 
<spectie> and "the system is feudal" - "die stelsel is feodaal"
<spectie> donnek, aye
<donnek> in Welsh the second would have yn before the adj
<donnek> so we may not need anything to mark attrib/pred
  • Dywedodd rhywbeth neis wrthi = He said something nice to her
  • Mae'r peth yno yn neis = That thing is nice
    (uncontracted: Mae yr peth yno yn neis)
  • Mae'n gar neis = It is a nice car
    (uncontracted: Mae yn gar neis)
<donnek> at first glance, we may just need a rule for rhyw+thing
<donnek> rhyw = some
<donnek> rhywbeth (something), rhywfaint (somewhat), etc
<donnek> rhywle (somewhere)

Possession

 Mae            cath 'da    Bwflw
 Bod+p3.sg.pres cath  gyda  Bwflw
 Be+p3.sg.pres  cat   with  Beefalo
'Beefalo has a cat'

Apertium notes

We can probably deal with this in interchunk as follows:

vbbod NP1 pr_gyda NP2

->

NP2 vbhave NP1
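
The reordering above can be sketched in Python. This is an illustration only, not interchunk syntax: chunks are assumed to be (name, text) pairs, both noun phrases carry the name "NP", and the function name is hypothetical.

```python
def possession_to_have(chunks):
    """chunks: list of (chunk_name, text) pairs.
    Rewrite 'vbbod NP1 pr_gyda NP2' as 'NP2 vbhave NP1'."""
    names = [name for name, _ in chunks]
    if names == ["vbbod", "NP", "pr_gyda", "NP"]:
        _, np1, _, np2 = chunks
        return [np2, ("vbhave", "have"), np1]
    return chunks  # pattern not matched: leave the chunks unchanged

sentence = [("vbbod", "be"), ("NP", "a cat"), ("pr_gyda", "with"), ("NP", "Beefalo")]
result = possession_to_have(sentence)
```

The sketch only reorders the chunks; agreement of the verb with the new subject ("has" rather than "have") would still have to be handled separately.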

The 'yn' particle

As well as meaning 'in', 'yn' is used to form the present participle of a verb in Welsh. For example:

  • dysgu = to learn
  • yn dysgu = learning

The present tense is formed by combining 'yn' with the corresponding form of 'bod' (to be) as follows:

  • Mae Beefalo yn gweithio = Beefalo is working/Beefalo works

Note: when following a vowel, yn is abbreviated to 'n, e.g.

  • Mae Beefalo'n gweithio
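
One way to handle the abbreviated form before analysis is to expand 'n back to yn. The following is a rough Python sketch under stated assumptions: the function name is made up, the text uses a straight apostrophe, and every 'n after a vowel letter is treated as yn without checking whether it really stands for yn rather than another word.

```python
import re

# Welsh vowel letters (w and y count as vowels); accented vowels are omitted.
_CONTRACTED_YN = re.compile(r"(?<=[aeiouwy])'n\b", re.IGNORECASE)

def expand_yn(text):
    """Replace 'n after a vowel with a separate word 'yn'."""
    return _CONTRACTED_YN.sub(" yn", text)
```

For example, expand_yn("Mae Beefalo'n gweithio") gives "Mae Beefalo yn gweithio", and text without the abbreviation is returned unchanged.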

Genitive Phrases

To form the indefinite genitive, a simple construct of <object><subject> can be used, with the thing possessed first and the possessor second. For example, "Soldiers of Wales" would be "milwyr Cymru", literally "soldiers Wales".

Definite genitives are formed with a similar construction, with the addition of "y" between the object and the subject. For example, "Beic y gath" = "The cat's bike", literally "bike the cat". Note: feminine singular nouns undergo soft mutation after the word "y", which is why "cath" appears here as "gath".