Difference between revisions of "Talk:Welsh to English"
{{comment|
::Some serious errors have crept into those entries. I've sent an amended version to you by email. You're right - roedd=yr+oedd, but in the amended version I've sent, I've put (e.g.) 'roedd' and 'oedd' as alternate forms, because 'Roedd' is the spoken form, and even in written Welsh you hardly ever see 'Yr oedd' nowadays.
}}
Revision as of 14:11, 26 June 2008
English to Welsh
Macros
- This will contain chunks of rules that we need to split out to make them more maintainable
Patterns
Determiner Adjective Noun
When the determiner is indefinite, output noun + adjective. When the determiner is definite, output determiner + noun + adjective.
- Tests
(1) A red cat
- coch cath
(2) The red cat
- Y coch cath
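The reordering rule above can be sketched in a few lines of Python. This is only an illustration, not the actual transfer rule: the function name and the `definite` flag are made up here, and it does no mutation or agreement, so it only shows the word order the rule asks for.

```python
def reorder_det_adj_noun(det, adj, noun, definite):
    """Reorder an English-order (det, adj, noun) triple into the order
    described above. Hypothetical helper; no mutation is applied."""
    if definite:
        # Definite: determiner + noun + adjective
        return [det, noun, adj]
    # Indefinite: noun + adjective (the determiner is not output)
    return [noun, adj]


print(" ".join(reorder_det_adj_noun(None, "coch", "cath", definite=False)))
print(" ".join(reorder_det_adj_noun("y", "coch", "cath", definite=True)))
```

The output puts the noun before the adjective in both cases; getting the mutated forms right would be a separate step.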
Notes for areas to be covered
A sort of scratchpad / todo list, based on things that come up when putting phrases into the testing webform.
Conjunctive genitive
- gwallt yr eneth - *hair the girl - the hair of the girl - the girl's hair
- llaw y bachgen - *hand the boy - the hand of the boy - the boy's hand
Note that the noun phrase in English is definite - contrast "merch y meddyg" (the doctor's daughter) and "merch meddyg" (a doctor's daughter).
For an English phrase of the type "def + noun1 + of + def + noun2" or of the type "def + noun2 + 's + noun1", convert to the Welsh pattern "noun1 + def + noun2".
- Here can noun1 be a simple noun, or can it be a noun phrase? For example "the red cat of the young boy" - Francis Tyers
- e.g.
- For the pattern det.def + noun1 + of + det.def + noun2:
- Output noun1 + det.def + noun2
- Yes, as long as you like, eg,
- cath goch bachgen bach merch ifanc bert rheolwr y banc mawr du
- the red cat of the little boy of the pretty young daughter of the manager of the big black bank
- It's only the last NP of the sequence that gets the def.det. Donnek
- OK, so this requires a three-level rule.
- t1x -> t2x SN_(the cat red) of_(of) SN_(the boy little) of_(of) SN_(the daughter young pretty) of_(of) SN_(the manager) of_(of) SN_(the bank big black)
- t2x -> t3x SN_(the cat red) SN_(the boy little) SN_(the daughter young pretty) SN_(the manager) SN_(the bank big black)
- t3x -> gen (cat red boy little daughter young pretty manager the bank big black)
- What I'll do for now is get the chunks working ('SN' -- noun phrase, and 'of'), for values of 'noun', 'det noun', 'det adj noun', 'det adj adj noun', 'det adj adj adj noun', etc. Then look at taking care of more frequent cases (e.g. the first example). Francis Tyers
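As a rough sketch of the end result of the three stages discussed above (not the t1x/t2x/t3x rules themselves), the chain can be built by dropping every "of" and every determiner except the one on the last noun phrase. The function name is hypothetical, the input is assumed to be already translated and in noun + adjective order, and the form of the determiner ("y") is simply passed in rather than chosen.

```python
def welsh_genitive_chain(noun_phrases, det="y"):
    """Join noun phrases into a Welsh genitive chain.

    noun_phrases: list of NPs, each a list of Welsh words in
    noun + adjective order, with determiners already stripped.
    Only the last NP receives the definite determiner.
    """
    if not noun_phrases:
        return ""
    *heads, last = noun_phrases
    words = [word for np in heads for word in np]
    words.append(det)
    words.extend(last)
    return " ".join(words)


chain = [
    ["cath", "goch"],
    ["bachgen", "bach"],
    ["merch", "ifanc", "bert"],
    ["rheolwr"],
    ["banc", "mawr", "du"],
]
print(welsh_genitive_chain(chain))
```

With the long example given above as input, this reproduces the Welsh sequence quoted there.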
For a Welsh phrase of the type "!det + noun1 + def + noun2", convert to the English pattern "def + noun1 + of + def + noun2" or "def + noun2 + 's + noun1".
The second noun is probably historically a genitive, but it has lost all case markers. The equivalent in Irish would be:
- ceann an chapaill - *head the of-horse (gen) - the head of the horse - the horse's head
- ceann capaill - *head of-horse (gen) - the head of a horse - a horse's head
"is"
- the boy is in the garden -> *y bachgen bae yn yr ardd
transfer is getting the 3p sing form OK (mae), but proc is unmutating it (mae -> bae).
The verb needs to be moved to the front of the sentence as well, of course.
Is 'mae' a mutated form of 'bae', or is it the actual verb form? - Francis Tyers
- mae'r bachgen yn yr ardd -> *arethe boy in the garden
proc is missing a space somewhere, and the 3p sing info gets lost between pretransfer and transfer.
Word order again.
- I've copied in a rule from Spanish to do this; the resulting output is "the boy is in the garden". I now have two very basic rules in interchunk, SV SN → SN SV and SV SN1 SN2 → SN1 SV SN2, but these will probably break. - Francis Tyers
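For reference, the two reorderings mentioned above amount to something like the following Python sketch. It is not the interchunk file itself: the chunk representation as (tag, text) pairs and the function name are assumptions for illustration, and any sequence that is not exactly one of the two patterns is passed through unchanged.

```python
def reorder_chunks(chunks):
    """Apply SV SN -> SN SV and SV SN1 SN2 -> SN1 SV SN2.

    chunks: list of (tag, text) pairs, e.g. ("SV", "saw").
    Sequences matching neither pattern are returned unchanged.
    """
    tags = [tag for tag, _ in chunks]
    if tags == ["SV", "SN"]:
        return [chunks[1], chunks[0]]
    if tags == ["SV", "SN", "SN"]:
        return [chunks[1], chunks[0], chunks[2]]
    return list(chunks)


vso = [("SV", "saw"), ("SN", "the girl"), ("SN", "the dog")]
print(" ".join(text for _, text in reorder_chunks(vso)))
```

Matching on the whole sequence is what makes this fragile: any extra chunk (an adverb, a prepositional phrase) stops either pattern from firing.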
"was"
"roedd" ([he/she/it] was) is unknown, but I seem to remember adding entries for "to be" to the dixes in the mists of time. Was I dreaming? (roedd <- yr + oedd)
- There are entries for 'bod', but 'roedd' doesn't get processed as all of the 'bod' entries start with 'b' (see this link). I will need to fix this in the analyser. If I understand you correctly, 'roedd' is a contraction of 'yr' (determiner ...) + 'oedd' (verb 'bod', past tense ...)? Francis Tyers
- the boy was in the garden -> *y bachgen bu yn yr ardd - bu'r bachgen yn yr ardd
Almost correct, except for word-order, and the fact that the preterite is being used instead of the imperfect ("roedd y bachgen yn yr ardd"). The preterite needs to be marked as only being used in written Welsh, and to have a lower likelihood than the imperfect. This is too rough a rule, but would do for the time being.
Marking and word-order
The above raises a useful point about marking. If the standard VSO sequence is changed to SVO (i.e. left as in standard English), the result is a marked pattern that conveys a relative clause. In written Welsh, the verb will be preceded by "a" + soft mutation, but in spoken Welsh the "a" usually disappears.
- y bachgen [a] fu yn yr ardd ddydd Llun (the boy who was in the garden on Monday)
- yr eneth [a] welodd y ci (the girl who saw the dog)
contrast
- gwelodd yr eneth y ci (the girl saw the dog)
Hmmm. Relative clauses are going to be difficult.
For Welsh pattern "noun + a + soft-mutated_verb" output "noun + who/which + verb".
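A minimal sketch of how the written-Welsh version of this pattern might be detected, assuming the input is already tagged. Everything here is hypothetical: the (form, tag) representation, the tag names, and the small lookup table of soft-mutated verb forms. It also cannot catch the spoken pattern, where the "a" is dropped.

```python
# Hypothetical table: soft-mutated verb form -> unmutated form.
SOFT_MUTATED_VERBS = {"fu": "bu", "welodd": "gwelodd"}


def find_relative_markers(tagged):
    """Return the positions of "a" where it follows a noun and precedes
    a soft-mutated verb, i.e. where "who/which" should be output.

    tagged: list of (form, tag) pairs; "n" is assumed to mark nouns.
    """
    hits = []
    for i in range(1, len(tagged) - 1):
        prev_tag = tagged[i - 1][1]
        form = tagged[i][0]
        next_form = tagged[i + 1][0]
        if prev_tag == "n" and form == "a" and next_form in SOFT_MUTATED_VERBS:
            hits.append(i)
    return hits


relative = [("yr", "det"), ("eneth", "n"), ("a", "x"),
            ("welodd", "v"), ("y", "det"), ("ci", "n")]
plain = [("gwelodd", "v"), ("yr", "det"), ("eneth", "n"),
         ("y", "det"), ("ci", "n")]
print(find_relative_markers(relative), find_relative_markers(plain))
```

Choosing between "who" and "which" would need extra information about the noun, which is not modelled here.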
"i" as preposition
Welsh "i" (to) is getting translated as "[f]i" (I, me).
If Welsh "i" occurs immediately after a verb marked as 1p sing, output the 1p sing pronoun; otherwise output the preposition "to".
- This is a good rule for the tagger. - Francis Tyers 12:19, 26 June 2008 (UTC)
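Roughly, the rule would look like this if written as a standalone function. This is a sketch, not tagger code: the tag names ("v.1sg", "prn.1sg", "pr") and the token representation are invented for the example, and the two test phrases are illustrations rather than items from the test form.

```python
def disambiguate_i(tagged):
    """Retag Welsh "i": pronoun after a 1p sing verb, else preposition.

    tagged: list of (form, tag) pairs. Tags are hypothetical:
    "v.1sg" = verb, 1p sing; "prn.1sg" = pronoun; "pr" = preposition.
    """
    out = []
    for idx, (form, tag) in enumerate(tagged):
        if form == "i":
            prev_tag = tagged[idx - 1][1] if idx > 0 else None
            tag = "prn.1sg" if prev_tag == "v.1sg" else "pr"
        out.append((form, tag))
    return out


pronoun_case = [("gwelais", "v.1sg"), ("i", "?")]
preposition_case = [("aeth", "v.3sg"), ("y", "det"), ("bachgen", "n"),
                    ("i", "?"), ("Gaerdydd", "np")]
print(disambiguate_i(pronoun_case))
print(disambiguate_i(preposition_case))
```

The rule only looks one token back, so it depends on the verb's person and number being marked correctly before this step runs.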