Difference between revisions of "Talk:Welsh to English/Archive 1"

From Apertium
(New page: ====(1.3.2) "was"==== "roedd" ([he/she/it] was) is unknown, but I seem to remember adding entries for "to be" to the dixes in the mists of time. Was I dreaming? (roedd <- yr + oedd) ...)
 
Revision as of 12:58, 16 July 2008

(1.3.2) "was"

"roedd" ([he/she/it] was) is unknown, but I seem to remember adding entries for "to be" to the dixes in the mists of time. Was I dreaming? (roedd <- yr + oedd)

There are entries for 'bod', but 'roedd' isn't analysed because all of the 'bod' entries start with 'b' (see this link). I will need to fix this in the analyser. If I understand you correctly, 'roedd' is a contraction of 'yr' (determiner ...) + 'oedd' (verb 'bod', past tense ...)? - Francis Tyers


Some serious errors have crept into those entries. I've sent you an amended version by email. You're right: roedd -> yr + oedd. In the amended version, though, I've put (e.g.) "roedd" and "oedd" as alternate forms, because "Roedd" is the spoken form, and even in written Welsh you hardly ever see "Yr oedd" nowadays. - Donnek


The "bod" paradigm should now be all OK. It remains, however, to choose the restrictions (e.g. which forms we will generate for each set of tags). - Francis Tyers

the boy was in the garden -> *y bachgen bu yn yr ardd - bu'r bachgen yn yr ardd

Almost correct, except for word-order, and the fact that the preterite is being used instead of the imperfect ("roedd y bachgen yn yr ardd"). The preterite needs to be marked as only being used in written Welsh, and to have a lower likelihood than the imperfect. This is too rough a rule, but would do for the time being.

(1.3.4) Preferential choice between noun and verbform

atebodd hi'r cwestiwn -> *answered shethe #hold an inquiry - she answered the question

proc returns both the noun 'cwestiwn' (question), which is correct, and the 1p pl imperative of 'cwestio' (an infrequent verb for 'hold an inquiry'). The 1p pl present would also have been a possible reading, and indeed a more likely one. The tagger selects the second of these, the imperative.

Not sure how widespread this would be, but the tagger should give precedence to the noun choice whenever the ambiguous form is preceded by the definite article ('y', 'yr' or 'r):

For Welsh pattern "{y,yr,'r} + word_tagged_as_either_noun_or_verb"
output "{y,yr,'r} + noun"

This is not perfect, because "y | yr" can also be an indirect relative clause pronoun before a verb, but it would catch most things until we can resolve the latter point.
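The rule above could be sketched roughly as follows. This is only an illustration in Python, not how the Apertium tagger is actually configured: the token representation (a surface form plus a set of candidate tags), the tag name "n" and the function name are all assumptions made for the example, and the input is assumed to be already split so that 'r is its own token.

```python
# Hypothetical sketch of the "prefer noun after y/yr/'r" rule.
# Tag names ("n", "vblex") and the token format are illustrative assumptions.
ARTICLES = {"y", "yr", "'r"}


def prefer_noun_after_article(tokens):
    """Return tokens with noun/verb ambiguity resolved to noun after an article.

    tokens: list of (surface, set_of_candidate_tags) pairs.
    """
    resolved = []
    previous = None
    for surface, tags in tokens:
        if previous in ARTICLES and {"n", "vblex"} <= set(tags):
            # Ambiguous between noun and verb right after y/yr/'r: keep the noun.
            tags = {"n"}
        resolved.append((surface, set(tags)))
        previous = surface.lower()
    return resolved


sentence = [
    ("atebodd", {"vblex"}),
    ("hi", {"prn"}),
    ("'r", {"det"}),
    ("cwestiwn", {"n", "vblex"}),
]
result = prefer_noun_after_article(sentence)
```

As noted, a sketch like this would also wrongly force the noun reading where "y | yr" introduces an indirect relative clause before a verb.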

gwelodd y dyn y llyfr -> *the man saw the books - the man saw the book

This is similar, but is tricksy because the output is superficially correct apart from the plural. In fact, the tagger is reading "llyfr" as pres 3p sing of "llyfru" (to book). This verb is infrequent, and therefore much less likely to appear ("bwcio" would be the usual word); moreover, Eurfa has "llyfra" as the pres 3p sing, so there may be a paradigm problem too. The above rule would throw out the verb in the meantime.


It is currently using the aberth/u__vblex paradigm (see output at http://www.nopaste.com/p/aVI2yKOdqb). Is this incorrect? - Francis Tyers


The problem is that "aberthu", apart from the 'regular' "abertha", also has a written form "aberth". So yes, it probably is incorrect. The underlying difficulty is that a lot of less common verbs are very rarely inflected. It might have been better to use something like "gwenu" or "siomi". In the meantime, perhaps just changing "aberth" to "abertha" in the pres 3p sing will do. - Donnek
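To make the effect of that change concrete, here is a toy illustration in Python. It is not Apertium dictionary code: the function name is hypothetical, and it assumes only that the paradigm strips the final "-u" of the verbnoun and attaches whatever ending is listed for the pres 3p sing.

```python
def pres_3sg(verbnoun: str, ending: str) -> str:
    """Strip the final -u of the verbnoun and attach the paradigm's 3sg ending."""
    if not verbnoun.endswith("u"):
        raise ValueError("this toy paradigm only handles verbnouns in -u")
    return verbnoun[:-1] + ending


# With an empty ending (the written "aberth" pattern), "llyfru" yields a form
# identical to the noun "llyfr", which is the ambiguity discussed above.
bare = pres_3sg("llyfru", "")

# With the regular "-a" ending, it yields "llyfra", matching the Eurfa form.
regular = pres_3sg("llyfru", "a")
```

With the "-a" ending the verb form no longer coincides with the noun "llyfr", so the noun/verb ambiguity in the example sentence would not arise in the first place.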