Difference between revisions of "Welsh to English"

From Apertium
(48 intermediate revisions by 4 users not shown)
{{TOCD}}

<pre>
# Welsh
: Literal
@ Gloss (English)
</pre>

== Transfer ==

=== Welsh to English ===

==== Word order (VSO to SVO) ====
<pre>
# Genir pawb yn rhydd ac yn gydradd â 'i gilydd mewn urddas a hawliau.
: Be born everyone free and equal with each other in dignity and rights.

@ Everyone is born free and equal with each other in dignity and rights.

</pre>

==== Noun Adjective -> Adjective Noun ====
<pre>
# Llywodraeth Cynulliad Cymru

: Government Assembly Welsh

@ Welsh Assembly Government
</pre>

====Compound prepositions====
<pre>
<donnek> I've also thought of another wrinkle - compound prepositions
<spectie> i will probably need to write a rule
<donnek> eg ar ben (above)
<donnek> lit on head
<spectie> we can do a similar thing with those
<spectie> for example:
<donnek> becomes ar fy mhen (on my head, literally) = above me
<donnek> ar ei ben, ar ei phen, ar ein pennau
<spectie> are there many of them
<donnek> maybe we don't need to think about them now, but just to flag them for later
<spectie> if there are not many it might be worth making them multiwords
<donnek> how do multiwords work
<spectie> there are a few ways
<spectie> depending on if one of the words inside the multiword inflects or not
<donnek> that would be the case here
<spectie> for example "take care"
<spectie> "i take care of", "you take care of", "he takes care of"
<spectie> but "take care" is treated as one verb
<donnek> ok
</pre>

[[Category:Discussions]]

{{TOCD}}

==Todo==

* <s>Fix multiword verbs in bilingual dictionary -- and add ones non-existent in English dictionary to that dictionary</s>
* Remove items which are in English dictionary but not Welsh/Bilingual
* <s>Fix verb conjugation in the Welsh analyser</s>
* <s>Add restrictions in the bidix</s>
* Fix numbers
* <s>Add adverbs</s>
* <s>More thorough handling of contractions (i'ch, a'u, ...) &mdash; including preblank</s>
* <s>Add pre-verbal particles (basic functionality)</s>
* Add adjective macro to all chunks

==Roadmap==

===apertium-cy-en 0.1===

* 8,000 of the highest frequency words in each dictionary.
* Rules dealing with basic verb tenses (past, present, future)
* Basic word re-ordering for simple phrases.

;Aims and uses

* For a non-native speaker to be able to discern the topic of a general news item.
* To be able to identify ''who'' said ''what'' to ''whom''.
* To be able to determine whether a particular item is interesting enough to be translated properly.
* Sentences of up to 5 words should be translated reasonably well from Welsh to English.

;Report

* Coverage:
** Wikipedia (753,741 words): 85.5%
** PNAW (11,684,177 words): 94%
** BBC Newyddion (144,887 words): 91%

===apertium-cy-en 0.2===

* Match the 0.1 performance and coverage for the English to Welsh direction.

===apertium-cy-en 0.5===

* Properly capitalised sentences.
* Get the number for nouns from the appropriate place, e.g. sometimes from the determiner, sometimes from the noun.

===apertium-cy-en 1.0===

* Handling of gender and number in adjectives

[[Category:Discussions]]
[[Category:Welsh to English]]

Latest revision as of 13:24, 10 December 2010


Todo

  • Fix multiword verbs in bilingual dictionary -- and add ones non-existent in English dictionary to that dictionary
  • Remove items which are in English dictionary but not Welsh/Bilingual
  • Fix verb conjugation in the Welsh analyser
  • Add restrictions in the bidix
  • Fix numbers
  • Add adverbs
  • More thorough handling of contractions (i'ch, a'u, ...) — including preblank
  • Add pre-verbal particles (basic functionality)
  • Add adjective macro to all chunks

Roadmap

apertium-cy-en 0.1

  • 8,000 of the highest frequency words in each dictionary.
  • Rules dealing with basic verb tenses (past, present, future)
  • Basic word re-ordering for simple phrases.
Aims and uses
  • For a non-native speaker to be able to discern the topic of a general news item.
  • To be able to identify who said what to whom.
  • To be able to determine whether a particular item is interesting enough to be translated properly.
  • Sentences of up to 5 words should be translated reasonably well from Welsh to English.
Report
  • Coverage:
    • Wikipedia (753,741 words): 85.5%
    • PNAW (11,684,177 words): 94%
    • BBC Newyddion (144,887 words): 91%
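
A coverage figure like the ones above is usually read as the share of tokens for which the analyser returns at least one analysis. The following is a minimal sketch of that calculation, not necessarily how the reported numbers were produced. It assumes the analyser output is in Apertium stream format, where each lexical unit looks like `^surface/analysis$` and an unknown word is emitted as `^word/*word$`. The function name `coverage`, the sample sentence, and the tags in it are hypothetical and only for illustration.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# (simplified: escaped characters inside a unit are not handled)
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def coverage(analysed_text: str) -> float:
    """Return the fraction of lexical units that received at least one analysis."""
    units = LEXICAL_UNIT.findall(analysed_text)
    if not units:
        return 0.0
    # Unknown words are marked with a leading '*' in the analysis field.
    known = sum(1 for _surface, analysis in units if not analysis.startswith("*"))
    return known / len(units)


# Hypothetical analyser output: two known words and one unknown word.
sample = (
    "^Llywodraeth/llywodraeth<n><f><sg>$ "
    "^Cynulliad/cynulliad<n><m><sg>$ "
    "^Xyzzy/*Xyzzy$"
)
print(f"{coverage(sample):.1%}")  # prints 66.7%
```

Whether punctuation and numerals count as tokens changes the result, so figures from different corpora are only comparable if the same counting rule was used.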

apertium-cy-en 0.2

  • Match the 0.1 performance and coverage for the English to Welsh direction.

apertium-cy-en 0.5

  • Properly capitalised sentences.
  • Get the number for nouns from the appropriate place, e.g. sometimes from the determiner, sometimes from the noun.

apertium-cy-en 1.0

  • Handling of gender and number in adjectives