Difference between revisions of "Welsh to English"
(46 intermediate revisions by 4 users not shown) | |||
Line 1: | Line 1: | ||
{{TOCD}} |
{{TOCD}} |
||
<pre> |
|||
# Welsh |
|||
: Literal |
|||
@ Gloss (English) |
|||
</pre> |
|||
== |
==Todo== |
||
* <s>Fix multiword verbs in bilingual dictionary -- and add ones non-existent in English dictionary to that dictionary</s> |
|||
* Remove items which are in English dictionary but not Welsh/Bilingual |
|||
* <s>Fix verb conjugation in the Welsh analyser</s> |
|||
* <s>Add restrictions in the bidix</s> |
|||
* Fix numbers |
|||
* <s>Add adverbs</s> |
|||
* <s>More thorough handling of contractions (i'ch, a'u, ...) — including preblank</s> |
|||
* <s>Add pre-verbal particles (basic functionality)</s> |
|||
* Add adjective macro to all chunks |
|||
==Roadmap== |
|||
==== Word order (VSO to SVO) ==== |
|||
<pre> |
|||
# Genir pawb yn rhydd ac yn gydradd â 'i gilydd mewn urddas a hawliau. |
|||
: Be born everyone free and equal with each other in dignity and rights. |
|||
===apertium-cy-en 0.1=== |
|||
@ Everyone is born free and equal with each other in dignity and rights. |
|||
</pre> |
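The reordering in the example above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Apertium transfer rules are written: the `(gloss, tag)` token format, the tag names and the `vso_to_svo` helper are all assumptions made for this sketch, and verb agreement ("be born" to "is born") is left to a separate step.

```python
from typing import List, Tuple

# (literal English gloss, part-of-speech tag); the tag names are hypothetical.
Token = Tuple[str, str]

# Tags treated as belonging to the subject that follows the verb (assumption).
SUBJECT_TAGS = {"det", "noun", "pron"}


def vso_to_svo(tokens: List[Token]) -> List[Token]:
    """Move a sentence-initial verb to just after the subject that follows it."""
    if len(tokens) < 2 or tokens[0][1] != "verb":
        return list(tokens)
    verb = tokens[0]
    # The subject is the run of determiner/noun/pronoun tokens after the verb.
    end = 1
    while end < len(tokens) and tokens[end][1] in SUBJECT_TAGS:
        end += 1
    if end == 1:
        # No subject found directly after the verb: leave the order alone.
        return list(tokens)
    return tokens[1:end] + [verb] + tokens[end:]


literal = [("be born", "verb"), ("everyone", "pron"), ("free", "adj"),
           ("and", "conj"), ("equal", "adj")]
reordered = [gloss for gloss, _ in vso_to_svo(literal)]
print(" ".join(reordered))
```

Adjectives are deliberately left out of `SUBJECT_TAGS` so that "free" in the example is not pulled into the subject.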
|||
* 8,000 of the highest frequency words in each dictionary. |
|||
==== Noun Noun -> Noun of Noun ==== |
|||
* Rules dealing with basic verb tenses (past, present, future) |
|||
<pre> |
|||
* Basic word re-ordering for simple phrases. |
|||
# Llywodraeth Cynulliad Cymru |
|||
: Government Assembly Wales ==> Government (of) Assembly (of) Wales |
|||
;Aims and uses |
|||
* For a non-native speaker to be able to discern the topic of a general news item. |
|||
* To be able to identify ''who'' said ''what'' to ''whom''. |
|||
* To be able to distinguish if a particular item is interesting enough to be translated properly. |
|||
* Sentences of up to 5 words should be translated reasonably well from Welsh to English. |
|||
;Report |
|||
* Coverage: |
|||
** Wikipedia (753,741 words): 85.5% |
|||
** PNAW (11,684,177 words): 94% |
|||
** BBC Newyddion (144,887 words): 91% |
|||
===apertium-cy-en 0.2=== |
|||
* Match the 0.1 performance and coverage for English to Welsh. |
|||
===apertium-cy-en 0.5=== |
|||
* Properly capitalised sentences. |
|||
* Get the number for nouns from the appropriate place, e.g. sometimes from the determiner, sometimes from the noun. |
|||
===apertium-cy-en 1.0=== |
|||
* Handling of gender and number in adjectives |
|||
@ Welsh Assembly Government |
|||
</pre> |
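The first step of the noun-noun example above (Government Assembly Wales ==> Government (of) Assembly (of) Wales) could be sketched as follows. The `insert_of` helper and the parallel gloss/tag lists are assumptions for illustration; the sketch only inserts "of" between adjacent nouns and does not produce the final "Welsh Assembly Government" rendering.

```python
from typing import List


def insert_of(glosses: List[str], tags: List[str]) -> List[str]:
    """Insert 'of' between each pair of adjacent nouns (tag names are hypothetical)."""
    out: List[str] = []
    for i, (gloss, tag) in enumerate(zip(glosses, tags)):
        if i > 0 and tag == "noun" and tags[i - 1] == "noun":
            out.append("of")
        out.append(gloss)
    return out


result = insert_of(["Government", "Assembly", "Wales"], ["noun", "noun", "noun"])
print(" ".join(result))
```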
|||
====Compound prepositions==== |
|||
<pre> |
|||
<donnek> I've also thought of another wrinkle - compound prepositions |
|||
<spectie> i will probably need to write a rule |
|||
<donnek> eg ar ben (on top of) |
|||
<donnek> lit on head |
|||
<spectie> we can do a similar thing with those |
|||
<spectie> for example: |
|||
<donnek> becomes ar fy mhen (on my head, literally) = on top of me |
|||
<donnek> ar ei ben, ar ei phen, ar ein pennau |
|||
<spectie> are there many of them |
|||
<donnek> maybe we don't need to think about them now, but just to flag them for later |
|||
<spectie> if there are not many it might be worth making them multiwords |
|||
<donnek> how do multiwords work |
|||
<spectie> there are a few ways |
|||
<spectie> depending on if one of the words inside the multiword inflects or not |
|||
<donnek> that would be the case here |
|||
<spectie> for example "take care" |
|||
<spectie> "i take care of", "you take care of", "he takes care of" |
|||
<spectie> but "take care" is treated as one verb |
|||
<donnek> ok |
|||
</pre> |
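The multiword idea from the discussion above can be sketched as a small lookup in Python. This is a toy under stated assumptions, not the Apertium dictionary format: the table names and the `translate_compound` helper are made up for this sketch, and only "ar ben" and "ar fy mhen" are glossed in the chat; the pronoun readings for "ar ei ben", "ar ei phen" and "ar ein pennau" are this sketch's own reading of those forms.

```python
from typing import Optional

# Mutated and plural surface forms of the inner noun, keyed by lemma.
NOUN_FORMS = {"pen": {"pen", "ben", "mhen", "phen", "pennau"}}

# Compound prepositions keyed by (preposition, noun lemma).
COMPOUNDS = {("ar", "pen"): "on top of"}

# (possessive, mutated noun form) -> English object pronoun.
# "ei" is disambiguated by the mutation it triggers on the noun.
OBJECT_PRONOUNS = {
    ("fy", "mhen"): "me",
    ("ei", "ben"): "him",
    ("ei", "phen"): "her",
    ("ein", "pennau"): "us",
}


def lemma_of(form: str) -> Optional[str]:
    """Return the lemma for a surface form of a known inner noun, else None."""
    for lemma, forms in NOUN_FORMS.items():
        if form in forms:
            return lemma
    return None


def translate_compound(phrase: str) -> Optional[str]:
    """Translate 'prep noun' or 'prep possessive noun'; None if not a known compound."""
    words = phrase.split()
    if len(words) == 2:
        prep, noun = words
        poss = None
    elif len(words) == 3:
        prep, poss, noun = words
    else:
        return None
    english = COMPOUNDS.get((prep, lemma_of(noun)))
    if english is None:
        return None
    if poss is None:
        return english
    pronoun = OBJECT_PRONOUNS.get((poss, noun))
    return None if pronoun is None else f"{english} {pronoun}"


for phrase in ["ar ben", "ar fy mhen", "ar ei phen", "ar ein pennau"]:
    print(phrase, "->", translate_compound(phrase))
```

Keying the compound on the noun lemma rather than the surface form is what lets one entry cover the inflecting inner word, which is the case the chat singles out.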
|||
[[Category:Discussions]] |
[[Category:Discussions]] |
||
Latest revision as of 13:24, 10 December 2010
Todo
- <s>Fix multiword verbs in bilingual dictionary -- and add ones non-existent in English dictionary to that dictionary</s>
- Remove items which are in English dictionary but not Welsh/Bilingual
- <s>Fix verb conjugation in the Welsh analyser</s>
- <s>Add restrictions in the bidix</s>
- Fix numbers
- <s>Add adverbs</s>
- <s>More thorough handling of contractions (i'ch, a'u, ...), including preblank</s>
- <s>Add pre-verbal particles (basic functionality)</s>
- Add adjective macro to all chunks
Roadmap
apertium-cy-en 0.1
- 8,000 of the highest frequency words in each dictionary.
- Rules dealing with basic verb tenses (past, present, future)
- Basic word re-ordering for simple phrases.
- Aims and uses
- For a non-native speaker to be able to discern the topic of a general news item.
- To be able to identify who said what to whom.
- To be able to distinguish if a particular item is interesting enough to be translated properly.
- Sentences of up to 5 words should be translated reasonably well from Welsh to English.
- Report
- Coverage:
  - Wikipedia (753,741 words): 85.5%
  - PNAW (11,684,177 words): 94%
  - BBC Newyddion (144,887 words): 91%
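As a rough illustration of what a coverage figure like those above measures, the sketch below computes the percentage of corpus tokens found in a word list. It is a naive approximation and not the procedure used to produce the reported numbers: the `coverage` function, the lowercasing and the tiny word list are all assumptions made for this example.

```python
from typing import Iterable, Set


def coverage(tokens: Iterable[str], known_words: Set[str]) -> float:
    """Percentage of tokens whose lowercased form is in known_words."""
    tokens = list(tokens)
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token.lower() in known_words)
    return 100.0 * known / len(tokens)


# Hypothetical five-token corpus and word list, for illustration only.
sample_tokens = ["Mae", "y", "gath", "yn", "cysgu"]
sample_lexicon = {"mae", "y", "yn", "cysgu"}
print(coverage(sample_tokens, sample_lexicon))
```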
apertium-cy-en 0.2
- Match the 0.1 performance and coverage for English to Welsh.
apertium-cy-en 0.5
- Properly capitalised sentences.
- Get the number for nouns from the appropriate place, e.g. sometimes from the determiner, sometimes from the noun.
apertium-cy-en 1.0
- Handling of gender and number in adjectives