Difference between revisions of "Scottish Gaelic and Irish"
Revision as of 13:40, 25 June 2008
Todo
- Add ability to analyse initial mutations to the monolingual dictionary.
-- I have most of the work done for this -- Jimregan
- Add all closed categories to the monolingual dictionaries.
--
- Improve the tagger -- write restrictions/constraints, and then retrain.
- Perform an intersection on the monolingual dictionaries.
  - We only want entries in the Irish analyser that we can translate into Scottish Gaelic. For a word to be included, it should be in the Irish monolingual dictionary and in the bilingual dictionary, and its translation should be in the Scottish Gaelic monolingual dictionary. Words for which we don't have translations can just be commented out for now.
-- Count me out on this one; I will suggest using <e i="yes"> etc. instead of xml comments -- Jimregan
- Do some fixing of the bilingual dictionary.
  - There are some entries with unknown gender on the Scottish Gaelic side.
  - Some restrictions probably need adding.
  - Some conjunctions are marked "cnj" rather than subdivided into "cnjcoo", "cnjsub", etc.
-- I'll take this one too -- Jimregan
- Write rules to do initial mutations for generation.
- Write some transfer rules.
  - For example, to do tenses, number agreement, etc.
-- We can probably take most of this stuff from another language pair and add the consonant etc. stuff later; for the most part, adjective chunks etc. should be the same as those in at least one other pair (I'll scout around for which) -- Jimregan
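
The intersection item in the list above could be scripted roughly as follows. This is only a sketch under stated assumptions, not the pair's actual tooling: the three miniature dictionaries, the paradigm names and the helper function names are all made up for illustration, the bilingual dictionary is assumed to have Irish on the left, and multiword entries containing `<b/>` are not handled. It marks untranslatable Irish entries with `i="yes"`, as suggested in the comment on that item, rather than wrapping them in XML comments.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature dictionaries; real ones would be read from files.
GA_MONODIX = """<dictionary><section id="main" type="standard">
  <e lm="cat"><i>cat</i><par n="cat__n"/></e>
  <e lm="madra"><i>madra</i><par n="madra__n"/></e>
  <e lm="teach"><i>teach</i><par n="teach__n"/></e>
</section></dictionary>"""

BIDIX = """<dictionary><section id="main" type="standard">
  <e><p><l>cat<s n="n"/></l><r>cat<s n="n"/></r></p></e>
  <e><p><l>madra<s n="n"/></l><r>cù<s n="n"/></r></p></e>
</section></dictionary>"""

GD_MONODIX = """<dictionary><section id="main" type="standard">
  <e lm="cat"><i>cat</i><par n="cat__n"/></e>
</section></dictionary>"""


def monodix_lemmas(xml_text):
    """Collect the lm attribute of every <e> entry that has one."""
    root = ET.fromstring(xml_text)
    return {e.get("lm") for e in root.iter("e") if e.get("lm")}


def bidix_pairs(xml_text):
    """Collect (left, right) lemma pairs from <p><l>..</l><r>..</r></p>.

    Simplification: only the text before the first child tag is used,
    so multiword entries with <b/> would be truncated.
    """
    root = ET.fromstring(xml_text)
    pairs = set()
    for p in root.iter("p"):
        left, right = p.find("l"), p.find("r")
        if left is not None and right is not None:
            pairs.add(((left.text or "").strip(), (right.text or "").strip()))
    return pairs


def mark_untranslatable(ga_xml, bidix_xml, gd_xml):
    """Set i="yes" on Irish entries lacking a translation that is also
    present in the Scottish Gaelic monolingual dictionary."""
    gd_lemmas = monodix_lemmas(gd_xml)
    translatable = {l for l, r in bidix_pairs(bidix_xml) if r in gd_lemmas}
    root = ET.fromstring(ga_xml)
    ignored = []
    for e in root.iter("e"):
        lm = e.get("lm")
        if lm and lm not in translatable:
            e.set("i", "yes")
            ignored.append(lm)
    return ET.tostring(root, encoding="unicode"), ignored


marked_xml, ignored = mark_untranslatable(GA_MONODIX, BIDIX, GD_MONODIX)
print(ignored)  # → ['madra', 'teach']
```

Here "madra" is ignored because its translation is missing from the Scottish Gaelic monolingual dictionary, and "teach" because it has no bilingual entry at all. Matching on lemma alone is a shortcut; a fuller version would probably compare part-of-speech tags as well.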