Difference between revisions of "User:Francis Tyers/Sandbox"
Revision as of 11:34, 7 October 2009
Lexical selection
Information
- Surface form -- tud etc.
- Lemma -- den etc.
- Category -- n.f etc.
- Syntax -- @SUBJ etc.
Ideas
- Inferring rules from collocations
- The bilingual dictionary has several translations for each ambiguous word.
- Rules are created to select between them based on context.
- For each ambiguous word in the bilingual dictionary, collocations (n-grams) containing it are extracted from a source language corpus, for example for "hús":
  - reisa þetta hús og fullgjöra
  - reisa þetta hús og fullgjöra
  - niður þetta hús Guðs í
  - gjört fyrir hús Guðs himnanna
  - inn í hús Semaja Delajasonar
- For each ambiguous word, every collocation is run through the translator once for each of its entries in the bilingual dictionary.
- The resulting translations are scored against a target language corpus.
- Where the difference in score between one translation and another reaches a threshold, a rule is created in the form of:
  - MAP (sense1) ("hús") IF (1 ("Guðs"));
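
The steps above can be sketched roughly as follows. This is only an illustration under several assumptions that are not in the notes: the "translator" is a word-for-word stand-in (`toy_translate`), the score is a simple count of target-language bigrams, the context word is always taken to be the token immediately to the right of the ambiguous word, and the glossary, sense labels, target corpus and threshold are toy values.

```python
from collections import Counter

def extract_collocations(tokens, target, window=2):
    """Collect n-grams of up to `window` tokens either side of each `target`."""
    return [tuple(tokens[max(0, i - window): i + window + 1])
            for i, tok in enumerate(tokens) if tok == target]

def bigram_counts(tokens):
    """Count bigrams in a (toy) target language corpus."""
    return Counter(zip(tokens, tokens[1:]))

def score(tokens, tl_bigrams):
    """Assumed scoring: sum of corpus counts of the translation's bigrams."""
    return sum(tl_bigrams[bg] for bg in zip(tokens, tokens[1:]))

def toy_translate(collocation, target, translation, glossary):
    """Hypothetical stand-in for the translator: word-for-word lookup,
    with the ambiguous word forced to one candidate translation."""
    out = []
    for tok in collocation:
        if tok == target:
            out.append(translation)
        else:
            out.extend(glossary.get(tok, tok).split())
    return out

def infer_rules(collocations, target, senses, glossary, tl_bigrams, threshold):
    """Emit a MAP rule when the best sense beats the runner-up by `threshold`.
    Expects at least two candidate senses for the ambiguous word."""
    rules = []
    for colloc in collocations:
        ranked = sorted(
            ((score(toy_translate(colloc, target, tr, glossary), tl_bigrams), tag)
             for tag, tr in senses.items()),
            reverse=True)
        (best, tag), (second, _) = ranked[0], ranked[1]
        idx = colloc.index(target)
        # Assumption: use the word at position +1 as the rule's context.
        if best - second >= threshold and idx + 1 < len(colloc):
            rule = 'MAP ({}) ("{}") IF (1 ("{}"));'.format(
                tag, target, colloc[idx + 1])
            if rule not in rules:
                rules.append(rule)
    return rules

# Toy data, for illustration only.
sl_tokens = "niður þetta hús Guðs í".split()
senses = {"sense1": "house", "sense2": "home"}
glossary = {"þetta": "this", "Guðs": "of God"}
tl_bigrams = bigram_counts("they built this house of God".split())

collocations = extract_collocations(sl_tokens, "hús", window=2)
rules = infer_rules(collocations, "hús", senses, glossary, tl_bigrams, threshold=2)
for rule in rules:
    print(rule)
```

In a real setup the stand-in translator and bigram count would be replaced by the actual translation pipeline and whatever target language scoring is chosen; the choice of context position and threshold would also need tuning.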