Difference between revisions of "Icelandic and English"
(→Notes) |
|||
(25 intermediate revisions by 3 users not shown) | |||
Line 2: | Line 2: | ||
==Pending tasks== |
==Pending tasks== |
||
* Try and convert some IceTagger constraint rules to work in [[constraint grammar]] |
|||
* Tag a corpus with IceTagger and train the <code>apertium-tagger</code> |
|||
* Post-edit automatically-generated bilingual dictionaries |
|||
* Use IceParser to parse a corpus and extract the most frequent patterns in terms of chunks/phrases (lists of coarse POS tags) and phrase patterns (in terms of chunks/phrases). |
|||
* Merge analysed corpus (IceMorphy full-form list) with Apertium dictionary — will require matching partial information to paradigms... perhaps use [[extract]] ? |
|||
⚫ | |||
== |
==Notes== |
||
* ind(is) → def(en): almenningur, alþjóð, alþýða, heimur, stjórnarandstaða, bæjarstjórn, Ermarsund, nefnifall, |
|||
* Mediawiki l10n, KDE4, OpenSubtitles, etc. — from OPUS (~60k sentences) |
|||
⚫ | |||
===Bilingual dictionaries=== |
===Bilingual dictionaries=== |
||
Line 21: | Line 19: | ||
** And [http://www.northvegr.org/vigfusson/index.php here] |
** And [http://www.northvegr.org/vigfusson/index.php here] |
||
* Wordbank [http://www.ismal.hi.is/ob/birta/index.cgi at ismal.hi.is] (licence unknown) |
* Wordbank [http://www.ismal.hi.is/ob/birta/index.cgi at ismal.hi.is] (licence unknown) |
||
** was moved to [http://herdubreid.rhi.hi.is:1026/wordbank/search here] |
|||
==Example phrase== |
==Example phrase== |
||
Line 39: | Line 38: | ||
[VPb er sfg3en VPb] |
[VPb er sfg3en VPb] |
||
{*COMP< [VPp borinn sþgken VPp] *COMP<} |
{*COMP< [VPp borinn sþgken VPp] *COMP<} |
||
{*COMP< [AP frjáls lkensf AP] *COMP<} |
{*COMP< [APs [AP frjáls lkensf AP] [CP og c CP] [AP jafn lkensf AP] APs] *COMP<} |
||
[CP og c CP] |
|||
[AdvP jafn aa AdvP] |
|||
[NP öðrum fokfþ NP] |
[NP öðrum fokfþ NP] |
||
[SCP að c SCP] |
[SCP að c SCP] |
||
Line 60: | Line 57: | ||
^prn_nom<SN><@SUBJ→>{^Hver<prn><ind><m><sg><nom>$ ^maður<n><m><sg><nom><ind>$}$ |
^prn_nom<SN><@SUBJ→>{^Hver<prn><ind><m><sg><nom>$ ^maður<n><m><sg><nom><ind>$}$ |
||
^verb<SV>{^vera<vbser><pri><p3><sg>$ ^bera<vblex><pp><m><sg><nom>$}$ |
^verb<SV>{^vera<vbser><pri><p3><sg>$ ^bera<vblex><pp><m><sg><nom>$}$ |
||
^adj_cc_adj<SA>{^frjáls<adj><sta><pst><m><sg><nom>$ ^og<cnjcoo>$ ^jafn<adj><sta><pst><m><sg><nom>$}$ |
|||
^nom<SN>{^annar<prn><ind><m><pl><dat>$}$ |
|||
^að<Prep>{^að<pr>$}$ |
|||
^nom_cc_nom{^virðing<n><f><sg><dat><def>$ ^og<cnjcoo>$ ^réttindi<n><nt><pl><dat><ind>$}$ |
|||
</pre> |
</pre> |
||
==See also== |
==See also== |
||
* [[/ |
* [[/Pending tests|Pending tests]] — Examples for testing new rules |
||
* [[/Regression tests|Regression tests]] — Examples of working phrase translations. |
|||
==External links== |
|||
* [http://iceblark.wordpress.com/translation/ is-en MT entry on the IceBLARK blog] |
|||
[[Category:Icelandic and English]] |
[[Category:Icelandic and English]] |
Latest revision as of 12:31, 17 June 2010
Pending tasks
- Try to convert some IceTagger constraint rules to work in constraint grammar
Notes
- ind(is) → def(en): almenningur, alþjóð, alþýða, heimur, stjórnarandstaða, bæjarstjórn, Ermarsund, nefnifall,
Resources
Bilingual dictionaries
- Wikipedia interwiki (~1,100 entries)
- Freelang (~1,000 entries)
- Wiktionary (en) (~3,200 entries)
- An Icelandic-English Dictionary (Old Icelandic, 1876 — Public Domain)
- Also available at http://www.northvegr.org/vigfusson/index.php
- Wordbank at ismal.hi.is (licence unknown)
- Moved to http://herdubreid.rhi.hi.is:1026/wordbank/search
Example phrase
- Hver maður er borinn frjáls og jafn öðrum að virðingu og réttindum.
IceFormat
Hver foken maður nken er sfg3en borinn sþgken frjáls lkensf og c jafn aa öðrum fokfþ að c virðingu nveþ og c réttindum nhfþ . .
{*SUBJ> [NP Hver foken maður nken NP] *SUBJ>} [VPb er sfg3en VPb] {*COMP< [VPp borinn sþgken VPp] *COMP<} {*COMP< [APs [AP frjáls lkensf AP] [CP og c CP] [AP jafn lkensf AP] APs] *COMP<} [NP öðrum fokfþ NP] [SCP að c SCP] [NPs [NP virðingu nveþ NP] [CP og c CP] [NP réttindum nhfþ NP] NPs]
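The flat tagged line in this section (the one without brackets) can be read as alternating word and tag fields. The following is a minimal Python sketch of that reading. It assumes the fields are separated by whitespace and always come in word/tag pairs, which is inferred from the example only and not from the IceNLP documentation. The function name `parse_tagged_line` is hypothetical.

```python
def parse_tagged_line(line):
    """Split a flat 'word tag word tag ...' line into (word, tag) pairs.

    Assumption: fields are whitespace-separated and strictly alternate
    between a word form and its tag, as in the example phrase.
    """
    parts = line.split()
    if len(parts) % 2 != 0:
        raise ValueError("expected an even number of whitespace-separated fields")
    # Even positions hold word forms, odd positions hold their tags.
    return list(zip(parts[0::2], parts[1::2]))


sentence = (
    "Hver foken maður nken er sfg3en borinn sþgken frjáls lkensf og c "
    "jafn aa öðrum fokfþ að c virðingu nveþ og c réttindum nhfþ . ."
)
pairs = parse_tagged_line(sentence)
for word, tag in pairs:
    print(f"{word}\t{tag}")
```

The sketch does not handle the bracketed chunk line; that would need a separate parser for the `[NP ... NP]` and `{*SUBJ> ... *SUBJ>}` markers.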
Apertium
^Hver<prn><ind><m><sg><nom>$ ^maður<n><m><sg><nom><ind>$ ^vera<vbser><pri><p3><sg>$ ^bera<vblex><pp><m><sg><nom>$ ^frjáls<adj><sta><pst><m><sg><nom>$ ^og<cnjcoo>$ ^jafn<adj><sta><pst><m><sg><nom>$ ^annar<prn><ind><m><pl><dat>$ ^að<pr>$ ^virðing<n><f><sg><dat><def>$ ^og<cnjcoo>$ ^réttindi<n><nt><pl><dat><ind>$ ^.<sent>$
^prn_nom<SN><@SUBJ→>{^Hver<prn><ind><m><sg><nom>$ ^maður<n><m><sg><nom><ind>$}$ ^verb<SV>{^vera<vbser><pri><p3><sg>$ ^bera<vblex><pp><m><sg><nom>$}$ ^adj_cc_adj<SA>{^frjáls<adj><sta><pst><m><sg><nom>$ ^og<cnjcoo>$ ^jafn<adj><sta><pst><m><sg><nom>$}$ ^nom<SN>{^annar<prn><ind><m><pl><dat>$}$ ^að<Prep>{^að<pr>$}$ ^nom_cc_nom{^virðing<n><f><sg><dat><def>$ ^og<cnjcoo>$ ^réttindi<n><nt><pl><dat><ind>$}$
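To compare the Apertium analysis with the IceNLP tags, it can help to split the flat analysis line into lemmas and tag lists. The following is a minimal Python sketch under stated assumptions: each lexical unit has the shape `^lemma<tag>...$`, with no ambiguity separators and no escaped characters. The names `LEXICAL_UNIT`, `TAG` and `parse_stream` are hypothetical and are not part of any Apertium tool.

```python
import re

# One lexical unit: '^', a lemma without '<' or '$', zero or more <tag>s, '$'.
# Assumption: no '/'-separated ambiguity and no backslash escapes.
LEXICAL_UNIT = re.compile(r"\^([^<$]+)((?:<[^>]+>)*)\$")
TAG = re.compile(r"<([^>]+)>")


def parse_stream(text):
    """Return a list of (lemma, [tags]) for each lexical unit found in text."""
    return [
        (match.group(1), TAG.findall(match.group(2)))
        for match in LEXICAL_UNIT.finditer(text)
    ]


sample = (
    "^Hver<prn><ind><m><sg><nom>$ ^maður<n><m><sg><nom><ind>$ "
    "^vera<vbser><pri><p3><sg>$ ^.<sent>$"
)
units = parse_stream(sample)
for lemma, tags in units:
    print(lemma, tags)
```

The sketch targets the flat line only. The chunked line, with its `{ ... }` groups and chunk-level tags such as `<SN>`, is not modelled here.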
See also
- Pending tests — Examples for testing new rules
- Regression tests — Examples of working phrase translations.