Difference between revisions of "User:Capsot/GSOC 2018 Data"

From Apertium
(Created page with "{|class="wikitable" | ||'''Disambiguation Rules''' || '''Lexical Selection Rules''' || '''Transfer Rules''' || '''Bidix''' || '''Coverage''' || '''WER''' |- |fra-oci|| 678 || ...")
 
== Statistics ==
{|class="wikitable"
| || '''Bidix''' || '''Coverage''' || '''WER'''
|-
|fra-oci || 41,000 (27,000 without family names) || 92.3% || 10%
|-
|oci-fra || 41,000 (27,000 without family names) || 92.9% || Not calculated (due to non-functional morphological disambiguator)
|}
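
For readers unfamiliar with the two metrics in the table, the following is a minimal Python sketch of how such figures are commonly computed. It is not the script used to obtain the numbers above. It assumes that WER is the word-level edit distance between a machine translation and its post-edited reference divided by the reference length, and that coverage is the share of tokens known to the analyser. The function names <code>word_error_rate</code> and <code>coverage</code> and the sample data are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def coverage(tokens, known_words):
    """Share of tokens found in the known vocabulary (assumed definition)."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)


if __name__ == "__main__":
    # Hypothetical sample: one substituted word out of ten.
    print(word_error_rate("a b c d e f g h i j", "a b c d e f g h i x"))
    # Hypothetical sample: three of four tokens are known.
    print(coverage(["lo", "gat", "xyz", "es"], {"lo", "gat", "es"}))
```

Under these assumptions, a WER of 10% corresponds to roughly one word in ten needing correction in the post-edited text.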

== Rules ==

{|class="wikitable"
| ||'''Disambiguation Rules''' || '''Lexical Selection Rules''' || '''Transfer Rules'''
|-
|fra-oci|| 678 || 855 (85 without counting the selection rules for anthroponyms) || t1x: 130, t2x: 14, t3x: 1, t4x: 5 (for punctuation)
|-
|oci-fra|| 87 || 928 (151 without counting the selection rules for anthroponyms) || t1x: 51, t2ax: 21 (mostly for inserting or omitting the subject pronoun and for agreement between subject and attribute), t2bx: 4 (whether to include the adverb "ne" in negation), t2cx: 1 (for the partitive article after a verb), t3x: 2, t4x: 5 (for punctuation)
|}
PS: (oci-fra) A corpus of 14,000 words was manually disambiguated in order to train a morphological disambiguator, but we could not get a working prob file.

Revision as of 22:37, 14 August 2018
