Difference between revisions of "User:Gourab337/GSoC2021-Workplan-Control"

From Apertium

Latest revision as of 10:30, 26 July 2021

Workplan
{|
! rowspan="2" | Week
! rowspan="2" | Dates
! colspan="5" | Goals
! colspan="8" | Fulfilled
|-
! Bidix<br>(excluding proper names)
! Coverage
! WER
! Monolingual dictionaries
! Bilingual dictionary / repository
! ben monodix<br>(excl. proper names)
! Bidix<br>(excl. proper names)
! Non-WP<br>coverage (%)
! WP<br>coverage (%)
! WER<br>(%)
! Testvoc<br>(clean %)
! ---
! Manual disamb.<br>(words)
|-
|1
|06/13/2021
|500
|
|
|apertium-ben:<br>
Main paradigms: n, adj, vblex, vbser, adv, pr, post, cnjcoo, cnjsub, cnjadv, det, num, prn<br>
Add/check words: pr, post, cnjcoo, cnjsub, cnjadv, det, num, prn
|pr, post, cnjcoo, cnjsub, cnjsub, num, det, prn
|6603
|756
|
|hin-ben: ~33.3%<br>
ben-hin: ~20.4%<br>
ben: ~67.9%
|
|
|
|
|-
|2
|06/20/2021
|500
|
|
|
|Preparing scripts for adding words from the available free data to the dictionaries
|6637
|895
|
|hin-ben: ~40.1%<br>
ben-hin: ~29.7%<br>
ben: ~67.7%
|
|
|
|
|-
|3
|06/27/2021
|500
|
|
|
|Key transfer rules hin > ben to avoid #<br>
Eventually: the same for ben > hin<br>
Manual disambiguation of Hindi texts
|6640
|931
|
|hin-ben: ~39.5%<br>
ben-hin: ~34.0%<br>
ben: ~69.9%
|
|
|
|
|-
|4
|07/04/2021
|500
|
|
|
|Manual disambiguation of Hindi texts
|6687
|917
|
|hin-ben: ~44.5%<br>
ben-hin: ~39.3%<br>
ben: ~70.0%
|
|
|
|
|-
|5
|07/11/2021
|800
|
|
|apertium-ben:<br>
ordinals<br>
Manually adding the most frequent names (150), adjectives (100), verbs (50)
|ordinals<br>
Most frequent names (150), adjectives (100), verbs (50)<br>
Word selection rules
|6764
|1136
|
|hin-ben: ~63.2%<br>
ben-hin: ~43.4%<br>
ben: ~71.0%
|
|
|
|
|-
|6
|07/18/2021
|5000
|
|
|Adding words from available data
|Adding words from available data<br>
Word selection rules
|6984
|1328
|
|hin-ben: ~65.5%<br>
ben-hin: ~47.6%<br>
ben: ~71.8%
|
|
|
|
|-
|7
|07/25/2021
|10000
|~80%
|hin-ben ~50%
|Adding words from available data
|Adding words from available data<br>
Word selection rules
|7075
|1670
|
|hin-ben: ~67.6%<br>
ben-hin: ~49.6%<br>
ben: ~72.0%
|
|
|
|
|-
|8
|08/01/2021
|10100
|
|
|Morphological disambiguation rules for Hindi
|Transfer rules<br>
Testvoc: closed categories, adv
|
|
|
|
|
|
|
|
|-
|9
|08/08/2021
|10200
|
|
|Morphological disambiguation rules for Hindi
|Transfer rules<br>
Testvoc: adj
|
|
|
|
|
|
|
|
|-
|10
|08/15/2021
|10300
|
|
|Morphological disambiguation rules for Hindi
|Transfer rules<br>
Testvoc: n
|
|
|
|
|
|
|
|
|-
|11
|08/22/2021
|10400
|~80%
|hin-ben ~65%
|Morphological disambiguation rules for Hindi
|Transfer rules<br>
Testvoc: vblex
|
|
|
|
|
|
|
|
|}
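
The coverage columns report the share of tokens that the analyser recognises. As a rough illustration only, and not necessarily how the figures in this workplan were obtained, the sketch below computes naive coverage from analyser output in Apertium stream format, where an unknown word appears as `^word/*word$`. The function name `naive_coverage` and the sample string are hypothetical, and the sketch ignores escaped `^` and `$` characters.

```python
import re

# One lexical unit in Apertium stream format looks like ^surface/analysis1/analysis2$.
# Assumption: escaped \^ and \$ inside units are not handled by this simple pattern.
LEXICAL_UNIT = re.compile(r"\^(.*?)\$")


def naive_coverage(analysed_text: str) -> float:
    """Return the percentage of lexical units that received at least one analysis."""
    units = LEXICAL_UNIT.findall(analysed_text)
    if not units:
        return 0.0
    # The analyser marks an unknown word by prefixing its single "analysis" with '*'.
    unknown = sum(1 for unit in units if "/*" in unit)
    return 100.0 * (len(units) - unknown) / len(units)


# Hypothetical analyser output: four lexical units, one of them unknown.
sample = "^cat/cat<n><sg>$ ^xyzzy/*xyzzy$ ^sat/sit<vblex><past>$ ^./.<sent>$"
print(f"coverage: {naive_coverage(sample):.1f}%")
```

In practice the input would be the output of the language's morphological analyser run over a corpus; the sample string here only stands in for that output.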