User:Gourab337/GSoC2021-Workplan-Control
Latest revision as of 18:37, 3 August 2021
Workplan

Week | Dates | Goal: Bidix (excl. proper names) | Goal: Coverage | Goal: WER | Goal: Monolingual dictionaries | Goal: Bilingual dictionary / repository | Fulfilled: ben monodix (excl. proper names) | Fulfilled: Bidix (excl. proper names) | Fulfilled: Non-WP coverage (%) | Fulfilled: WP coverage (%) | Fulfilled: WER (%) | Fulfilled: Testvoc (clean %) / Manual disamb. (words) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 06/13/2021 | 500 | | | apertium-ben: Main paradigms: n, adj, vblex, vbser, adv, pr, post, cnjcoo, cnjsub, cnjadv, det, num, prn; Add/check words: pr, post, cnjcoo, cnjsub, cnjadv, det, num, prn | pr, post, cnjcoo, cnjsub, cnjsub, num, det, prn | 6603 | 756 | | hin-ben: ~33.3%; ben-hin: ~20.4%; ben: ~67.9% | | |
2 | 06/20/2021 | 500 | | | | Preparing scripts for adding words from the available free data to the dictionaries | 6637 | 895 | | hin-ben: ~40.1%; ben-hin: ~29.7%; ben: ~67.7% | | |
3 | 06/27/2021 | 500 | | | | Key transfer rules hin > ben to avoid #; eventually the same for ben > hin; manual disambiguation of Hindi texts | 6640 | 931 | | hin-ben: ~39.5%; ben-hin: ~34.0%; ben: ~69.9% | | |
4 | 07/04/2021 | 500 | | | | Manual disambiguation of Hindi texts | 6687 | 917 | | hin-ben: ~44.5%; ben-hin: ~39.3%; ben: ~70.0% | | |
5 | 07/11/2021 | 800 | | | apertium-ben: ordinals; manual adding of the most frequent names (150), adjectives (100), verbs (50) | Ordinals; most frequent names (150), adjectives (100), verbs (50); word selection rules | 6764 | 1136 | | hin-ben: ~63.2%; ben-hin: ~43.4%; ben: ~71.0% | | |
6 | 07/18/2021 | 5000 | | | Adding words from available data | Adding words from available data; word selection rules | 6984 | 1328 | | hin-ben: ~65.5%; ben-hin: ~47.6%; ben: ~71.8% | | |
7 | 07/25/2021 | 10000 | ~80% | hin-ben: ~50% | Adding words from available data | Adding words from available data; word selection rules | 7075 | 1670 | | hin-ben: ~67.6%; ben-hin: ~49.6%; ben: ~72.0% | | |
8 | 08/01/2021 | 10100 | | | Morphological disambiguation rules for Hindi | Transfer rules; Testvoc: closed categories, adv | 7078 | 1718 | | hin-ben: ~67.8%; ben-hin: ~49.7%; ben: ~72.0% | | |
9 | 08/08/2021 | 10200 | | | Morphological disambiguation rules for Hindi | Transfer rules; Testvoc: adj | | | | | | |
10 | 08/15/2021 | 10300 | | | Morphological disambiguation rules for Hindi | Transfer rules; Testvoc: n | | | | | | |
11 | 08/22/2021 | 10400 | ~80% | hin-ben: ~65% | Morphological disambiguation rules for Hindi | Transfer rules; Testvoc: vblex | | | | | | |
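
The table does not say how its coverage percentages were measured. One common approach, assumed here purely for illustration, is naive coverage: the share of tokens in a corpus for which the morphological analyser returns at least one analysis. In the Apertium stream format an unknown word is output with an analysis that starts with `*`, so a minimal sketch can count those. The function name `naive_coverage` and the sample tokens and tags are hypothetical, and escaped characters inside lexical units are not handled.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# Simplification: escaped characters such as "\/" are not handled.
LEXICAL_UNIT = re.compile(r"\^([^/^$]*)/([^$]*)\$")


def naive_coverage(analysed: str) -> float:
    """Return the percentage of lexical units with at least one analysis."""
    units = LEXICAL_UNIT.findall(analysed)
    if not units:
        return 0.0
    # The analyser marks an unknown word by prefixing its analysis with "*".
    known = sum(1 for _surface, analyses in units if not analyses.startswith("*"))
    return 100.0 * known / len(units)


# Hypothetical analyser output: two known words and two unknown words.
sample = "^the/the<det>$ ^xyzzy/*xyzzy$ ^book/book<n><sg>$ ^plugh/*plugh$"
print(f"{naive_coverage(sample):.1f}%")  # prints "50.0%"
```

Run over the analyser output for a whole corpus, a function like this gives one figure per language or direction, comparable in kind to the weekly values in the table.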