User:Ifeanyi/GSoC2021 Final Report
Latest revision as of 15:47, 20 August 2021
Summary
The goal of this project was to develop a morphological analyser for the English-Igbo language pair and to produce a usable version that gives intelligible output. After discussing with my mentors how to make the most of the Summer of Code, we decided to improve the coverage of the Igbo (ibo) monolingual package as much as possible.
Main Work
Most of the work collected by the end of the GSoC program can be found here: https://apertium.projectjj.com/gsoc2021/ifeanyijasper.html .
Most of the work on the ibo language went into its monodix. This consisted of adding stems to the dictionaries. I was able to expand the Igbo analyser from a prototype to one with wide coverage (although it is still not production-ready).
ibo morphological analyser coverage
Corpus | Words | Coverage before | Coverage after |
---|---|---|---|
Wikipedia | 511550 | 19.09% | 68.52% |
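The report does not say how the coverage figures were measured. As a minimal sketch, assuming the usual "naive coverage" definition (the share of tokens that receive at least one analysis) and assuming the analyser output is in Apertium stream format, where an unknown word is marked with a leading `*` in its analysis, the calculation could look like this. The function name and the sample analyses are hypothetical, and escaped characters in the stream are not handled.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# Simplification (assumption): escaped characters such as \/ are not handled.
LEXICAL_UNIT = re.compile(r"\^([^/$^]*)/([^$]*)\$")


def naive_coverage(analysed_text: str) -> float:
    """Return the fraction of lexical units that have at least one analysis."""
    total = 0
    known = 0
    for _surface, analyses in LEXICAL_UNIT.findall(analysed_text):
        total += 1
        # Unknown words are emitted as ^word/*word$, so a leading "*" means
        # the analyser had no entry for this token.
        if not analyses.startswith("*"):
            known += 1
    return known / total if total else 0.0


# Hypothetical analyser output: two known tokens and one unknown token.
sample = "^ulo/ulo<n>$ ^xyz/*xyz$ ^na/na<pr>/na<cnjcoo>$"
sample_coverage = naive_coverage(sample)
print(f"{sample_coverage:.2%}")
```

Run over the analysed Wikipedia corpus, a function like this would yield a single percentage comparable to the "Coverage before" and "Coverage after" columns, provided the same tokenisation is used for both measurements.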
ibo lexicon sizes before
Lexicons | Lexicon entries | Patterns | Pattern entries |
---|---|---|---|
20 | 326 | 1 | 10 |
ibo lexicon sizes after
Lexicons | Lexicon entries | Patterns | Pattern entries |
---|---|---|---|
31 | 949 | 4 | 20 |
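To make the change between the two lexicon tables easier to read, the short sketch below computes the absolute and relative growth of each column. The numbers are copied from the tables; the variable and function names are only illustrative.

```python
# Values copied from the "before" and "after" lexicon size tables.
before = {"Lexicons": 20, "Lexicon entries": 326, "Patterns": 1, "Pattern entries": 10}
after = {"Lexicons": 31, "Lexicon entries": 949, "Patterns": 4, "Pattern entries": 20}


def growth(old: dict, new: dict) -> dict:
    """Map each column to (entries added, growth factor)."""
    return {name: (new[name] - old[name], new[name] / old[name]) for name in old}


summary = growth(before, after)
for name, (added, ratio) in summary.items():
    print(f"{name}: +{added} ({ratio:.2f}x)")
```

For example, the number of lexicon entries grew by 623, which is roughly a threefold increase.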
Future Work
- Add more stems to the ibo monolingual dictionary.
- Add transfer rules, etc.
- Improve the eng-ibo bidix.
Conclusion
Working with Apertium over the past ten weeks has been a great experience for me. Whenever I faced an obstacle, I could get a solution or an explanation from the community. I would like to thank the whole Apertium community, and specifically my mentors, Jonathan Washington, Mikel L. Forcada, and Nick Howell, for their support, their mentorship, and for pointing me in the right direction.