User:Ifeanyi/GSoC2021 Final Report

From Apertium

Latest revision as of 15:47, 20 August 2021

Summary

The goal of this project was to develop a morphological analyser for the English-Igbo language pair and to produce a usable version that gives intelligible output. After discussing with my mentors how to make the most of Summer of Code, we decided to improve the coverage of the Ibo monolingual package as much as possible.

Main Work

Most of the work collected by the end of the GSoC program can be found here: https://apertium.projectjj.com/gsoc2021/ifeanyijasper.html

Most of the work on the ibo language went into its monodix. This consisted of adding stems to the dictionaries. I was able to expand the Igbo analyser from a prototype to one with wide coverage (although it is still not production-ready).

ibo morphological analyser coverage

{| class="wikitable"
|-
! Corpus
! Words
! Coverage before
! Coverage after
|-
| Wikipedia
| 511550
| 19.09%
| 68.52%
|}
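
A coverage figure like the ones above is usually obtained by running a corpus through the analyser and counting how many tokens receive at least one analysis. The following is a minimal sketch of that count, assuming the analyser output is in the Apertium stream format, where each token appears as `^surface/analysis$` and an unknown token comes back as `^word/*word$`. The function name `naive_coverage` and the sample string are hypothetical illustrations, not taken from the project, and the sketch ignores escaped characters in the stream.

```python
import re

# One lexical unit in the Apertium stream format: ^surface/analysis1/analysis2$
# Simplification: escaped characters such as \^ or \$ are not handled.
LEXICAL_UNIT = re.compile(r"\^([^$]*)\$")


def naive_coverage(analysed_text: str) -> float:
    """Return the fraction of lexical units that have at least one analysis."""
    total = 0
    known = 0
    for match in LEXICAL_UNIT.finditer(analysed_text):
        parts = match.group(1).split("/")
        total += 1
        # An unknown token has a single "analysis" that starts with '*'.
        if len(parts) > 1 and not parts[1].startswith("*"):
            known += 1
    return known / total if total else 0.0


# Hypothetical analyser output: two analysed tokens, two unknown tokens.
sample = "^ulo/ulo<n>$ ^xyz/*xyz$ ^na/na<pr>$ ^abc/*abc$"
print(f"{naive_coverage(sample):.2%}")  # prints 50.00%
```

If the percentages in the table are token-level figures of this kind, 19.09% and 68.52% of 511550 words correspond to roughly 97,655 and 350,514 analysed tokens respectively.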


ibo lexicon sizes before

{| class="wikitable"
|-
! Lexicons
! Lexicon entries
! Patterns
! Pattern entries
|-
| 20
| 326
| 1
| 10
|}

ibo lexicon sizes after

{| class="wikitable"
|-
! Lexicons
! Lexicon entries
! Patterns
! Pattern entries
|-
| 31
| 949
| 4
| 20
|}

Future Work

  • Add more stems to the ibo monolingual dictionary.
  • Add transfer rules, etc.
  • Improve the eng-ibo bidix.


Conclusion

It has been a great experience working with Apertium over the past ten weeks. Whenever I faced an obstacle, I could get a solution or an explanation from the community. I would like to thank the whole Apertium community and, specifically, my mentors, Jonathan Washington, Mikel L. Forcada, and Nick Howell, for their support, their mentorship, and for pointing me in the right direction.