User:Ifeanyi/GSoC2021 Final Report

Summary

The goal of this project was to develop a morphological analyser for the English-Igbo language pair and to produce a usable version that gives intelligible output. After discussing with my mentors how to make the most of Summer of Code, we decided to focus on improving the coverage of the ibo monolingual package as much as possible.

Main Work

Most of the work collected at the end of the GSoC program can be found here: https://apertium.projectjj.com/gsoc2021/ifeanyijasper.html

Most of the work on the ibo language went into its monodix, which consisted of adding stems to the dictionaries. This expanded the Igbo analyser from a prototype to one with wide coverage (although it is still not production-ready).

ibo morphological analyser coverage

Corpus      Words     Coverage before   Coverage after
Wikipedia   511550    19.09%            68.52%
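
For readers unfamiliar with how a coverage figure like the one above is obtained, here is a minimal sketch of a naive coverage calculation. It assumes the analyser output is in the Apertium stream format, where each token appears as ^surface/analysis$ and an unknown token has an analysis starting with *. The function name, the sample string, and the <x> tag are hypothetical, escaped characters are not handled, and this is not necessarily the script used to produce the numbers in this report.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# Naive pattern: does not handle escaped ^, $ or / inside a unit.
LEXICAL_UNIT = re.compile(r"\^(.*?)\$")


def coverage(analysed_text: str) -> float:
    """Return the percentage of tokens that received at least one analysis."""
    units = LEXICAL_UNIT.findall(analysed_text)
    if not units:
        return 0.0
    known = 0
    for unit in units:
        parts = unit.split("/")
        # An unknown token looks like ^word/*word$ (analysis starts with '*').
        if len(parts) > 1 and not parts[1].startswith("*"):
            known += 1
    return 100.0 * known / len(units)


# Hypothetical analyser output: three analysed tokens and one unknown token.
sample = "^a/a<x>$ ^b/b<x>$ ^c/c<x>$ ^d/*d$"
print(f"{coverage(sample):.2f}%")
```

Counting tokens rather than unique word forms means frequent words weigh more heavily, which is the usual convention when coverage is reported over a corpus.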


ibo lexicon sizes before

Lexicons   Lexicon entries   Patterns   Pattern entries
20         326               1          10

ibo lexicon sizes after

Lexicons   Lexicon entries   Patterns   Pattern entries
31         949               4          20

Future Work

  • Add more stems to the ibo monolingual dictionary.
  • Add transfer rules, etc.
  • Improve the eng-ibo bidix.


Conclusion

Working with Apertium over the past ten weeks has been a great experience for me. Whenever I faced an obstacle, I could get a solution or an explanation from the community. I would like to thank the whole Apertium community, and specifically my mentors, Jonathan Washington, Mikel L. Forcada, and Nick Howell, for their support, their mentorship, and for pointing me in the right direction.