Difference between revisions of "Ideas for Google Summer of Code/Morphological analyser"
(write a basic summary of morphanalyser project)
Latest revision as of 15:27, 5 April 2021
Implement a transducer-based morphological analyser/generator for a new language.
Tentative requirements:
- a description of your language's morphology
- a dictionary (a digital one helps; 10k lexemes expected)
- a corpus of text (50k+ tokens that can be made public)
These can be adjusted with the consent of the mentor.
Coding challenge:
Take an excerpt of 200-300 tokens from your corpus and implement an analyser for it. The analyser should completely analyse at least one sentence, and you should aim for as close to complete coverage of the excerpt as possible.
Present your coding challenge on IRC or on the mailing list and ask for feedback.
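The two targets in the coding challenge (at least one completely analysed sentence, and coverage of the excerpt as close to complete as possible) can be checked with a small script. The following is a minimal sketch only: the lexicon, the tag notation, and the function names `analyse`, `tokenise`, `coverage` and `fully_analysed` are all hypothetical stand-ins, and a real submission would call the compiled transducer (built with, for example, lttoolbox or HFST) instead of a Python dictionary.

```python
import re

# Hypothetical toy lexicon standing in for a compiled transducer:
# surface form -> list of analyses (lemma plus illustrative tags).
TOY_LEXICON = {
    "the": ["the<det>"],
    "cat": ["cat<n><sg>"],
    "cats": ["cat<n><pl>"],
    "sleeps": ["sleep<v><pres><p3><sg>"],
}


def analyse(token):
    """Return all analyses for a token, or an empty list if it is unknown."""
    return TOY_LEXICON.get(token.lower(), [])


def tokenise(text):
    """Naive tokeniser: keep runs of word characters, drop punctuation."""
    return re.findall(r"\w+", text)


def coverage(tokens):
    """Naive coverage: share of tokens that receive at least one analysis."""
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if analyse(token))
    return known / len(tokens)


def fully_analysed(sentence):
    """True if every token in the sentence receives at least one analysis."""
    tokens = tokenise(sentence)
    return bool(tokens) and all(analyse(token) for token in tokens)


if __name__ == "__main__":
    excerpt = "The cat sleeps. The dog barks."
    print(f"coverage: {coverage(tokenise(excerpt)):.1%}")
    print("first sentence fully analysed:", fully_analysed("The cat sleeps."))
```

Note that this measures naive coverage, which counts a token as covered when it gets any analysis at all, so it says nothing about whether the analyses are correct.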