Ideas for Google Summer of Code/Easy dictionary maintenance

This project involves building an application for maintaining Apertium dictionaries. The application parses the dictionaries and reads the open-class (noun, adjective, verb) single-word entries, which are amenable to simple, database-like treatment, while setting aside the remaining, hard-to-treat part of the dictionaries. It then lets the user easily add words, together with their inflection paradigms, through a friendly user interface. Finally, it combines the extended single-word data with the remaining data, without loss of formatting information (e.g. XML comments), into Apertium monolingual and bilingual dictionaries ready to be compiled.
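As a rough illustration of the parse, extend and recombine cycle described above, here is a minimal Python sketch. It assumes a tiny, made-up monolingual dictionary fragment (`SAMPLE_DIX`), and the helper names `parse_keeping_comments` and `add_word` are hypothetical, not part of any Apertium tool. It relies on the standard library's `xml.etree.ElementTree` with `insert_comments=True` (Python 3.8 or later), which keeps XML comments inside the root element. Comments outside the root element and the exact original whitespace would need extra handling in a real tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily reduced monolingual dictionary used only for this sketch.
SAMPLE_DIX = """<dictionary>
  <pardefs>
    <pardef n="house__n">
      <e><p><l/><r><s n="n"/><s n="sg"/></r></p></e>
      <e><p><l>s</l><r><s n="n"/><s n="pl"/></r></p></e>
    </pardef>
  </pardefs>
  <section id="main" type="standard">
    <!-- common nouns -->
    <e lm="house"><i>house</i><par n="house__n"/></e>
  </section>
</dictionary>"""


def parse_keeping_comments(text):
    """Parse dictionary XML, keeping comments inside the root element."""
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    return ET.fromstring(text, parser=parser)


def add_word(root, lemma, paradigm):
    """Append a single-word entry that inflects like an existing paradigm.

    Assumes the whole lemma is the stem, as with paradigms named like
    "house__n"; paradigms that split the lemma would need a stem argument.
    """
    known = {pardef.get("n") for pardef in root.iter("pardef")}
    if paradigm not in known:
        raise ValueError("unknown paradigm: " + paradigm)
    section = root.find("section")
    entry = ET.SubElement(section, "e", {"lm": lemma})
    ET.SubElement(entry, "i").text = lemma
    ET.SubElement(entry, "par", {"n": paradigm})
    return entry


root = parse_keeping_comments(SAMPLE_DIX)
add_word(root, "horse", "house__n")
# Serialise the extended dictionary; the comment from the input is still there.
extended_dix = ET.tostring(root, encoding="unicode")
```

Checking the paradigm name before inserting is a design choice for the sketch: it keeps the user interface from producing entries that would fail at compile time.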

Ideas and code from Apertium-dixtools could be useful. The interface for adding new words could be a web application.

It would also be interesting to do this with MediaWiki somehow. For example, set up a MediaWiki installation where "paradigms" are templates, "categories" are sections and "articles/pages" are words. In MediaWiki, templates can be applied recursively, just as paradigms can. Both import and export would be needed.
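One possible reading of this mapping, sketched in Python under stated assumptions: each word becomes a page titled with its lemma, the page body calls a template named after the paradigm, and the dictionary section is recorded as a MediaWiki category link. The page layout, the `lemma=` template parameter and the function names `export_word` and `import_word` are all hypothetical; the sketch only shows that export and import can round-trip a simple entry.

```python
import re

# Matches the hypothetical page layout produced by export_word below.
PAGE_PATTERN = re.compile(
    r"\{\{(?P<paradigm>[^|{}]+)\|lemma=(?P<lemma>[^|{}]+)\}\}\s*"
    r"\[\[Category:(?P<section>[^\]]+)\]\]"
)


def export_word(lemma, paradigm, section):
    """Return (page title, wikitext body) for one dictionary entry."""
    body = (
        "{{" + paradigm + "|lemma=" + lemma + "}}\n"
        "[[Category:" + section + "]]\n"
    )
    return lemma, body


def import_word(body):
    """Recover (lemma, paradigm, section) from a page body."""
    match = PAGE_PATTERN.search(body)
    if match is None:
        raise ValueError("page does not follow the expected layout")
    return match.group("lemma"), match.group("paradigm"), match.group("section")


title, body = export_word("horse", "house__n", "main")
entry = import_word(body)
```

Multi-word entries and anything else outside this simple layout would still have to be carried around the wiki unchanged, as with the hard-to-treat part of the dictionaries.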