Webforms

From Apertium
Revision as of 11:48, 10 October 2007 by 81.2.93.121

Some notes on a web interface for adding vocabulary to the dictionaries.

The main difficulty is that one format (i.e. one set of boxes) may not suit all languages.

Assuming "flavours" of the basic model, for Welsh the relevant things that occur to me might be:

POS: Specifying this at the beginning would select rules or boxsets for the subsequent entries, ideally using Ajax (or at least CSS if done manually) to hide those that aren't needed. Ideally, this would be set to remember the choice, so that you don't have to keep unticking the checkbox for "noun" when you want to do an "adjective".

Welsh word; (for nouns) Welsh word plural: A routine behind the scenes would "subtract" one from the other, i.e. find where the two forms diverge, to get the split-point.

English meaning; English clarification

Gender

(for adjectives) Degree

(for prepositions) Inflected forms
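
The "subtract one from the other" routine for nouns could be sketched roughly as follows. This is only one possible reading, assuming the split-point is the end of the longest prefix shared by the singular and the plural; the function name `split_point` is made up for illustration.

```python
def split_point(singular: str, plural: str) -> tuple[str, str, str]:
    """Return (stem, singular_ending, plural_ending).

    Assumption: the split-point is where the two forms first differ,
    i.e. the stem is their longest common prefix.
    """
    i = 0
    limit = min(len(singular), len(plural))
    while i < limit and singular[i] == plural[i]:
        i += 1
    return singular[:i], singular[i:], plural[i:]


# cath / cathod (cat / cats): the plural just adds an ending
print(split_point("cath", "cathod"))
# plentyn / plant (child / children): the forms diverge after "pl"
print(split_point("plentyn", "plant"))
```

A character-by-character comparison like this can land in the middle of a Welsh digraph such as "ll" or "dd", so a real routine would probably need to compare letters rather than characters.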

One addition might be to have a "jumbo" textfield, where you can put in the info in a set sequence, separated by tabs or commas, so you don't have to move from field to field. This might also allow the entry of multiword phrases (set phrases).

This would also allow a text file in that format to be uploaded and entered automatically.
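
As a rough sketch of how the "jumbo" field and an uploaded text file could share one parser: the field order in `FIELDS` below is purely an assumption (these notes do not fix a sequence), and `parse_jumbo` is a hypothetical name.

```python
import csv
import io

# Hypothetical field order for one "jumbo" line; not specified in these notes.
FIELDS = ["pos", "welsh", "plural", "english", "clarification", "gender"]


def parse_jumbo(text: str, delimiter: str = "\t") -> list[dict[str, str]]:
    """Parse tab- or comma-separated lines into one dict per entry."""
    entries = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        cells = [cell.strip() for cell in row]
        if not any(cells):
            continue  # skip blank lines
        # Pad short rows so every entry has all the keys.
        cells += [""] * (len(FIELDS) - len(cells))
        entries.append(dict(zip(FIELDS, cells)))
    return entries


sample = "n\tcath\tcathod\tcat\t\tf\nadj\tmawr\t\tbig\n"
for entry in parse_jumbo(sample):
    print(entry)
```

Because each cell is taken whole, a multiword phrase such as "ci bach" passes through unchanged. The same function could be fed the contents of an uploaded file.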

Once entered, the data should be held in a list (together with any deduced additional info that the backend has come up with - see below), so that each entry can be looked at again and edited if necessary. Once the user is content with the list, pressing one button would then upload the checked entries to the "live" dictionary.
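
The review list might be modelled along these lines. This is a minimal sketch; the class `PendingEntry`, its fields and `commit_checked` are all invented for illustration, and the "live" dictionary is stood in for by a plain list.

```python
from dataclasses import dataclass, field


@dataclass
class PendingEntry:
    data: dict[str, str]                                    # what the user typed
    deduced: dict[str, str] = field(default_factory=dict)   # what the backend worked out
    checked: bool = False                                   # ticked by the user when content


def commit_checked(pending: list[PendingEntry], live: list[dict[str, str]]) -> list[PendingEntry]:
    """Append checked entries to `live`; return the entries still awaiting review."""
    remaining = []
    for entry in pending:
        if entry.checked:
            live.append({**entry.data, **entry.deduced})
        else:
            remaining.append(entry)
    return remaining


live: list[dict[str, str]] = []
pending = [
    PendingEntry({"welsh": "cath"}, {"stem": "cath"}, checked=True),
    PendingEntry({"welsh": "plentyn"}),
]
pending = commit_checked(pending, live)
print(len(live), len(pending))
```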

The behind-the-scenes stuff is going to have to be fairly complex, because of the nature of the .dix files. Apart from working out the split-point, we will need to select a suitable paradigm, and (for nouns, adjectives and verbs) apply a routine for mutation.
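
To give an idea of what a mutation routine involves, here is a minimal sketch of the soft mutation only, using the standard initial-consonant changes (p→b, t→d, c→g, b→f, d→dd, g dropped, m→f, ll→l, rh→r). The function name is made up, input is assumed to be lower-case, and exceptions (loanwords, contexts where a mutation is blocked) are not handled; the nasal and aspirate mutations would need further tables of the same kind.

```python
# Initial-consonant changes for the Welsh soft mutation.
# Digraphs come first so "ll" and "rh" are matched before single letters.
SOFT = [
    ("ll", "l"), ("rh", "r"),
    ("p", "b"), ("t", "d"), ("c", "g"),
    ("b", "f"), ("d", "dd"), ("g", ""), ("m", "f"),
]
# Initial digraphs that begin with a mutable letter but do not soft-mutate.
UNCHANGED = ("ch", "ph", "th", "dd")


def soft_mutate(word: str) -> str:
    """Return the soft-mutated form of a lower-case word (sketch, no exceptions)."""
    if word.startswith(UNCHANGED):
        return word
    for initial, mutated in SOFT:
        if word.startswith(initial):
            return mutated + word[len(initial):]
    return word  # vowels and other consonants are unaffected


for w in ["cath", "pen", "gardd", "llyfr", "chwaer"]:
    print(w, "->", soft_mutate(w))
```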

Ideally, there should also be an option to export all data in the dictionary as a CSV or text file, so that data entered here can be reused, and not disappear into a black hole. This should be fairly easy: for each entry, take the word without its split point, generate the plural, and then add the other info like POS, gender, number, etc.
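
An export along those lines might look roughly like this. The entry layout (keys `stem`, `sg_ending`, `pl_ending` and so on) is an assumption made for the example, not the actual dictionary format, and `export_csv` is a hypothetical name.

```python
import csv
import io


def export_csv(entries: list[dict[str, str]]) -> str:
    """Rebuild full word forms from stem + endings and write them out as CSV."""
    columns = ["welsh", "plural", "pos", "gender", "english"]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)
    for e in entries:
        # Rejoin the word at its split point.
        welsh = e["stem"] + e.get("sg_ending", "")
        # Generate the plural only if a plural ending is recorded.
        plural = (e["stem"] + e["pl_ending"]) if "pl_ending" in e else ""
        writer.writerow([welsh, plural, e.get("pos", ""), e.get("gender", ""), e.get("english", "")])
    return out.getvalue()


print(export_csv([
    {"stem": "cath", "pl_ending": "od", "pos": "n", "gender": "f", "english": "cat"},
    {"stem": "pl", "sg_ending": "entyn", "pl_ending": "ant", "pos": "n", "gender": "m", "english": "child"},
]))
```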

Where inflected verbs are concerned, there is a different level of complexity, and perhaps it would be best to leave that until we have the citation words interface sorted.