Difference between revisions of "Dictionary maintenance"

Revision as of 08:51, 28 November 2008

Including parts of dictionaries

The problem is that some parts of dictionaries that are shared between the dictionaries in a pair (for example, symbol definitions) are kept in several files rather than in one.

Solutions

Use XInclude and xmllint to preprocess two XML files into a single .dix file, then validate and compile that .dix file (cy-en and en-af use this approach).
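The preprocessing step can be illustrated with a minimal sketch. It uses Python's standard-library XInclude support as a stand-in for xmllint, and it is not the build step the cy-en or en-af pairs actually use. The file names (symbols.xml, main.xml), the sample symbols, and the helper names expand and demo are hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical shared file holding only the symbol definitions.
SDEFS = """<sdefs>
  <sdef n="n"/>
  <sdef n="vblex"/>
</sdefs>
"""

# Hypothetical main file that pulls the shared part in via XInclude.
MAIN = """<dictionary xmlns:xi="http://www.w3.org/2001/XInclude">
  <alphabet>abc</alphabet>
  <xi:include href="symbols.xml"/>
  <section id="main" type="standard"/>
</dictionary>
"""


def expand(main_path):
    """Parse main_path and replace each xi:include with the referenced file."""
    base = os.path.dirname(main_path)

    def loader(href, parse, encoding=None):
        # Resolve the href relative to the directory of the main file.
        path = os.path.join(base, href)
        if parse == "xml":
            return ET.parse(path).getroot()
        with open(path, encoding=encoding or "utf-8") as handle:
            return handle.read()

    root = ET.parse(main_path).getroot()
    ElementInclude.include(root, loader=loader)
    return root


def demo():
    """Write both sample files to a temporary directory and expand them."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, text in (("symbols.xml", SDEFS), ("main.xml", MAIN)):
            with open(os.path.join(tmp, name), "w", encoding="utf-8") as handle:
                handle.write(text)
        return expand(os.path.join(tmp, "main.xml"))


merged = demo()
symbols = [sdef.get("n") for sdef in merged.iter("sdef")]
print(symbols)
```

The merged tree could then be written out as the .dix file to validate and compile. Keeping the shared part in its own file means that both dictionaries of a pair include the same symbol definitions instead of carrying separate copies.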

Different registers/varieties/standards

Some pairs, e.g. those involving Catalan or Portuguese, support generating a particular standard of a language (e.g. Brazilian Portuguese, Valencian). The way this is done may need to be reviewed.

Metadix

Main article: Metadix

Lextor

Keeping monodix updated

Problem: managing conflicting edits to the monodix

There are more and more pairs. For example, there are 7 pairs with English, and the en monodix is copied into every one of them.

What happens if developer A and developer B both edit the en monodix, say in en-fr and en-es? Answer: another developer has to look at both versions, read the diff and try to merge them. Most of the time developer A tells developer B to wait a few minutes or hours, then commits and tells developer B that he may now copy the new version and start working.

That is time-consuming. For the near future it is manageable, since there are now only half a dozen developers who regularly go on IRC to resolve these issues. But in the long term it will become harder and harder. Imagine there are 20 or 50 pairs with English, and imagine that developers do not want to wait: the copies of the monodix would diverge.

Issues

Language-specific sections of monodix files.

Ideas to solve

Language-specific parts could be split out into separate files, and then XIncluded, such as currently happens in several pairs (e.g. cy-en, en-af) with the symbol definitions <sdefs>.
- A possible solution could be the sort task available in the dixtools package (--Ebenimeli 16:31, 11 July 2007 (BST)).

Suggestions

Table of contents for paradigms (it would be nice to have a table of contents of paradigms, generated by a script, listing all paradigms in a monodix, for example "wo/nen, k/omen, etc." Words would be put into several categories: nouns, adjectives, verbs, etc.)
Interface to add new words (it would be nice to have an interface for adding new words; that would attract non-geeks)
Re-ordering of items in the dictionary (pardefs -> alphabetical order, sections -> alphabetical order, POS order etc.)
Splitting the monodix data into several files: paradigms and lemmas, or lemmas by category (verbs, nouns, adjectives, etc.).
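The first and third suggestions can be sketched together: a short script that lists the paradigms of a monodix, grouped by category and sorted alphabetically. This is only an illustration, not an existing tool. It assumes paradigm names end in a double underscore followed by a category tag (as in "house__n"). The sample paradigm names and the function name paradigm_toc are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical monodix fragment; only the pardef names matter here.
SAMPLE = """<dictionary>
  <pardefs>
    <pardef n="wol/f__n"/>
    <pardef n="house__n"/>
    <pardef n="liv/e__vblex"/>
    <pardef n="expensive__adj"/>
  </pardefs>
</dictionary>
"""


def paradigm_toc(xml_text):
    """Group paradigm names by category suffix, sorted alphabetically."""
    root = ET.fromstring(xml_text)
    toc = defaultdict(list)
    for pardef in root.iter("pardef"):
        name = pardef.get("n", "")
        # Assumption: the category is whatever follows the last "__".
        _, sep, category = name.rpartition("__")
        toc[category if sep else "(uncategorised)"].append(name)
    return {cat: sorted(names) for cat, names in sorted(toc.items())}


toc = paradigm_toc(SAMPLE)
for category, names in toc.items():
    print(f"{category}: {', '.join(names)}")
```

Paradigms whose names do not follow the assumed convention end up under "(uncategorised)", which also makes naming inconsistencies easy to spot.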

@@ Line 35: / Line 35: @@
 * Language specific parts could be split out into separate files, and then XIncluded, such as currently happens in several pairs (e.g. cy-en, en-af) with the symbol definitions <sdefs>.
-** <font color="green">A possible solution</font> could be the '''[[Sort a dictionary|sort task]]''' available in the '''[[Crossdics|CrossDics]]''' package (--[[User:Ebenimeli|Ebenimeli]] 16:31, 11 July 2007 (BST)).
+** <font color="green">A possible solution</font> could be the '''[[Sort a dictionary|sort task]]''' available in the '''[[dixtools]]''' package (--[[User:Ebenimeli|Ebenimeli]] 16:31, 11 July 2007 (BST)).
 === Suggestions ===
