OmegaWiki
The OmegaWiki database layout is pretty dreadful, hopefully this will make things slightly easier for anyone brave enough to look near it. You might want to read this for a description of "Defined Meaning".
Retrieving a list of POS tags
First find the language which you would like to retrieve the POS tags for:
mysql> select * from language_names where language_name = 'Welsh'; +-------------+------------------+---------------+ | language_id | name_language_id | language_name | +-------------+------------------+---------------+ | 153 | 85 | Welsh | | 153 | 89 | Welsh | +-------------+------------------+---------------+ 2 rows in set (0.00 sec)
So the language_id
is '153', we'll need to use this later on.
Now we need to retrieve the list of parts of speech, to do this we need 3 tables:
mysql> select option_id,attribute_id,option_mid,uw_option_attribute_options.language_id,uw_defined_meaning.expression_id,spelling -> from uw_option_attribute_options,uw_defined_meaning,uw_expression_ns -> where attribute_id = '409106' and uw_option_attribute_options.language_id = '153' and -> uw_defined_meaning.defined_meaning_id = option_mid and uw_expression_ns.expression_id = -> uw_defined_meaning.expression_id; +-----------+--------------+------------+-------------+---------------+-----------+ | option_id | attribute_id | option_mid | language_id | expression_id | spelling | +-----------+--------------+------------+-------------+---------------+-----------+ | 435748 | 409106 | 5612 | 153 | 121924 | noun | | 435751 | 409106 | 6100 | 153 | 124600 | verb | | 435753 | 409106 | 6102 | 153 | 124610 | adjective | +-----------+--------------+------------+-------------+---------------+-----------+ 3 rows in set (0.00 sec)
uw_expression_ns.spelling
is the way the word is spelt.uw_defined_meaning.defined_meaning_id
is the "defined meaning" of the part of speech, e.g. it describes what a "verb" is, or an "adjective".uw_option_attribute_options.attribute_id
defines that this "defined meaning" is a "part of speech" option.
Retrieving a list of lemmata that match a POS tag
So, lets retrieve all Welsh nouns!
First retrive the option_id
of the POS tag from the uw_option_attribute_options
table:
Remember, option_mid
is the defined meaning of the part of speech that you want, in this case '5612' is "noun".
mysql> select option_id,attribute_id,option_mid,language_id -> from uw_option_attribute_options -> where option_mid = '5612' and language_id = '153'; +-----------+--------------+------------+-------------+ | option_id | attribute_id | option_mid | language_id | +-----------+--------------+------------+-------------+ | 435748 | 409106 | 5612 | 153 | +-----------+--------------+------------+-------------+
Now to retrieve the list of nouns. We need to take the option_id
from above, and then paste it into this query!
Note: This query could take over a minute, so go to grab a cup of coffee or something!
mysql> select value_id,object_id,uw_defined_meaning.defined_meaning_id,spelling -> from uw_option_attribute_values,uw_syntrans,uw_defined_meaning,uw_expression_ns -> where uw_option_attribute_values.option_id = '435748' -> and uw_syntrans.syntrans_sid = uw_option_attribute_values.object_id -> and uw_defined_meaning.defined_meaning_id = uw_syntrans.defined_meaning_id -> and uw_expression_ns.expression_id = uw_syntrans.expression_id; +----------+-----------+--------------------+--------------+ | value_id | object_id | defined_meaning_id | spelling | +----------+-----------+--------------------+--------------+ | 437913 | 437904 | 437893 | bargen | | 438025 | 438006 | 437948 | methdaliad | | 438988 | 438983 | 5930 | gallu | | 439078 | 439059 | 439017 | siasi | | 439079 | 439061 | 439017 | ffrâm | | 440330 | 440318 | 440185 | diplomydd | | 442533 | 442508 | 442442 | diffynnydd | | 444812 | 444805 | 444787 | trethdalwr | | 444887 | 444874 | 444834 | traddodiadol | | 473807 | 473789 | 473754 | cydbwysedd | | 474801 | 474791 | 474762 | menter | | 475455 | 475442 | 475412 | gwirfoddolwr | +----------+-----------+--------------------+--------------+
This assumes that the syntrans_sid
is the same as the uw_option_attribute_values.object_id
which may not always be the case.
Retrieve a list of translations
You need to know the defined_meaning_id
and the language_id
of the language that you would like the language_name
shown in.
mysql> select syntrans_sid,defined_meaning_id,defined_meaning_id,identical_meaning,language_name,spelling -> from uw_syntrans,uw_expression_ns,language_names -> where defined_meaning_id = '437893' -> and uw_expression_ns.expression_id = uw_syntrans.expression_id -> and language_names.language_id = uw_expression_ns.language_id -> and language_names.name_language_id = '85'; +--------------+--------------------+--------------------+-------------------+---------------+-----------+ | syntrans_sid | defined_meaning_id | defined_meaning_id | identical_meaning | language_name | spelling | +--------------+--------------------+--------------------+-------------------+---------------+-----------+ | 437894 | 437893 | 437893 | 1 | English | bargain | | 437897 | 437893 | 437893 | 1 | Dutch | koopje | | 437898 | 437893 | 437893 | 1 | French | occasion | | 437900 | 437893 | 437893 | 1 | Japanese | お買得 | | 437902 | 437893 | 437893 | 1 | Castilian | ganga | | 437904 | 437893 | 437893 | 1 | Welsh | bargen | | 438194 | 437893 | 437893 | 1 | French | affaire | +--------------+--------------------+--------------------+-------------------+---------------+-----------+
The identical_meaning
flag means that the meaning of the words is thought to be identical in each language.