Difference between revisions of "OmegaWiki"
Line 1: | Line 1: | ||
The OmegaWiki database layout is pretty dreadful, hopefully this will make things slightly easier for anyone brave enough to look near it. |
The OmegaWiki database layout is pretty dreadful, hopefully this will make things slightly easier for anyone brave enough to look near it. You might want to read [http://www.omegawiki.org/DefinedMeaning this] for a description of "Defined Meaning". |
||
==Retrieving a list of POS tags== |
==Retrieving a list of POS tags== |
Revision as of 12:48, 30 July 2007
The OmegaWiki database layout is pretty dreadful, hopefully this will make things slightly easier for anyone brave enough to look near it. You might want to read this for a description of "Defined Meaning".
Retrieving a list of POS tags
First find the language which you would like to retrieve the POS tags for:
mysql> select * from language_names where language_name = 'Welsh'; +-------------+------------------+---------------+ | language_id | name_language_id | language_name | +-------------+------------------+---------------+ | 153 | 85 | Welsh | | 153 | 89 | Welsh | +-------------+------------------+---------------+ 2 rows in set (0.00 sec)
So the language_id
is '153', we'll need to use this later on.
Now we need to retrieve the list of parts of speech, to do this we need 3 tables:
mysql> select option_id,attribute_id,option_mid,uw_option_attribute_options.language_id,uw_defined_meaning.expression_id,spelling -> from uw_option_attribute_options,uw_defined_meaning,uw_expression_ns -> where attribute_id = '409106' and uw_option_attribute_options.language_id = '153' and -> uw_defined_meaning.defined_meaning_id = option_mid and uw_expression_ns.expression_id = -> uw_defined_meaning.expression_id; +-----------+--------------+------------+-------------+---------------+-----------+ | option_id | attribute_id | option_mid | language_id | expression_id | spelling | +-----------+--------------+------------+-------------+---------------+-----------+ | 435748 | 409106 | 5612 | 153 | 121924 | noun | | 435751 | 409106 | 6100 | 153 | 124600 | verb | | 435753 | 409106 | 6102 | 153 | 124610 | adjective | +-----------+--------------+------------+-------------+---------------+-----------+ 3 rows in set (0.00 sec)
uw_expression_ns.spelling
is the way the word is spelt.uw_defined_meaning.defined_meaning_id
is the "defined meaning" of the part of speech, e.g. it describes what a "verb" is, or an "adjective".uw_option_attribute_options.attribute_id
defines that this "defined meaning" is a "part of speech" option.
Retrieving a list of lemmata that match a POS tag
So, lets retrieve all Welsh nouns!
First retrive the option_id
of the POS tag from the uw_option_attribute_options
table:
Remember, option_mid
is the defined meaning of the part of speech that you want, in this case '5612' is "noun".
mysql> select option_id,attribute_id,option_mid,language_id -> from uw_option_attribute_options -> where option_mid = '5612' and language_id = '153'; +-----------+--------------+------------+-------------+ | option_id | attribute_id | option_mid | language_id | +-----------+--------------+------------+-------------+ | 435748 | 409106 | 5612 | 153 | +-----------+--------------+------------+-------------+
Now to retrieve the list of nouns. We need to take the option_id
from above, and then paste it into this query!
Note: This query could take over a minute, so go to grab a cup of coffee or something!
mysql> select value_id,object_id,uw_defined_meaning.defined_meaning_id,spelling -> from uw_option_attribute_values,uw_syntrans,uw_defined_meaning,uw_expression_ns -> where uw_option_attribute_values.option_id = '435748' -> and uw_syntrans.syntrans_sid = uw_option_attribute_values.object_id -> and uw_defined_meaning.defined_meaning_id = uw_syntrans.defined_meaning_id -> and uw_expression_ns.expression_id = uw_syntrans.expression_id; +----------+-----------+--------------------+--------------+ | value_id | object_id | defined_meaning_id | spelling | +----------+-----------+--------------------+--------------+ | 437913 | 437904 | 437893 | bargen | | 438025 | 438006 | 437948 | methdaliad | | 438988 | 438983 | 5930 | gallu | | 439078 | 439059 | 439017 | siasi | | 439079 | 439061 | 439017 | ffrâm | | 440330 | 440318 | 440185 | diplomydd | | 442533 | 442508 | 442442 | diffynnydd | | 444812 | 444805 | 444787 | trethdalwr | | 444887 | 444874 | 444834 | traddodiadol | | 473807 | 473789 | 473754 | cydbwysedd | | 474801 | 474791 | 474762 | menter | | 475455 | 475442 | 475412 | gwirfoddolwr | +----------+-----------+--------------------+--------------+
This assumes that the syntrans_sid
is the same as the uw_option_attribute_values.object_id
which may not always be the case.