Difference between revisions of "OmegaWiki"
Revision as of 11:54, 30 July 2007
The OmegaWiki database layout is pretty dreadful; hopefully this will make things slightly easier for anyone brave enough to look near it.
Retrieving a list of POS tags
First find the language which you would like to retrieve the POS tags for:
mysql> select * from language_names where language_name = 'Welsh';
+-------------+------------------+---------------+
| language_id | name_language_id | language_name |
+-------------+------------------+---------------+
| 153 | 85 | Welsh |
| 153 | 89 | Welsh |
+-------------+------------------+---------------+
2 rows in set (0.00 sec)
So the language_id is '153'; we'll need this later on.
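The lookup returns one row per name_language_id, so the same language_id can come back more than once. As a minimal sketch, the snippet below runs the same lookup with DISTINCT against an in-memory SQLite stand-in for language_names, loaded with the two rows shown above. The helper name language_ids is hypothetical, and SQLite is used here only so the example runs without a MySQL server.

```python
import sqlite3

# In-memory stand-in for the language_names table, holding only the two
# rows shown in the mysql output above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE language_names ("
    "language_id INTEGER, name_language_id INTEGER, language_name TEXT)"
)
conn.executemany(
    "INSERT INTO language_names VALUES (?, ?, ?)",
    [(153, 85, "Welsh"), (153, 89, "Welsh")],
)


def language_ids(conn, name):
    """Return the distinct language_id values recorded for a language name."""
    rows = conn.execute(
        "SELECT DISTINCT language_id FROM language_names "
        "WHERE language_name = ? ORDER BY language_id",
        (name,),
    )
    return [row[0] for row in rows]


print(language_ids(conn, "Welsh"))  # prints [153]
```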
Now we need to retrieve the list of parts of speech. To do this we need three tables:
mysql> select option_id,attribute_id,option_mid,uw_option_attribute_options.language_id,uw_defined_meaning.expression_id,spelling
-> from uw_option_attribute_options,uw_defined_meaning,uw_expression_ns
-> where attribute_id = '409106' and uw_option_attribute_options.language_id = '153' and
-> uw_defined_meaning.defined_meaning_id = option_mid and uw_expression_ns.expression_id =
-> uw_defined_meaning.expression_id;
+-----------+--------------+------------+-------------+---------------+-----------+
| option_id | attribute_id | option_mid | language_id | expression_id | spelling |
+-----------+--------------+------------+-------------+---------------+-----------+
| 435748 | 409106 | 5612 | 153 | 121924 | noun |
| 435751 | 409106 | 6100 | 153 | 124600 | verb |
| 435753 | 409106 | 6102 | 153 | 124610 | adjective |
+-----------+--------------+------------+-------------+---------------+-----------+
3 rows in set (0.00 sec)
uw_expression_ns.spelling is the way the word is spelt.
uw_defined_meaning.defined_meaning_id is the "defined meaning" of the part of speech, e.g. it describes what a "verb" is, or an "adjective".
uw_option_attribute_options.attribute_id defines that this "defined meaning" is a "part of speech" option.
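The three-table query above can also be written with explicit JOIN ... ON clauses, which keeps each join condition next to the table it belongs to. The following is a rough sketch run against an in-memory SQLite copy of just the columns used above, not the real schema. The three Welsh rows are copied from the output above; the fourth option row and its option_id are made up to show the language filter at work, and the helper name pos_tags is hypothetical.

```python
import sqlite3

# Toy in-memory copy of the three tables. Only the columns used in the
# query are modelled; the real tables have more.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE uw_option_attribute_options (
    option_id INTEGER, attribute_id INTEGER,
    option_mid INTEGER, language_id INTEGER);
CREATE TABLE uw_defined_meaning (
    defined_meaning_id INTEGER, expression_id INTEGER);
CREATE TABLE uw_expression_ns (
    expression_id INTEGER, spelling TEXT);
""")
conn.executemany(
    "INSERT INTO uw_option_attribute_options VALUES (?, ?, ?, ?)",
    [
        (435748, 409106, 5612, 153),
        (435751, 409106, 6100, 153),
        (435753, 409106, 6102, 153),
        (999999, 409106, 5612, 85),  # hypothetical row for another language
    ],
)
conn.executemany(
    "INSERT INTO uw_defined_meaning VALUES (?, ?)",
    [(5612, 121924), (6100, 124600), (6102, 124610)],
)
conn.executemany(
    "INSERT INTO uw_expression_ns VALUES (?, ?)",
    [(121924, "noun"), (124600, "verb"), (124610, "adjective")],
)


def pos_tags(conn, attribute_id, language_id):
    """List (option_id, option_mid, spelling) for one attribute and language."""
    sql = """
        SELECT o.option_id, o.option_mid, e.spelling
        FROM uw_option_attribute_options AS o
        JOIN uw_defined_meaning AS dm
          ON dm.defined_meaning_id = o.option_mid
        JOIN uw_expression_ns AS e
          ON e.expression_id = dm.expression_id
        WHERE o.attribute_id = ? AND o.language_id = ?
        ORDER BY o.option_id
    """
    return conn.execute(sql, (attribute_id, language_id)).fetchall()


for row in pos_tags(conn, 409106, 153):
    print(row)
```

Passing the ids as bound parameters rather than pasting them into the SQL string also avoids quoting numeric ids as strings.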
Retrieving a list of lemmata that match a POS tag
So, let's retrieve all Welsh nouns!
First retrieve the option_id of the POS tag from the uw_option_attribute_options table.
Remember, option_mid is the defined meaning of the part of speech that you want; in this case '5612' is "noun":
mysql> select option_id,attribute_id,option_mid,language_id
-> from uw_option_attribute_options
-> where option_mid = '5612' and language_id = '153';
+-----------+--------------+------------+-------------+
| option_id | attribute_id | option_mid | language_id |
+-----------+--------------+------------+-------------+
| 435748 | 409106 | 5612 | 153 |
+-----------+--------------+------------+-------------+
Now to retrieve the list of nouns. We need to take the option_id from above, and then paste it into this query!
Note: This query could take over a minute, so go and grab a cup of coffee or something!
mysql> select value_id,object_id,uw_defined_meaning.defined_meaning_id,spelling
-> from uw_option_attribute_values,uw_syntrans,uw_defined_meaning,uw_expression_ns
-> where uw_option_attribute_values.option_id = '435748'
-> and uw_syntrans.syntrans_sid = uw_option_attribute_values.object_id
-> and uw_defined_meaning.defined_meaning_id = uw_syntrans.defined_meaning_id
-> and uw_expression_ns.expression_id = uw_defined_meaning.expression_id;
+----------+-----------+--------------------+-------------+
| value_id | object_id | defined_meaning_id | spelling |
+----------+-----------+--------------------+-------------+
| 438988 | 438983 | 5930 | skill |
| 437913 | 437904 | 437893 | bargain |
| 438025 | 438006 | 437948 | bankruptcy |
| 439078 | 439059 | 439017 | chassis |
| 439079 | 439061 | 439017 | chassis |
| 440330 | 440318 | 440185 | diplomat |
| 442533 | 442508 | 442442 | defendant |
| 444812 | 444805 | 444787 | taxpayer |
| 444887 | 444874 | 444834 | traditional |
| 473807 | 473789 | 473754 | equilibrium |
| 474801 | 474791 | 474762 | enterprise |
| 475455 | 475442 | 475412 | volunteer |
+----------+-----------+--------------------+-------------+
This assumes that the syntrans_sid is the same as the uw_option_attribute_values.object_id, which may not always be the case.
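The two steps above (look up the option_id, then paste it into the second query) can be folded into one query by joining uw_option_attribute_options directly. The sketch below does that on an in-memory SQLite copy of the relevant columns, and adds a LEFT JOIN check that lists any attribute values whose object_id has no matching syntrans_sid, which is the case the caveat above warns about. The value, syntrans and defined-meaning ids are taken from the output above. The expression_id values (1, 2, 3), the unmatched row (500001, 500000) and the helper names lemmata and unmatched_values are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE uw_option_attribute_options (
    option_id INTEGER, attribute_id INTEGER,
    option_mid INTEGER, language_id INTEGER);
CREATE TABLE uw_option_attribute_values (
    value_id INTEGER, object_id INTEGER, option_id INTEGER);
CREATE TABLE uw_syntrans (
    syntrans_sid INTEGER, defined_meaning_id INTEGER);
CREATE TABLE uw_defined_meaning (
    defined_meaning_id INTEGER, expression_id INTEGER);
CREATE TABLE uw_expression_ns (
    expression_id INTEGER, spelling TEXT);
""")
conn.execute(
    "INSERT INTO uw_option_attribute_options VALUES (435748, 409106, 5612, 153)"
)
conn.executemany(
    "INSERT INTO uw_option_attribute_values VALUES (?, ?, ?)",
    [
        (438988, 438983, 435748),
        (437913, 437904, 435748),
        (439078, 439059, 435748),
        (439079, 439061, 435748),
        (500001, 500000, 435748),  # hypothetical: object_id with no syntrans row
    ],
)
conn.executemany(
    "INSERT INTO uw_syntrans VALUES (?, ?)",
    [(438983, 5930), (437904, 437893), (439059, 439017), (439061, 439017)],
)
# expression_id values 1-3 are placeholders, not real OmegaWiki ids.
conn.executemany(
    "INSERT INTO uw_defined_meaning VALUES (?, ?)",
    [(5930, 1), (437893, 2), (439017, 3)],
)
conn.executemany(
    "INSERT INTO uw_expression_ns VALUES (?, ?)",
    [(1, "skill"), (2, "bargain"), (3, "chassis")],
)


def lemmata(conn, option_mid, language_id):
    """Same result columns as the query above, without a pasted option_id."""
    sql = """
        SELECT v.value_id, v.object_id, dm.defined_meaning_id, e.spelling
        FROM uw_option_attribute_options AS o
        JOIN uw_option_attribute_values AS v ON v.option_id = o.option_id
        JOIN uw_syntrans AS s ON s.syntrans_sid = v.object_id
        JOIN uw_defined_meaning AS dm
          ON dm.defined_meaning_id = s.defined_meaning_id
        JOIN uw_expression_ns AS e ON e.expression_id = dm.expression_id
        WHERE o.option_mid = ? AND o.language_id = ?
        ORDER BY v.value_id
    """
    return conn.execute(sql, (option_mid, language_id)).fetchall()


def unmatched_values(conn, option_id):
    """Attribute values whose object_id matches no syntrans_sid."""
    sql = """
        SELECT v.value_id, v.object_id
        FROM uw_option_attribute_values AS v
        LEFT JOIN uw_syntrans AS s ON s.syntrans_sid = v.object_id
        WHERE v.option_id = ? AND s.syntrans_sid IS NULL
        ORDER BY v.value_id
    """
    return conn.execute(sql, (option_id,)).fetchall()


for row in lemmata(conn, 5612, 153):
    print(row)
print(unmatched_values(conn, 435748))
```

Because the main query uses inner joins, a value whose object_id does not point at a syntrans row simply disappears from the result; running the second check first shows how many rows would be lost that way.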