Difference between revisions of "User:David Nemeskey/CG XML brainstorming"
Revision as of 18:10, 27 June 2013
This page lists my (and others') ideas of what the CG XML format could or should look like.
Sets and lists
The words set and list are used interchangeably in CG. This is in contrast to how these terms are used in CS, and partly to the common-sense meanings of the words as well. The current planning process might be just the right time to fix this issue. I propose to say good-bye to list.
The (XML) tags below will be used throughout the grammar for specifying tags and sets in e.g. constraint conditions.
Item | CG syntax | XML syntax |
---|---|---|
Regular tag | nom | <tag>nom</tag> |
Sequence tag | (n pl) | <seq><tag>n</tag><tag>pl</tag></seq> |
Reading base-form | "dog" | <lemma>dog</lemma> |
Word-form | "<dogs>" | <word>dogs</word> |
Set | (...) | <set>...</set> |
Special tags | >>> and <<< | <sbegin/> and <send/> |
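The mapping in the table above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `cg_item_to_xml` and `to_string` are hypothetical, and every parenthesized item is treated as a sequence tag, since the table alone does not say how a converter would tell `(n pl)` apart from a set.

```python
import xml.etree.ElementTree as ET


def cg_item_to_xml(item: str) -> ET.Element:
    """Map one CG item to the XML element proposed in the table (sketch)."""
    # Special tags: sentence begin and sentence end.
    if item == ">>>":
        return ET.Element("sbegin")
    if item == "<<<":
        return ET.Element("send")
    # Word-form: "<dogs>" (checked before the plain base-form case).
    if item.startswith('"<') and item.endswith('>"'):
        elem = ET.Element("word")
        elem.text = item[2:-2]
        return elem
    # Reading base-form: "dog".
    if item.startswith('"') and item.endswith('"'):
        elem = ET.Element("lemma")
        elem.text = item[1:-1]
        return elem
    # Assumption: anything in parentheses is a sequence tag such as (n pl).
    if item.startswith("(") and item.endswith(")"):
        seq = ET.Element("seq")
        for part in item[1:-1].split():
            ET.SubElement(seq, "tag").text = part
        return seq
    # Everything else is a regular tag.
    elem = ET.Element("tag")
    elem.text = item
    return elem


def to_string(item: str) -> str:
    """Serialize the element for a CG item as a string."""
    return ET.tostring(cg_item_to_xml(item), encoding="unicode")


if __name__ == "__main__":
    for cg in ["nom", "(n pl)", '"dog"', '"<dogs>"', ">>>"]:
        print(cg, "->", to_string(cg))
```

Note that ElementTree serializes empty elements with a space before the slash (`<sbegin />`), which is equivalent XML to the `<sbegin/>` written in the table.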
Observations:
- seq and set are very similar, which might be a problem when skimming through a CG.
- I don't know if we even need set: in the construction rules, you have to put sets everywhere, and those will have separate XML tags anyway.
- seq could be combined(-tag)?
Delimiters
Probably the easiest of the bunch:
<delimiters>(word forms, sets, etc.)</delimiters>
Sets
Set definitions and modifications. The section itself is enclosed in a <sets>...</sets> element.
Item | CG syntax | XML syntax |
---|---|---|
Set definition | LIST set-name = ... ; | <define-set name="set-name">...</define-set> or <dset name="set-name">...</dset> |
Set modification | SET set-name = ... ; | <modify-set name="set-name">...</modify-set> or <mset name="set-name">...</mset> |
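As a rough illustration of the set definition row, the sketch below turns a single LIST line into a <define-set> element. Several things here are assumptions rather than part of the proposal: the helper names `define_set` and `parse_list_line`, the example set `N` with members `n` and `np`, and the choice to encode each member as a plain <tag> child (the page leaves the contents as "..."). Only whitespace-separated regular tags are handled.

```python
import xml.etree.ElementTree as ET


def define_set(name: str, tags: list[str]) -> ET.Element:
    """Build <define-set name="..."> with one <tag> child per member (assumed encoding)."""
    elem = ET.Element("define-set", {"name": name})
    for tag in tags:
        ET.SubElement(elem, "tag").text = tag
    return elem


def parse_list_line(line: str) -> ET.Element:
    """Parse a simple CG line of the form 'LIST set-name = tag tag ... ;'."""
    head, _, body = line.partition("=")
    keyword, name = head.split()
    if keyword != "LIST":
        raise ValueError(f"expected a LIST definition, got {keyword!r}")
    # Drop the terminating semicolon and split the members on whitespace.
    members = body.strip().rstrip(";").split()
    return define_set(name, members)


if __name__ == "__main__":
    elem = parse_list_line("LIST N = n np ;")
    print(ET.tostring(elem, encoding="unicode"))
```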
The ... in set modification can include the following set operations:
Operation | CG syntax | XML syntax |
---|---|---|
Union | A OR B | <union>???A???B???</union> or <or>???A???B???</or> |
Concatenation | A + B | <concat>???A???B???</concat> |
Difference | A - B | <diff>???A???B???</diff> |
Note: I imagine the above to be akin to Lisp operators, e.g. (or A (concat B C) (diff D E)). This format has the benefit of explicitly encoding the precedence in the formula, so grammarians won't have to memorize it.
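To make the Lisp analogy concrete, here is a minimal sketch that converts a nested prefix expression into nested XML elements, so that precedence is carried entirely by the nesting. The operand encoding is an assumption: the page leaves it open (the ??? in the table), and the sketch arbitrarily writes each named set as <set>NAME</set>. The names `OPERATORS` and `expr_to_xml` are hypothetical, and `or` is mapped to the <union> spelling.

```python
import xml.etree.ElementTree as ET

# Lisp-style operator name -> proposed XML element name.
OPERATORS = {"or": "union", "concat": "concat", "diff": "diff"}


def expr_to_xml(expr) -> ET.Element:
    """Convert a nested tuple like ("or", "A", ("diff", "B", "C")) to XML."""
    if isinstance(expr, str):
        # Assumption: a bare set name is written as <set>NAME</set>.
        elem = ET.Element("set")
        elem.text = expr
        return elem
    op, *operands = expr
    elem = ET.Element(OPERATORS[op])
    for operand in operands:
        # Recursion keeps the grouping explicit; no precedence table is needed.
        elem.append(expr_to_xml(operand))
    return elem


if __name__ == "__main__":
    expr = ("or", "A", ("concat", "B", "C"), ("diff", "D", "E"))
    print(ET.tostring(expr_to_xml(expr), encoding="unicode"))
```

Because every operator is its own element, an expression such as A OR B + C cannot be misread: the author has to choose between nesting the <concat> inside the <union> or the other way round.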