VM for transfer

From Apertium

Revision as of 13:17, 12 July 2010

Instruction Sets

{| class="wikitable" border="1"
! Mnemonic !! Opcode (in hex) !! Other operands !! Stack [before] → [after] !! Description
|-
| push || - || value || [empty] → value || Pushes a value onto the stack
|-
| pushv || - || var || [empty] → value || Evaluates the var and pushes its value onto the stack
|-
| pusht || - || var || [empty] → <value> || Evaluates the var and pushes its value onto the stack as a tag
|-
| pushbl || - || N/A || [empty] → blank || Pushes a blank onto the stack
|-
| pushsb || - || pos || [empty] → superblank || Pushes the superblank at 'pos' onto the stack
|-
| pushz || - || N/A || [empty] → [zero_flag] || Pushes the current value of zero_flag onto the stack
|-
| pushnz || - || N/A || [empty] → [not_zero_flag] || Takes the NOT of the current value of zero_flag, then pushes the result onto the stack
|-
| cliptl || - || N/A || pos, regex → value || Matches 'regex' in the target language word at 'pos' and pushes the matched value onto the stack
|-
| clipsl || - || N/A || pos, regex → value || Matches 'regex' in the source language word at 'pos' and pushes the matched value onto the stack
|-
| storetl || - || N/A || pos, regex, data → value || Replaces the match of 'regex' in the target language word at 'pos' with 'data'
|-
| addtrie || - || address || pattern, pattern, ..., no_of_patterns → [empty] || Pops 'no_of_patterns' patterns from the stack, combines them, and adds the result to the trie, pointing to the given 'address'
|-
| lu || - || num || lemma, tag1, ..., tagn → ^(lexical_unit)$ || Pops 'num' items from the stack, creates a lexical unit ^... ...$ from them, and pushes the lexical unit back onto the stack
|-
| brace || - || num || lu1, blank1, lu2, blank2, ..., lun → {... ...} || Pops 'num' items from the stack, creates the braced version {... ... ...}, and pushes it back onto the stack
|-
| chunk || - || num || chunk_name, tag1, tag2, ... , {^... ...$} → ^chunk_name<tag1>...<tagn>{^... ...$}$ || Pops 'num' items from the stack, creates the chunk, and pushes it back onto the stack
|-
| out || - || num || chunk1, chunk2, ... → [empty] || Pops 'num' items from the stack and writes them to standard output
|-
| cmpi || - || N/A || data1, data2 → [empty] || Pops data1 and data2 and compares them as strings, ignoring case; if they match, sets the zero flag to 1 (meaning "we have a zero")
|-
| cmp || - || N/A || data1, data2 → [empty] || Pops data1 and data2 and compares them as strings, case-sensitively; if they match, sets the zero flag to 1
|-
| match || - || N/A || string, regex → [empty] || Pops 'string' and 'regex' and matches the string against the regex; if it matches, sets ZF = 1
|-
| jmp || - || label || [empty] → [empty] || Jumps to the label (unconditional jump)
|-
| jz || - || label || [empty] → [empty] || Jumps to the label if the zero flag is 1
|-
| jnz || - || label || [empty] → [empty] || Jumps to the label if the zero flag is 0
|-
| hlt || - || N/A || || Halts the program
|-
| call || - || label || arg(n),..., arg2, arg1, npar → [empty] || Calls a macro (subroutine); see Example 6 for details
|-
| ret || - || N/A || PC → [empty] || Returns from a macro; the call instruction places PC on the stack, so there is no need to push it manually
|-
| nop || - || N/A || || No operation
|}
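
The interplay between the stack, the zero flag, and the jump instructions can be sketched with a toy interpreter. This is only an illustration under assumptions: the class name `MiniVM`, the tuple encoding of instructions, and the label table are hypothetical and not part of the VM design described here; only a handful of the mnemonics from the table are covered.

```python
class MiniVM:
    """Toy interpreter for a few instructions from the table (illustrative only)."""

    def __init__(self, program, labels):
        self.program = program    # list of (mnemonic, operand) pairs (assumed encoding)
        self.labels = labels      # label name -> index into program
        self.stack = []
        self.zero_flag = 0
        self.pc = 0

    def run(self):
        while self.pc < len(self.program):
            op, arg = self.program[self.pc]
            self.pc += 1
            if op == "push":
                self.stack.append(arg)
            elif op in ("cmp", "cmpi"):
                second = self.stack.pop()
                first = self.stack.pop()
                if op == "cmpi":                      # caseless variant
                    first, second = first.lower(), second.lower()
                self.zero_flag = 1 if first == second else 0
            elif op == "pushz":
                self.stack.append(self.zero_flag)
            elif op == "pushnz":
                self.stack.append(1 - self.zero_flag)
            elif op == "jmp":
                self.pc = self.labels[arg]
            elif op == "jz":
                if self.zero_flag == 1:
                    self.pc = self.labels[arg]
            elif op == "jnz":
                if self.zero_flag == 0:
                    self.pc = self.labels[arg]
            elif op == "hlt":
                break
            elif op != "nop":
                raise ValueError(f"unsupported instruction: {op}")
        return self.stack


# Caseless comparison succeeds, so jnz falls through and "matched" is pushed.
program = [
    ("push", "Como"),
    ("push", "como"),
    ("cmpi", None),
    ("jnz", "end"),
    ("push", "matched"),
    ("nop", None),            # label "end"
]
labels = {"end": 5}
result = MiniVM(program, labels).run()
```

Replacing `cmpi` with `cmp` in the program makes the comparison fail, so `jnz` skips the final push.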

Sample compilation of XML code fragments

Example 1

XML t1x Code: chunking

<out>
  <chunk name="det_det_nom_adj" case="caseFirstWord">
    <tags>
      <tag><lit-tag v="SN"/></tag>
      <tag><var n="tipus_det"/></tag>
      <tag><var n="gen_chunk"/></tag>
      <tag><var n="nbr_chunk"/></tag>
    </tags>
    <lu>
      <clip pos="1" side="tl" part="lem"/>
      <clip pos="1" side="tl" part="a_det"/>
      <clip pos="1" side="tl" part="gen_sense_mf" link-to="3"/>
      <clip pos="1" side="tl" part="gen_mf"/>
      <clip pos="1" side="tl" part="nbr_sense_sp" link-to="4"/>
      <clip pos="1" side="tl" part="nbr_sp"/>
    </lu>
    <b/>
    <lu>
      <lit v="el"/>
      <lit-tag v="det.def"/>
      <clip pos="1" side="tl" part="gen_sense_mf" link-to="3"/>
      <lit-tag v="pl"/>
    </lu>
    <b pos="1"/>
    <lu>
      <clip pos="3" side="tl" part="lemh"/>
      <clip pos="3" side="tl" part="a_nom"/>
      <clip pos="3" side="tl" part="gen_sense_mf" link-to="3"/>
      <clip pos="3" side="tl" part="gen_mf"/>
      <clip pos="3" side="tl" part="nbr_sense_sp" link-to="4"/>
      <clip pos="3" side="tl" part="nbr_sp"/>
      <clip pos="3" side="tl" part="lemq"/>
    </lu>
    <b/>
    <b pos="2"/>
    <lu>
      <var n="adjectiu1"/>
      <clip pos="2" side="tl" part="lemh"/>
      <clip pos="2" side="tl" part="a_adj"/>
      <clip pos="2" side="tl" part="gen_sense_mf" link-to="3"/>
      <clip pos="2" side="tl" part="gen_mf"/>
      <clip pos="2" side="tl" part="nbr_sense_sp" link-to="4"/>
      <clip pos="2" side="tl" part="nbr_sp" link-to="4"/>
      <clip pos="2" side="tl" part="lemq"/>
    </lu>
  </chunk>
</out>

Compiled Code

push    "det_det_nom_adj"
push    "<SN>"
pusht   tipus_det          ; evaluate the variable, wrap its value in '<' and '>', then push it onto the stack
pusht   gen_chunk
pusht   nbr_chunk

push    1
push    "^\w+"             ; lem
cliptl
push    1
push    [regex]            ; a_det
cliptl
push    "<3>"              ; since link-to overrides everything else, we do not need any dedicated instruction
                           ; for that
push    1
push    [regex]            ; gen_mf
cliptl
push    "<4>"
push    1
push    [regex]            ; nbr_sp
cliptl
lu      6                  ; pop 6 items, concat, create lexical unit ^...$ and push back in stack

pushbl                     ; push a blank

push    "el"
push    "<det><def>"
push    "<3>"
push    "<pl>"
lu      4                  ; pop 4 items from the stack, create a lexical unit ^...$ and then
                           ; push in the stack

pushsb 1

push   3
push   [regex]             ; lemh
cliptl
push   3
push   [regex]             ; a_nom
cliptl
push   "<3>"
push   3
push   [regex]             ; gen_mf
cliptl
push   "<4>"
push   3
push   [regex]             ; nbr_sp
cliptl
push   3
push   [regex]             ; lemq
cliptl
lu     7

pushbl
pushsb 2

pushv  adjectiu1           ; it's a var, so evaluate it and push the value
push   2
push   [regex]             ; lemh
cliptl
push   2
push   [regex]             ; a_adj
cliptl
push   "<3>"
push   2
push   [regex]             ; gen_mf
cliptl
push   "<4>"
push   "<4>"               ; both nbr_sense_sp and nbr_sp carry link-to="4" in the XML
push   2
push   [regex]             ; lemq
cliptl
lu     8                   ; adjectiu1 + 7 clipped or linked parts = 8 items

brace  8                   ; 4 lexical units + 4 blanks = 8
                           ; pop 8 items, concat, wrap in { and }, then push back

chunk  6                   ; create the chunk, ^...{^...$}$, and push back in stack

out    1                   ; give output (number of chunks = 1)
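
The way `lu`, `brace`, and `chunk` fold stack items into strings can be sketched as follows. This is a minimal illustration following the stack effects given in the instruction table; the function names (`pop_n`, `op_lu`, `op_brace`, `op_chunk`) and the sample words are hypothetical.

```python
def pop_n(stack, num):
    """Remove and return the top `num` items, oldest first."""
    start = len(stack) - num
    items = stack[start:]
    del stack[start:]
    return items


def op_lu(stack, num):
    # lemma, tag1, ..., tagn -> ^lemma<tag1>...<tagn>$
    stack.append("^" + "".join(pop_n(stack, num)) + "$")


def op_brace(stack, num):
    # lu1, blank1, lu2, ... -> {lu1 blank1 lu2 ...}
    stack.append("{" + "".join(pop_n(stack, num)) + "}")


def op_chunk(stack, num):
    # chunk_name, tag1, ..., {...} -> ^chunk_name<tag1>...{...}$
    stack.append("^" + "".join(pop_n(stack, num)) + "$")


stack = ["det_nom", "<SN>"]          # chunk name and chunk tag
stack += ["el", "<det>", "<def>"]
op_lu(stack, 3)                      # first lexical unit
stack.append(" ")                    # a blank, as pushbl would push
stack += ["gat", "<n>"]
op_lu(stack, 2)                      # second lexical unit
op_brace(stack, 3)                   # lu + blank + lu
op_chunk(stack, 3)                   # name + tag + braced body
```

After these calls the stack holds the single string `^det_nom<SN>{^el<det><def>$ ^gat<n>$}$`, which is what `out 1` would then write.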

Example 2

XML t1x Code

<section-def-cats>
  <def-cat n="nom">
    <cat-item tags="n.*"/>
  </def-cat>
 
  <def-cat n="det">
    <cat-item tags="det.*"/>
    <cat-item tags="predet.*"/>
  </def-cat>
</section-def-cats>

<section-rules>
  <rule>
    <pattern>
      <pattern-item n="det"/>
    </pattern>
  </rule>
  <rule>
    <pattern>
      <pattern-item n="nom"/>
    </pattern>
    <action/>
  </rule>
  <rule>
    <pattern>
      <pattern-item n="det"/>
      <pattern-item n="nom"/>
    </pattern>
    <action/>
  </rule>
</section-rules>

Compiled Code

                         ;first rule: def-cat has two equivalent cat-items
push      "\w<det>\t"    ;load pattern into stack
push      1
addtrie   [address1]     ;define a trie pattern with value 1 (the first rule)

push      "\w<predet>\t" ;same with the second cat-item
push      1
addtrie   [address1]
                         ;second rule (and so on) very simple, unique cat-item
push      "\w<n>\t"
push      1
addtrie   [address2]
                         ;third rule (here is the trick: multiple cat-items in one of the words)
push      "\w<det>\t"
push      "\w<n>\t"
push      2              ; we have 'det' followed by a 'nom', so addtrie has to pop two elements
addtrie   [address3]

push      "\w<predet>\t"
push      "\w<n>\t"
push      2
addtrie   [address3]
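
One possible reading of `addtrie` is sketched below: each pattern becomes an edge in a trie, and the node reached by the last pattern stores the rule's address. The nested-dictionary representation and the names `new_node` and `lookup` are assumptions made for illustration; the lookup compares pattern strings literally instead of running them as regular expressions.

```python
def new_node():
    return {"children": {}, "address": None}


def addtrie(trie, stack, address):
    """Pop the pattern count, then that many patterns, and add the path to the trie."""
    count = stack.pop()
    start = len(stack) - count
    patterns = stack[start:]
    del stack[start:]
    node = trie
    for pattern in patterns:
        node = node["children"].setdefault(pattern, new_node())
    node["address"] = address


def lookup(trie, patterns):
    """Return the address stored for this exact pattern sequence, or None."""
    node = trie
    for pattern in patterns:
        node = node["children"].get(pattern)
        if node is None:
            return None
    return node["address"]


DET, PREDET, NOM = r"\w<det>\t", r"\w<predet>\t", r"\w<n>\t"
trie = new_node()
stack = []
rules = [
    ([DET], "address1"), ([PREDET], "address1"),            # first rule
    ([NOM], "address2"),                                    # second rule
    ([DET, NOM], "address3"), ([PREDET, NOM], "address3"),  # third rule
]
for patterns, address in rules:
    stack.extend(patterns)         # push each pattern
    stack.append(len(patterns))    # push the pattern count
    addtrie(trie, stack, address)
```

Because the third rule shares its first edge with the first rule, `lookup(trie, [DET])` and `lookup(trie, [DET, NOM])` return different addresses from the same branch.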

Example 3

XML t1x Code

<def-macro n="f_coma" npar="1">
  <choose>
    <when>
      <test>
        <equal caseless="yes">
          <clip pos="1" side="sl" part="lem"/>
          <lit v="como"/>
        </equal>
      </test>
      <let>
        <clip pos="1" side="tl" part="lem"/>
        <get-case-from pos="1">
          <lit v="com a"/>
        </get-case-from>
      </let>
    </when>
  </choose>
</def-macro>

Compiled code

f_coma:  push      1        ; "pos" of "clip"
         push      "^\w+"   ; "lem"
         clipsl             ; pushes the clipped value onto the top of the stack;
                            ; the "sl" side is implied by the instruction name
         push      "como"
         cmpi               ; caseless comparison; pops both operands
         jnz       end      ; if the comparison does not succeed, go to end
                            ; semantics: j = jump, n = not, z = zero flag is set
                            ; the zero flag is set when a comparison succeeds
                            ; or an arithmetic operation gives 0
         push      1        ; "pos" of "clip"
         push      "^\w+"   ; "lem"
         push      "com a"
         storetl            ; stores the value on top of the stack
                            ; at position 1, "tl" side, part "lem"

end:     ...
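
The clip and store steps above amount to a regex search and a regex replacement on the word at a given position. The sketch below uses Python's `re` module to show the idea; the helper names `clip` and `store`, the dictionary-per-side layout, and the sample word forms are assumptions, and `get-case-from` is not modelled.

```python
import re


def clip(words, pos, regex):
    """Return the part of the word at `pos` matched by `regex` ('' if no match)."""
    match = re.search(regex, words[pos])
    return match.group(0) if match else ""


def store(words, pos, regex, data):
    """Replace the first match of `regex` in the word at `pos` with `data`."""
    # A function replacement keeps `data` literal (no backslash escapes expanded).
    words[pos] = re.sub(regex, lambda _m: data, words[pos], count=1)


LEM = r"^\w+"                          # the "lem" part, as in the compiled code
source = {1: "Como<cnjadv>"}           # hypothetical source-language word
target = {1: "com<cnjadv>"}            # hypothetical target-language word

if clip(source, 1, LEM).lower() == "como":   # clipsl + cmpi
    store(target, 1, LEM, "com a")           # storetl
```

After running this, `target[1]` is `"com a<cnjadv>"`: only the lemma is replaced and the tags are left untouched.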

Example 4

XML t1x Code

<test>
 <or>
   <not>
     <equal>
       <clip pos="1" side="sl" part="gen"/>
       <clip pos="3" side="sl" part="gen"/>
     </equal>
   </not>
   <not>
     <equal>
       <clip pos="2" side="sl" part="gen"/>
       <clip pos="3" side="sl" part="gen"/>
     </equal>
   </not>
 </or>
</test>

Compiled code

start:      push       1
            push       [regex]           ; part="gen"
            clipsl
            push       3
            push       [regex]           ; part="gen"
            clipsl
            cmp                          ; compare (case sensitive)
            pushnz                       ; push NOT(zero flag) onto the stack
            
            push       2
            push       [regex]           ; part="gen"
            clipsl
            push       3
            push       [regex]           ; part="gen"
            clipsl
            cmp                          ; compare (case sensitive)
            pushnz
            
            or                           ; pop 2 items and OR, push result in stack
            jnz        end               ; jump if zero flag is 0 (we did not get ZERO as the result)
                        
            ... ... ...
            (code for successful test)
            ... ... ...
end:        ...
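
The boolean part of this test can be traced in a few lines. The sketch follows the comments in the compiled code: `cmp` sets the zero flag, `pushnz` pushes its negation, and `or` pops two values and pushes their OR. How `or` interacts with the zero flag before `jnz` is not spelled out in the listing, so this illustration only computes the value left on the stack; the function names and the sample tags are hypothetical.

```python
def cmp_then_pushnz(stack, first, second):
    zero_flag = 1 if first == second else 0   # cmp (case sensitive)
    stack.append(1 - zero_flag)               # pushnz


def evaluate_test(gen1, gen2, gen3):
    """Value of (gen1 != gen3) OR (gen2 != gen3), computed on a stack."""
    stack = []
    cmp_then_pushnz(stack, gen1, gen3)
    cmp_then_pushnz(stack, gen2, gen3)
    right = stack.pop()
    left = stack.pop()
    stack.append(left | right)                # or
    return stack.pop()


all_agree = evaluate_test("<m>", "<m>", "<m>")
one_differs = evaluate_test("<m>", "<f>", "<m>")
```

With these inputs `all_agree` is 0 and `one_differs` is 1, matching the XML test, which succeeds when any gender disagrees with position 3.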

Example 5

XML t1x Code

<def-list n="verbos_est">
  <list-item v="actuar"/>
  <list-item v="buscar"/>
  <list-item v="estudiar"/>
  <list-item v="existir"/>
  <list-item v="ingressar"/>
  <list-item v="introduir"/>
  <list-item v="penetrar"/>
  <list-item v="publicar"/>
  <list-item v="treballar"/>
  <list-item v="viure"/>
</def-list>

<rule>
  <pattern>
    <pattern-item n="verb"/>
    <pattern-item n="a"/>
  </pattern>
  <action>
    <choose>
      <when>
        <test>
          <in caseless="yes">
            <clip pos="1" side="sl" part="lem"/>
            <list n="verbos_est"/>
          </in>
        </test>
        <let>
          <clip pos="2" side="tl" part="lem"/>
          <lit v="en"/>
        </let>
      </when>
    </choose>
  </action>
</rule>

Compiled code

                push       "actuar"
                push       "buscar"
                push       "estudiar"
                push       "existir"
                push       "ingressar"
                push       "introduir"
                push       "penetrar"
                push       "publicar"
                push       "treballar"
                push       "viure"
                push       10                ; number of elements in the list
                mklist     verbos_est        ; create a list variable named 'verbos_est' and move the top 10 items
                                             ; from the stack into it
                                         
rule1:          push      [regex_verb]
                push      [regex_a]
                push      2
                addtrie   rule1_action
                ... ... ...
                ... ... ...
                         
rule1_action:   push      1
                push      "^\w+"            ; lem
                clipsl                      ; the lemma is now on the stack
                incini    verbos_est        ; if in verbos_est (ignore case), set ZF = 1, else ZF = 0
                jnz       rule1_end

                push      2
                push      "^\w+"
                push      "en"
                storetl
rule1_end:      ...
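
The two list instructions used here, `mklist` and `incini`, do not appear in the instruction table, so the sketch below is only one plausible reading of the comments in the compiled code. It assumes that `incini` pops the value it tests and that lists live in a dictionary keyed by name; both are assumptions, as is the shortened three-item list.

```python
def mklist(stack, lists, name):
    """Pop the item count, then move that many items into lists[name]."""
    count = stack.pop()
    start = len(stack) - count
    lists[name] = stack[start:]
    del stack[start:]


def incini(stack, lists, name):
    """Pop a value; return the new zero flag (1 if in the list, ignoring case)."""
    value = stack.pop().lower()
    return 1 if value in (item.lower() for item in lists[name]) else 0


lists = {}
stack = ["actuar", "buscar", "viure", 3]    # three items followed by their count
mklist(stack, lists, "verbos_est")

stack.append("Buscar")                      # as if clipsl had pushed the lemma
zero_flag = incini(stack, lists, "verbos_est")
```

Here `zero_flag` ends up as 1, so the `jnz rule1_end` that follows would not jump and the `storetl` block would run.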

Example 6

XML t1x Code

<def-macro n="firstWord" npar="1">
  <choose>
    <when>
      <test>
        <equal>
          <clip pos="1" side="sl" part="a_np_acr"/>
          <lit v=""/>
        </equal>
      </test>
      <choose>
        <when>
          <test>
            <equal>
              <var n="EOS"/>
              <lit v="true"/>
            </equal>
          </test>
          <modify-case>
            <clip pos="1" side="tl" part="lem"/>
            <lit v="aa"/>
          </modify-case>
          <let>
            <var n="caseFirstWord"/>
            <lit v="Aa"/>
          </let>
        </when>
        <otherwise>
          <let>
            <var n="caseFirstWord"/>
            <lit v="aa"/>
          </let>
        </otherwise>
      </choose>
    </when>
    <otherwise>
      <let>
        <var n="caseFirstWord"/>
        <lit v="aa"/>
      </let>
    </otherwise>
  </choose>
  <let>
    <var n="EOS"/>
    <lit v="false"/>
  </let>
</def-macro>


<rule comment="REGLA: DET DET ADJ NOM (your many beautiful cats)">
  ... ...
  <action>
    <call-macro n="firstWord">
      <with-param pos="1"/>
    </call-macro>
    <call-macro n="f_concord4">
      <with-param pos="4"/>
      <with-param pos="3"/>
      <with-param pos="2"/>
      <with-param pos="1"/>
    </call-macro>
    ...
    <out> 
      <chunk name="det_det_nom_adj" case="caseFirstWord">
      ... ...
      </chunk>
    </out>
  </action>
</rule>

Compiled code

firstWord:
                   ... ...              ; normal translation of instructions; all variables are assumed global
                   ... ...
                   ret                  ; the ret instruction does two things:
                                        ;    1. pops the 'frame stack' and restores the current 'local variable
                                        ;       frame' from the popped value (really a pointer assignment); the
                                        ;       C++ version will also do the necessary deallocations
                                        ;    2. pops the global stack and updates PC with the popped value
                   ... ...
                   ... ...
rule_ddan_action:  push     1           ; pos = 1
                   push     1           ; number of parameters = 1
                   call     firstWord   ; macro label
                                        ; the call instruction does a number of things:
                                        ;    1. temppc = PC + 1, set PC = firstWord
                                        ;    2. pushes the current 'local variable frame' onto the 'frame stack'
                                        ;    3. creates a new 'local variable frame'
                                        ;    4. pops the arguments from the stack and places them in the 'local
                                        ;       variable frame'
                                        ;    5. pushes temppc onto the global stack (it will be used by the ret
                                        ;       instruction)
                                        ;    6. continues (the instruction at firstWord is evaluated next)
                   push     1           ; notice that the arguments are pushed in reverse order;
                                        ; when popped, they will be in the right order
                   push     2
                   push     3
                   push     4
                   push     4           ; number of parameters = 4
                   call     f_concord4
                   ... ...
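
The bookkeeping that `call` and `ret` perform, as listed in the comments above, can be sketched like this. The class `CallState`, its field names, and the numeric addresses are hypothetical; the local variable frame is modelled as a plain list, and the target label is passed as an instruction index.

```python
class CallState:
    def __init__(self):
        self.stack = []         # global data stack
        self.frame_stack = []   # saved local variable frames
        self.frame = []         # current local variable frame
        self.pc = 0


def call(state, target):
    return_pc = state.pc + 1                    # 1. temppc = PC + 1
    state.frame_stack.append(state.frame)       # 2. save the current frame
    npar = state.stack.pop()                    #    number of parameters
    state.frame = [state.stack.pop() for _ in range(npar)]  # 3-4. new frame with the args
    state.stack.append(return_pc)               # 5. return address for ret
    state.pc = target                           # 6. continue at the macro


def ret(state):
    state.frame = state.frame_stack.pop()       # restore the caller's frame
    state.pc = state.stack.pop()                # jump back to the saved PC


state = CallState()
state.pc = 10                                   # hypothetical address of the call
state.stack = [1, 2, 3, 4, 4]                   # args in reverse order, then npar = 4
call(state, 100)                                # 100: hypothetical address of f_concord4
frame_in_macro = list(state.frame)
ret(state)
```

Inside the macro the frame is `[4, 3, 2, 1]`, the same order as the `with-param` elements in the XML, and after `ret` execution resumes at address 11 with an empty data stack.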

Development Notes

  • Neither macros nor actions need to return a value (unlike conventional functions), so no provision for returning a value on the stack is needed.
  • The local variable frame is actually a queue whose maximum length equals the maximum pattern length in the trie.
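
A bounded queue of this kind can be sketched with `collections.deque` and its `maxlen` argument. The value of `MAX_PATTERN_LENGTH` and the choice to discard the oldest entry when the queue is full are assumptions made for illustration; the notes above only state the maximum length.

```python
from collections import deque

MAX_PATTERN_LENGTH = 3                       # assumed longest pattern in the trie

# A deque with maxlen never grows past that length; appending to a full
# deque silently drops the item at the opposite end.
frame = deque(maxlen=MAX_PATTERN_LENGTH)
for position in (1, 2, 3, 4):
    frame.append(position)

contents = list(frame)
```

After the loop `contents` is `[2, 3, 4]`: the frame never holds more entries than the longest pattern can match.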