Difference between revisions of "VM for transfer"
Revision as of 10:04, 12 July 2011
Instruction Set
Mnemonic | Opcode (in hex) | Other operands | Stack [before]→[after] (top, top-1, ...) | Description |
---|---|---|---|---|
push | - | value | [empty] → value | Pushes a string or a variable's value onto the stack. Strings go between quotes ("string"); variable names do not |
pushbl | - | N/A | [empty] → blank | Pushes a blank onto the stack |
pushsb | - | pos | [empty] → superblank | Pushes the superblank at 'pos' onto the stack |
append | - | N | valueN, ..., value1, varName → [empty] | Pops 'N' elements and appends them to a variable or clip |
concat | - | N | valueN, ..., value1 → value1...valueN | Pops 'N' elements and pushes them back concatenated |
clip | - | N/A | part → value | Obtains the 'part' in the single language available (inter/post-chunk) and pushes the 'value' onto the stack |
clipsl | - | N/A | part, pos → value | Obtains the 'part' in source language in position 'pos' and pushes the 'value' onto the stack |
cliptl | - | N/A | part, pos → value | Obtains the 'part' in target language in position 'pos' and pushes the 'value' onto the stack |
storecl | - | N/A | value, part → [empty] | Stores 'value' as the 'part' in the single language available (inter/post-chunk) |
storesl | - | N/A | value, part, pos → [empty] | Stores 'value' as the 'part' of the source language in position 'pos' |
storetl | - | N/A | value, part, pos → [empty] | Stores 'value' as the 'part' of the target language in position 'pos' |
storev | - | N/A | value, varName → [empty] | Stores 'value' in the variable with name 'varName' |
addtrie | - | address | N, patternN, ..., pattern1 → [empty] | Pops 'N' patterns and creates a trie entry pointing to 'address' |
lu | - | N | valueN, ..., value1 → ^(lexical_unit)$ | Pops 'N' values from the stack, creates a lexical unit ^...$ with them and pushes the lu back onto the stack |
mlu | - | N | luN, ..., lu1 → multiword | Pops 'N' lexical units from the stack, creates a multiword with them and pushes the multiword back onto the stack |
lu-count | - | N/A | [empty] → number | Pushes the number of lexical units (words inside the chunk) in the rule onto the stack |
chunk | - | N | elemN-2, ... , elem1, <tags>, name → ^name<tags>{elem1...elemN-2}$ | Pops 'N' elements from the stack (the name, the tags and 'N-2' chunk elements), creates the chunk and pushes it back onto the stack |
out | - | N | valueN, ..., value1 → [empty] | Pops 'N' values from the stack and outputs them |
cmp | - | N/A | value2, value1 → result | Pops 'value1' and 'value2' and compares them; pushes 1 (true) if they are equal and 0 (false) otherwise |
cmpi | - | N/A | value2, value1 → result | Pops 'value1' and 'value2' and compares them, ignoring case in both strings; pushes 1 (true) if they are equal and 0 (false) otherwise |
cmp-substr | - | N/A | value2, value1 → result | Tests if 'value1' contains the substring 'value2', result can be 1 (true) or 0 (false). |
cmpi-substr | - | N/A | value2, value1 → result | Tests if 'value1' contains the substring 'value2' (ignoring case for each string), result can be 1 (true) or 0 (false). |
not | - | N/A | value → result | Negates the value on top of the stack: 0 becomes 1 and 1 becomes 0 |
and | - | N | valueN, ..., value1 → result | And operation of 'N' values, result can be 1 (true) or 0 (false) |
or | - | N | valueN, ..., value1 → result | Or operation of 'N' values, result can be 1 (true) or 0 (false) |
in | - | N/A | list, value → result | Searches for 'value' in 'list' and pushes the result |
inig | - | N/A | list, value → result | Searches for 'value' in 'list', ignoring case, and pushes the result |
jmp | - | label | [empty] → [empty] | Jumps to the label, unconditionally |
jz | - | label | top → [empty] | Jumps to the label if stack.top == 0 |
jnz | - | label | top → [empty] | Jumps to the label if stack.top == 1 |
call | - | label | N, argN, ..., arg1 → [empty] | Calls a macro with the arguments on the stack |
ret | - | N/A | [empty] → [empty] | Returns from a macro; the VM handles the PC automatically |
nop | - | N/A | [empty] → [empty] | No operation |
case-of | - | N/A | container → case | Gets the case from the container in the stack. The container would usually be the result of a clip instruction but can be any string. |
get-case-from | - | N/A | pos → case | Gets the case from the lexical unit in position 'pos' |
modify-case | - | N/A | case, container → modifiedContainer | Modifies the case of the 'container' to 'case' and leaves the modified container on the stack |
begins-with | - | N/A | value2, value1 → result | Checks if 'value1' begins with 'value2' and pushes 1 (true) or 0 (false), 'value2' can be a list |
begins-with-ig | - | N/A | value2, value1 → result | Checks if 'value1' begins with 'value2' (ignoring the case) and pushes 1 (true) or 0 (false), 'value2' can be a list |
ends-with | - | N/A | value2, value1 → result | Checks if 'value1' ends with 'value2' and pushes 1 (true) or 0 (false), 'value2' can be a list |
ends-with-ig | - | N/A | value2, value1 → result | Checks if 'value1' ends with 'value2' (ignoring the case) and pushes 1 (true) or 0 (false), 'value2' can be a list |
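The stack column lists the top of the stack first, so the last value pushed is the first one popped. The following Python sketch illustrates that ordering for concat, lu and chunk. It is only an illustration under assumptions: the helper names (pop_n, concat, lu, chunk) and the use of a Python list as the stack are hypothetical, and building the lexical unit by plain concatenation is inferred from the stack notation in the table, not taken from the VM's source.

```python
def pop_n(stack, n):
    """Pop n items and return them in the order they were pushed."""
    items = [stack.pop() for _ in range(n)]
    items.reverse()
    return items


def concat(stack, n):
    # valueN, ..., value1 -> value1...valueN
    stack.append("".join(pop_n(stack, n)))


def lu(stack, n):
    # Assumption: the popped values are simply joined between '^' and '$'.
    stack.append("^" + "".join(pop_n(stack, n)) + "$")


def chunk(stack, n):
    # The name is pushed first, then the tags, then the N-2 elements.
    name, tags, *elems = pop_n(stack, n)
    stack.append("^" + name + tags + "{" + "".join(elems) + "}$")


# Build a chunk containing one lexical unit.
stack = ["nom", "<SN><m><sg>"]      # chunk name and tags
stack += ["gato", "<n><m><sg>"]     # lemma and tags of the lexical unit
lu(stack, 2)
chunk(stack, 3)
print(stack[0])  # ^nom<SN><m><sg>{^gato<n><m><sg>$}$
```

Returning the popped items in push order keeps each helper a direct transcription of the right-hand side of its stack notation.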
- Lists are represented as a concatenation of items separated by '|', e.g. uno|otro|poco|cuánto|menos|mucho|tanto|demasiado
- The case is represented as "aa" (all lowercase), "Aa" (first letter uppercase) and "AA" (all uppercase).
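The two conventions above can be sketched in Python as follows. This is a hedged illustration, not the VM's implementation: the function names (in_list, begins_with, case_of) are hypothetical, and the handling of edge cases such as a single uppercase letter is an assumption.

```python
def in_list(value, items, ignore_case=False):
    """Search 'value' in a '|'-separated list, as 'in' and 'inig' describe."""
    candidates = items.split("|")
    if ignore_case:
        value = value.lower()
        candidates = [c.lower() for c in candidates]
    return 1 if value in candidates else 0


def begins_with(value, prefixes):
    """Return 1 if 'value' begins with any item of the '|'-separated list."""
    return 1 if any(value.startswith(p) for p in prefixes.split("|")) else 0


def case_of(text):
    """Classify 'text' as "aa", "Aa" or "AA".

    Assumption: a single uppercase letter is reported as "Aa".
    """
    if len(text) > 1 and text.isupper():
        return "AA"
    if text[:1].isupper():
        return "Aa"
    return "aa"


words = "uno|otro|poco|cuánto|menos|mucho|tanto|demasiado"
print(in_list("poco", words))                    # 1
print(in_list("Poco", words))                    # 0
print(in_list("Poco", words, ignore_case=True))  # 1
print(case_of("casa"), case_of("Casa"), case_of("CASA"))  # aa Aa AA
```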
Code generation
Code sections
The code generated by the compiler is divided into the following sections:
Section | Code | Information |
---|---|---|
Header | <code>#<assembly><br />#<transfer default="chunk"></code> | This section establishes the type of code generated and the transfer stage. |
Initialization | <code>push "genere"<br />push "<m>"<br />storev<br />...<br />jmp section_rules_start</code> | This section initializes the variables with their default values and executes any other initialization code. It ends with a jmp to the rules section: a rule only executes when its pattern is matched, but all the patterns in the rules section must be processed first. |
Macros | <code>macro_firstWord_start:<br />...<br />macro_firstWord_end:<br />...</code> | This section contains the code of every macro, delimited by labels. Each macro can be called with the 'call' instruction. |
Patterns | <code>section_rules_start:<br />patterns_start:<br />push "all<predet><sp>"<br />push "<n><pl>"<br />push 2<br />addtrie action_0_start<br />...<br />patterns_end:</code> | This section adds all the patterns to the system trie. In this example two patterns are pushed, then the number of patterns, and finally the addtrie instruction pops them and adds a trie entry pointing to rule 0. |
Rules | <code>action_0_start:<br />...<br />action_0_end:<br />...<br />section_rules_end:</code> | Finally, the rules section contains every rule, delimited by its labels, with all its code. |
- One-line comments start with the '#' symbol at the beginning of the line.
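To make the flow through these sections concrete, here is a minimal Python sketch that runs a tiny subset of the instructions (push, storev, jmp, addtrie) over a program laid out as described above. It rests on several assumptions: the function name run, the dictionary standing in for the system trie and the tuple used as its key are all hypothetical. The jump target is written as section_rules_start so that it matches the label defined at the start of the patterns. Header lines also start with '#', so this sketch skips them like comments.

```python
def run(source):
    """Run a tiny subset of the assembly and return (variables, trie)."""
    lines = [ln.strip() for ln in source.splitlines()]
    # Drop blank lines and lines starting with '#'.
    lines = [ln for ln in lines if ln and not ln.startswith("#")]

    # First pass: map each label to the index of the next instruction.
    labels, code = {}, []
    for ln in lines:
        if ln.endswith(":"):
            labels[ln[:-1]] = len(code)
        else:
            code.append(ln)

    stack, variables, trie = [], {}, {}
    pc = 0
    while pc < len(code):
        op, _, arg = code[pc].partition(" ")
        pc += 1
        if op == "push":
            if arg.startswith('"'):
                stack.append(arg[1:-1])        # quoted string
            elif arg.isdigit():
                stack.append(int(arg))         # number
            else:
                stack.append(variables[arg])   # variable value
        elif op == "storev":
            value = stack.pop()                # top of the stack
            name = stack.pop()                 # top-1
            variables[name] = value
        elif op == "jmp":
            pc = labels[arg]
        elif op == "addtrie":
            n = stack.pop()
            patterns = [stack.pop() for _ in range(n)][::-1]
            # Assumption: a dict keyed by the pattern sequence plays the trie.
            trie[tuple(patterns)] = arg
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return variables, trie


PROGRAM = '''
# Initialization
push "genere"
push "<m>"
storev
jmp section_rules_start
macro_firstWord_start:
macro_firstWord_end:
section_rules_start:
patterns_start:
push "all<predet><sp>"
push "<n><pl>"
push 2
addtrie action_0_start
patterns_end:
action_0_start:
action_0_end:
section_rules_end:
'''

variables, trie = run(PROGRAM)
print(variables)  # {'genere': '<m>'}
print(trie)       # {('all<predet><sp>', '<n><pl>'): 'action_0_start'}
```

Resolving labels in a first pass lets jmp and addtrie refer to labels defined later in the file, which the section layout requires because the initialization code jumps forward over the macros.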