Difference between revisions of "User:Mlforcada/intermediatelanguagefortransfer"

From Apertium
(Created page with "The first 2 characters of the file are the length of the longest pattern and the number of rules. {| class="wikitable" border="1" |- ! Code ! Name ! Action |- | R [int] | rul...")
 
 
(3 intermediate revisions by the same user not shown)
Earlier revision (left-hand column of the diff, from line 1; the right-hand column matches the latest revision shown below):

The first 2 characters of the file are the length of the longest pattern and the number of rules.

{| class="wikitable" border="1"
|-
! Code
! Name
! Action
|-
| R [int]
| rule
| marks the start of a new rule composed of the next [int] characters
|-
| s [int]
| string
| pushes the next [int] characters onto the stack as a literal string
|-
| j [int]
| jump
| increments the instruction pointer by [int]
|-
| ? [int]
| jump if not
| pops a bool off the stack, increments instruction pointer by [int] if it is false
|-
| & [int]
| and
| pops [int] bools off the stack and pushes whether all of them are true
|-
| <code>| [int]</code>
| or
| pops [int] bools off the stack and pushes whether any of them are true
|-
| !
| not
| logically negates top of stack
|-
| = / =#
| equal
| push whether the first two strings popped are the same (=# ignores case)
|-
| ( / (#
| begins with
| push whether the first string popped occurs at the beginning of the second (<code>(#</code> ignores case)
|-
| ) / )#
| ends with
| push whether the first string popped occurs at the end of the second (<code>)#</code> ignores case)
|-
| [ / [#
| begins with list
| push whether the second string popped begins with any member of the list named by the first string popped ([# ignores case)
|-
| ] / ]#
| ends with list
| push whether the second string popped ends with any member of the list named by the first string popped (]# ignores case)
|-
| c / c#
| contains
| push whether the first string popped appears anywhere in the second (c# ignores case)
|-
| n / n#
| in
| push whether the second string popped is a member of the list named by the first (n# ignores case)
|-
| >
| begin let
| indicates that the next clip or var statement should not be evaluated
|-
| * / *#
| end let clip
| pops a value and an unevaluated clip and sets the clip to the value (*# copies the case of the value to the clip)
|-
| 4 / 4#
| end let var
| pops a value and a variable name and sets the variable to the value (4# copies the case of the value to the variable)
|-
| < [int]
| out
| pops [int] chunks off the stack and appends them to the output queue in the order that they were pushed (in recursive mode, the output queue is later passed back to the rule applier)
|-
| . [int]
| clip
| if preceded by >, pushes [int] onto the stack, otherwise pops a string off the stack and retrieves that property of the position indicated by [int]
|-
| $
| var
| if preceded by >, do nothing, otherwise pops a string off the stack and pushes the value of the variable with that name
|-
| G
| get case
| pops a string off the stack, pushes "AA", "Aa", or "aa" depending on its case
|-
| A
| copy case
| pops a string off the stack, copies its cases onto the next string on the stack
|-
| + [int]
| concat
| pops [int] strings off the stack, concatenates them, and pushes the result
|-
| { [int]
| chunk
| pops [int] items off the stack and puts them into a chunk (there are currently problems with this command)
|-
| p
| pseudolemma
| pop a chunk off the stack and push its pseudolemma
|-
| (space)
| space
| push a blank containing a single space onto the stack
|-
| _ [int]
| blank
| push the superblank after position [int] onto the stack
|}

Latest revision as of 17:46, 6 June 2019


The first two characters of the file give the length of the longest pattern and the number of rules.

This page is an attempt to specify a complete transfer machine. The stack can apparently hold Booleans, integers, strings (also tags?), superblanks (which are not strings) and chunks; the types need to be discussed more carefully. It can also hold strings that name global variables of different types, such as lists (this needs to be carefully specified). The design is heavily inspired by other stack languages such as FORTH.
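
Since the types are still under discussion, the following is only one possible way to keep the stack values apart, sketched in Python. The class names (<code>Blank</code>, <code>LexicalForm</code>, <code>Chunk</code>), their fields, and the helper <code>type_name</code> are all hypothetical; lexical forms are included because the instruction table refers to them.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Blank:
    """A superblank; a separate class so it is never mistaken for a string."""
    text: str = " "


@dataclass
class LexicalForm:
    """Hypothetical layout: a lemma plus an ordered list of tags."""
    lemma: str = ""
    tags: List[str] = field(default_factory=list)


@dataclass
class Chunk:
    """Hypothetical layout: pseudolemma, chunk tags, and contained lexical forms."""
    pseudolemma: str = ""
    tags: List[str] = field(default_factory=list)
    forms: List[LexicalForm] = field(default_factory=list)


# Everything the stack may hold under this sketch.
Value = Union[bool, int, str, Blank, LexicalForm, Chunk]


def type_name(value: Value) -> str:
    """Return a readable type label for a stack value."""
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, str):
        return "string"
    return type(value).__name__
```

Variable and list names would simply be strings under this sketch; whether they deserve a type of their own is one of the open questions noted above.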

{| class="wikitable" border="1"
|-
! Mnemonic
! Action
|-
| rule [int]
| marks the start of a new rule composed of the next [int] characters
|-
| drop
| drops the top of the stack
|-
| dup
| duplicates the top of the stack
|-
| over
| pushes onto the stack a copy of the element just below the top of the stack
|-
| swap
| swaps the two topmost elements of the stack
|-
| "[string]"
| pushes the next [int] characters onto the stack as a literal; it can be the name of a var, clip [...] . Stored as a single-byte opcode, a length, and that number of bytes.
|-
| [int]
| pushes the integer onto the stack; stored as a single-byte opcode followed by a fixed number of bytes
|-
| False
| pushes False onto the stack
|-
| True
| pushes True onto the stack
|-
| jump [int]
| increments the instruction pointer by [int] (stored as a single-byte opcode and a fixed number of bytes for the instruction pointer)
|-
| jumponfalse
| pops a Boolean off the stack and increments the instruction pointer by [int] if it is false
|-
| and
| pops two Booleans off the stack and pushes whether both are true
|-
| or
| pops two Booleans off the stack and pushes whether either is true
|-
| not
| logically negates the top of the stack
|-
| equal
| pops two strings off the stack and pushes True if they are equal and False otherwise
|-
| equalfold
| pops two strings off the stack and pushes True if they are equal (regardless of case) and False otherwise
|-
| isprefix
| pops two strings off the stack and pushes whether the first string popped is a prefix of the second
|-
| isprefixfold
| pops two strings off the stack and pushes whether the first string popped is a prefix of the second (regardless of case)
|-
| issuffix
| pops two strings off the stack and pushes whether the first string popped is a suffix of the second
|-
| issuffixfold
| pops two strings off the stack and pushes whether the first string popped is a suffix of the second (regardless of case)
|-
| issubstring
| pops two strings off the stack and pushes whether the first string popped is a substring of the second
|-
| issubstringfold
| pops two strings off the stack and pushes whether the first string popped is a substring of the second (regardless of case)
|-
| hasprefix
| pushes whether a prefix of the second string popped is in the list named by the first string popped
|-
| hasprefixfold
| pushes whether a prefix of the second string popped is in the list named by the first string popped (regardless of case)
|-
| hassuffix
| pushes whether a suffix of the second string popped is in the list named by the first string popped
|-
| hassuffixfold
| pushes whether a suffix of the second string popped is in the list named by the first string popped (regardless of case)
|-
| in
| pushes whether the second string popped is a member of the list named by the first string popped
|-
| infold
| pushes whether the second string popped is a member of the list named by the first string popped (regardless of case)
|-
| fetch
| replaces the var or clip name on top of the stack with the value contained in the named var or clip
|-
| store
| stores the string just below the top of the stack in the var or clip named by the top of the stack, and pops both values
|-
| printchunk
| pops one chunk off the stack and appends it to the output queue (in recursive mode, the output queue is later passed back to the rule applier)
|-
| printlf
| pops one lexical form off the stack and appends it to the output queue (...)
|-
| property
| pops one string off the stack and then an integer n off the stack, and stores the name of the n-th property [???]
|-
| getcase
| pops a string off the stack and pushes the string "AA", "Aa", or "aa" depending on its case
|-
| applycase
| pops one string off the stack and modifies the new top of the stack according to the case pattern of the string popped
|-
| concat
| pops two strings off the stack, concatenates them, and pushes the result
|-
| newchunk
| pushes an empty chunk onto the stack
|-
| newlf
| pushes an empty lexical form onto the stack
|-
| lemma
| adds the string on top of the stack as pseudolemma to the chunk below it, or as lemma to the lexical form below it
|-
| addtag
| appends the string (tag?) on top of the stack as a tag to the chunk or lexical form below it
|-
| addlf
| appends the lexical form on top of the stack to the chunk below it...??
|-
| [...]
| [do we need more lexical form construction items]
|-
| pseudolemma
| pops a chunk off the stack and pushes its pseudolemma
|-
| blank
| pops an integer n off the stack and pushes the n-th superblank onto the stack. If n is out of range (e.g. -1), pushes a blank containing a single space
|}
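
To make the stack discipline of the instructions above concrete, here is a minimal Python sketch of an interpreter for a subset of them (stack shuffling, jumps, Boolean operators, the two-string predicates, <code>concat</code>, <code>getcase</code> and <code>applycase</code>). It is not the real machine and rests on several assumptions: programs are lists of tuples rather than the byte encoding, jump offsets count instructions and are added after the normal advance past the jump, <code>concat</code> puts the lower stack element first, and the case classification in <code>get_case</code> is a guess. The names <code>run</code>, <code>get_case</code>, <code>apply_case</code> and the <code>"push"</code> pseudo-opcode are hypothetical.

```python
def get_case(s):
    """Classify case as "AA", "Aa" or "aa" (assumed rules, not specified above)."""
    if len(s) > 1 and s.isupper():
        return "AA"
    if s[:1].isupper():
        return "Aa"
    return "aa"


def apply_case(pattern, s):
    """Rewrite s so that it follows the given case pattern."""
    if pattern == "AA":
        return s.upper()
    if pattern == "Aa":
        return s[:1].upper() + s[1:].lower()
    return s.lower()


# Two-operand instructions; "first" is the first value popped (the old top).
BINARY_OPS = {
    "and": lambda first, second: first and second,
    "or": lambda first, second: first or second,
    "equal": lambda first, second: first == second,
    "equalfold": lambda first, second: first.lower() == second.lower(),
    "isprefix": lambda first, second: second.startswith(first),
    "isprefixfold": lambda first, second: second.lower().startswith(first.lower()),
    "issuffix": lambda first, second: second.endswith(first),
    "issuffixfold": lambda first, second: second.lower().endswith(first.lower()),
    "issubstring": lambda first, second: first in second,
    "issubstringfold": lambda first, second: first.lower() in second.lower(),
    "concat": lambda first, second: second + first,  # assumed operand order
}


def run(program):
    """Execute a list of (mnemonic, *args) tuples and return the final stack."""
    stack = []
    ip = 0
    while ip < len(program):
        op, *args = program[ip]
        ip += 1  # normal advance; jump offsets are added on top of this
        if op == "push":  # stands in for "[string]", [int], True and False
            stack.append(args[0])
        elif op == "drop":
            stack.pop()
        elif op == "dup":
            stack.append(stack[-1])
        elif op == "over":
            stack.append(stack[-2])
        elif op == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "jump":
            ip += args[0]
        elif op == "jumponfalse":
            if not stack.pop():
                ip += args[0]
        elif op == "not":
            stack.append(not stack.pop())
        elif op in BINARY_OPS:
            first, second = stack.pop(), stack.pop()
            stack.append(BINARY_OPS[op](first, second))
        elif op == "getcase":
            stack.append(get_case(stack.pop()))
        elif op == "applycase":
            pattern = stack.pop()
            stack.append(apply_case(pattern, stack.pop()))
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return stack


program = [
    ("push", "Casa"),
    ("dup",),
    ("push", "ca"),
    ("isprefixfold",),   # is "ca" a prefix of "Casa", ignoring case?
    ("jumponfalse", 2),  # if not, skip the next two instructions
    ("push", "s"),
    ("concat",),         # "Casa" + "s"
    ("getcase",),
]
print(run(program))  # ['Aa']
```

Lists, clips, variables, chunks and lexical forms are left out because their representation is not yet pinned down in the table.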