Difference between revisions of "Talk:Reordering superblanks"
Line 1:
==Ensuring transfer rules output all regular superblanks==

Transfer rules sometimes forget to include all (regular) superblanks from the input (see earlier [https://sourceforge.net/p/apertium/mailman/apertium-stuff/thread/20cf28cd0904300204v45f35e51i118f4d146f83748@mail.gmail.com/ discussion from 2009]). This can of course mess up HTML, and it is frustrating that the developer has to ensure all rules have the right number of <code><nowiki><b pos="N"/></nowiki></code>, e.g. for a three-lu pattern we need to output both <code><nowiki><b pos="1"/></nowiki></code> and <code><nowiki><b pos="2"/></nowiki></code>.

This could be done mechanically by transfer at runtime instead of by the rule writer. Any rule will match a certain number of lu's, with one (super)blank between each lu (currently available in the b elements), and the action part will output a certain number of lu's.

Line 14:
This can be made backwards compatible with existing rule files, by simply ignoring any existing <b> elements that have the pos attribute.

- However, this solution does not help with the blanks-in-chunks problem. The [[Reordering superblanks#Possible solution]], however, would.
+ However, this solution does '''not''' help with the blanks-in-chunks problem. The [[Reordering superblanks#Possible solution]], however, would.
Revision as of 09:05, 26 May 2014
Ensuring transfer rules output all regular superblanks
Transfer rules sometimes forget to include all (regular) superblanks from the input (see earlier discussion from 2009). This can of course mess up HTML, and it is frustrating that the developer has to ensure all rules have the right number of <b pos="N"/> elements, e.g. for a three-lu pattern we need to output both <b pos="1"/> and <b pos="2"/>.
This could be done mechanically by transfer at runtime instead of by the rule writer. Any rule will match a certain number of lu's, with one (super)blank between each lu (currently available in the b elements), and the action part will output a certain number of lu's.
- For a 1-pattern rule, there can be no superblanks between patterns, so there are no superblanks to output. This is the simple case.
- For a 2-pattern rule, there is exactly one superblank between patterns. Now we have to run the rule and look at the output before printing it.
  - If the output contains zero or one chunks, put the superblank after the output.
  - If the output contains two or more chunks, put the superblank after the first chunk.
- Generalising this, look at the output and interleave chunks and superblanks, that is:
  - Read the first chunk, print that chunk, print the first superblank.
  - Read the second chunk, print that chunk, print the second superblank.
  - And so on until all chunks are read, then print any remaining superblanks.
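The interleaving steps above can be sketched roughly as follows. This is only an illustration in Python, not the transfer module's actual code: the function name `interleave` and the representation of chunks and superblanks as plain strings are assumptions made for the example.

```python
def interleave(chunks, superblanks):
    """Interleave a rule's output chunks with the superblanks it matched.

    Hypothetical helper: `chunks` is the list of chunks produced by the
    action part, `superblanks` the blanks found between the matched lu's.
    Each chunk is followed by the next unused superblank; superblanks left
    over once the chunks run out are appended at the end, so none is lost.
    """
    out = []
    for i, chunk in enumerate(chunks):
        out.append(chunk)
        if i < len(superblanks):
            out.append(superblanks[i])
    # Fewer chunks than superblanks: flush whatever is still unused.
    out.extend(superblanks[len(chunks):])
    return out


# 2-pattern rule, two output chunks: the blank goes after the first chunk.
print(interleave(["A", "B"], ["[<b>]"]))
# 2-pattern rule, one output chunk: the blank goes after the output.
print(interleave(["A"], ["[<b>]"]))
```

Note that when the output has more chunks than there are superblanks to place between them, this sketch emits the extra chunks with nothing in between; how ordinary spaces should be handled in that case is left open here.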
This can be made backwards compatible with existing rule files, by simply ignoring any existing <b> elements that have the pos attribute.
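To picture what ignoring those elements would amount to, here is a small Python sketch that strips every <b> carrying a pos attribute from a rule's output element while keeping plain <b/>. It is an assumption-laden illustration: the helper name `drop_positional_blanks` is hypothetical, the sample rule fragment is made up for the example, and a real implementation would more likely skip these elements inside transfer at runtime rather than rewrite the rule file.

```python
import xml.etree.ElementTree as ET


def drop_positional_blanks(out_xml):
    """Return the XML with every <b pos="..."/> removed; plain <b/> is kept."""
    root = ET.fromstring(out_xml)
    # Snapshot the traversal so removing children cannot disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "b" and "pos" in child.attrib:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


# Illustrative rule fragment (not taken from any real language pair).
sample = (
    '<out>'
    '<lu><clip pos="1" side="tl" part="whole"/></lu>'
    '<b pos="1"/>'
    '<lu><clip pos="2" side="tl" part="whole"/></lu>'
    '<b/>'
    '</out>'
)
print(drop_positional_blanks(sample))
```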
However, this solution does not help with the blanks-in-chunks problem; the approach described in Reordering superblanks#Possible solution would.