Talk:Reordering superblanks

From Apertium
Revision as of 08:44, 26 May 2014 by Unhammer (talk | contribs)

Ensuring transfer rules output all regular superblanks

A separate but related problem (see the earlier discussion from 2009) is that transfer rules sometimes forget to include all (regular) superblanks from the input. This can of course mess up HTML, and it is frustrating that the developer has to ensure every rule has the right number of <b pos="N"/> elements; e.g. a three-lu pattern needs to output both <b pos="1"/> and <b pos="2"/>.

This could be done mechanically by transfer at runtime instead of by the rule writer. Any rule will match a certain number of lu's, with one (super)blank between each lu (currently available in the b elements), and the action part will output a certain number of lu's.

  • For a 1-pattern rule, there can be no superblanks between patterns, so there are no superblanks to output. This is the simple case.
  • For a 2-pattern rule, there is exactly one superblank between patterns. Now we have to run the rule, and look at the output before printing it.
    • If output contains zero or one chunks, put the superblank after the output.
    • If output contains two or more chunks, put the superblank after the first chunk.
  • Generalising this, look at the output, and interleave chunks and superblanks, that is:
    • Read the first chunk, print that chunk, print the first superblank
    • Read the second chunk, print that chunk, print the second superblank
    • Etc. until all chunks are read, print remaining superblanks.
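
The interleaving described in the list above can be sketched in a few lines of Python. This is only an illustration of the idea, not how apertium-transfer is implemented: the function name `interleave` and the representation of chunks and superblanks as plain strings are assumptions made for the example. The list above also does not say what should go between chunks once the superblanks run out; this sketch emits nothing there, which is a further assumption.

```python
from itertools import zip_longest


def interleave(chunks, superblanks):
    """Interleave output chunks with the superblanks seen in the input.

    chunks      -- chunks produced by the rule's action part (hypothetical: strings)
    superblanks -- superblanks found between the matched lu's (hypothetical: strings)

    Each chunk is followed by the next unused superblank. Superblanks left
    over after the last chunk are appended at the end, so none is lost.
    """
    out = []
    for chunk, blank in zip_longest(chunks, superblanks):
        if chunk is not None:
            out.append(chunk)
        if blank is not None:
            out.append(blank)
    return out


# 2-pattern rule (one superblank) whose action outputs a single chunk:
# the superblank ends up after the output.
one_chunk = interleave(["^NP{...}$"], ["[<b>]"])

# 3-pattern rule (two superblanks) whose action outputs two chunks:
# each chunk is followed by one superblank.
two_chunks = interleave(["^NP{...}$", "^VP{...}$"], ["[<b>]", "[</b>]"])
```

With a 1-pattern rule the `superblanks` list is empty and the chunks pass through unchanged, which matches the simple case above.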

This can be made backwards compatible with existing rule files, by simply ignoring any existing <b> elements that have the pos attribute.

However, this solution does not help with the blanks-in-chunks problem, whereas the one proposed in Reordering superblanks#Possible solution would.