Inconditional section

From Apertium
Revision as of 11:44, 7 October 2014 by Bech (talk | contribs) (Link to French page)



An inconditional ('unconditional') section of a dictionary typically holds punctuation and similar items, such as numbers.

The main section of a dictionary works on a longest-match basis.

Inconditional means 'if you see this, stop processing immediately and start reading a new word': the match ends as soon as the end of a possible transduction is reached.

Put another way, the "only" difference from a regular section is that no space is required before a new match can start.

$ echo 23men | apertium -d . en-it-anmor

It doesn't need the space between 23 and men because numbers are in an 'inconditional' section.
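As a rough illustration of the point above, a dictionary could place its number entries in an inconditional section along the following lines. This is only a sketch: the actual en-it dictionary is not shown here, and the section id `numbers`, the tag name `num` and the regular expression are assumptions chosen for the example.

```xml
<dictionary>
  <sdefs>
    <!-- hypothetical tag name for numbers -->
    <sdef n="num"/>
  </sdefs>
  <!-- entries in this section do not need a following space to be matched -->
  <section id="numbers" type="inconditional">
    <!-- hypothetical entry: a run of digits is analysed as a number -->
    <e><re>[0-9]+</re><p><l></l><r><s n="num"/></r></p></e>
  </section>
</dictionary>
```

With an entry like this, the digits in `23men` could be matched on their own, leaving `men` to be analysed by the regular section.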

<dictionary>
  <sdefs>
    <sdef n="aa"/>
    <sdef n="ab"/>
  </sdefs>
  <section id="foo" type="inconditional">
    <e><p><l>a</l><r>a<s n="aa"/></r></p></e>
    <e><p><l>aa</l><r>aa<s n="aa"/></r></p></e>
  </section>
</dictionary>

$ echo aaa | lt-proc sample.bin

$ echo aaaa | lt-proc sample.bin

$ echo aaaaa | lt-proc sample.bin

postblank / preblank

The postblank and preblank sections work exactly like inconditional sections in how they tokenise the input. The only difference is that anything in a postblank section makes lt-proc output a space after the token (for preblank, before the token).

So if "☃" is in postblank (tagged sent), and "foo" and "bar" are in a regular section (tagged n), then we get:

$ echo 'foo☃bar' | lt-proc analyser.bin
^foo/foo<n>$^☃/☃<sent>$ ^bar/bar<n>$

If "☃" were in preblank, we'd get:

$ echo 'foo☃bar' | lt-proc analyser.bin
^foo/foo<n>$ ^☃/☃<sent>$^bar/bar<n>$
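A dictionary matching the two examples above might look roughly like the following. It is a minimal sketch rather than a real Apertium dictionary: the section ids `main` and `punct` are made up for illustration, and only the three entries mentioned in the text are included.

```xml
<dictionary>
  <sdefs>
    <sdef n="n"/>
    <sdef n="sent"/>
  </sdefs>
  <!-- regular section holding the two nouns -->
  <section id="main" type="standard">
    <e><p><l>foo</l><r>foo<s n="n"/></r></p></e>
    <e><p><l>bar</l><r>bar<s n="n"/></r></p></e>
  </section>
  <!-- postblank: lt-proc outputs a space after this token -->
  <section id="punct" type="postblank">
    <e><p><l>☃</l><r>☃<s n="sent"/></r></p></e>
  </section>
</dictionary>
```

Changing `type="postblank"` to `type="preblank"` corresponds to the second output, where the space appears before the ☃ token. A file like this would be compiled with `lt-comp lr` before being passed to lt-proc.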

Why is this useful?


See also