Difference between revisions of "Inconditional section"
Revision as of 10:37, 15 August 2013
== inconditional ==
An inconditional ('unconditional') section of a dictionary typically contains punctuation and similar items, such as numbers.
The main section of a dictionary works on a longest-match basis.
Inconditional means 'if you see this, stop processing immediately and start reading a new word': the token ends at the end of a possible transduction, without waiting for a space or other word boundary.
You could say that the "only" difference from the main section is that a space is not required to start a new match.
<pre>
$ echo 23men | apertium -d . en-it-anmor
^23/23<num>$^men/man<n><pl>$^./.<sent>$
</pre>
It doesn't need the space between 23 and men because numbers are in an 'inconditional' section.
<pre>
<dictionary>
  <alphabet>ab</alphabet>
  <sdefs>
    <sdef n="aa"/>
    <sdef n="ab"/>
  </sdefs>
  <section id="foo" type="inconditional">
    <e><p><l>a</l><r>a<s n="aa"/></r></p></e>
    <e><p><l>aa</l><r>aa<s n="aa"/></r></p></e>
  </section>
</dictionary>
</pre>

<pre>
$ echo aaa | lt-proc sample.bin
^aa/aa<aa>$^a/a<aa>$

$ echo aaaa | lt-proc sample.bin
^aa/aa<aa>$^aa/aa<aa>$

$ echo aaaaa | lt-proc sample.bin
^aa/aa<aa>$^aa/aa<aa>$^a/a<aa>$
</pre>
== postblank / preblank ==
Note that postblank and preblank sections work exactly like inconditional with respect to how they tokenise the input. The only difference is that anything in a postblank section will make lt-proc output a space after the token (in preblank, before the token). So if "☃" is in postblank (tagged sent), and "foo" and "bar" are in a regular section (tagged n), then we get:
<pre>
$ echo 'foo☃bar' | lt-proc analyser.bin
^foo/foo<n>$^☃/☃<sent>$ ^bar/bar<n>$
</pre>
If "☃" were in preblank, we'd get:
<pre>
$ echo 'foo☃bar' | lt-proc analyser.bin
^foo/foo<n>$ ^☃/☃<sent>$^bar/bar<n>$
</pre>
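For reference, a dictionary along the following lines could sit behind the two examples above. This is only a sketch: the section ids ("main", "punct"), the alphabet, and the exact entries are assumptions made for illustration. Only the section types, the tags n and sent, and the three tokens come from the text.

<pre>
<dictionary>
  <!-- Assumed alphabet: just the letters needed for "foo" and "bar" -->
  <alphabet>abfor</alphabet>
  <sdefs>
    <sdef n="n"/>
    <sdef n="sent"/>
  </sdefs>
  <!-- Regular section; the id "main" is hypothetical -->
  <section id="main" type="standard">
    <e><p><l>foo</l><r>foo<s n="n"/></r></p></e>
    <e><p><l>bar</l><r>bar<s n="n"/></r></p></e>
  </section>
  <!-- The id "punct" is hypothetical; use type="preblank" for the second example -->
  <section id="punct" type="postblank">
    <e><p><l>☃</l><r>☃<s n="sent"/></r></p></e>
  </section>
</dictionary>
</pre>

Changing only the type attribute of the second section from "postblank" to "preblank" would correspond to the move from the first output to the second.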
=== Why is this useful? ===
TODO