Difference between revisions of "User:Firespeaker/HFST bug"
Revision as of 07:35, 17 January 2013
In 2011, a bug in how HFST handles words containing spaces was documented and resolved, but the fix introduced a new bug. This page documents the new behaviour.
test.lexc
Multichar_Symbols %
LEXICON Root
erke:erke # ;
erke% me:erke% me # ;
medvedev:medvedev # ;
Compiling
$ hfst-lexc test.lexc -o test.hfst
$ hfst-invert test.hfst | hfst-fst2fst -w -o test.hfst.ol
Testing
Some correctly analysed forms
$ echo "erke" | hfst-proc test.hfst.ol
^erke/erke$
$ echo "erke me" | hfst-proc test.hfst.ol
^erke me/erke me$
$ echo "medvedev" | hfst-proc test.hfst.ol
^medvedev/medvedev$
The incorrectly analysed form
$ echo "erke medvedev" | hfst-proc test.hfst.ol
^erke medvedev/*erke medvedev$
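In Apertium stream format, an analysis that begins with an asterisk marks the form as unknown, which is what makes the output above wrong. The check can be sketched in shell as below. This is a minimal sketch: `has_unknown` is a hypothetical helper, not part of HFST, and the sample line is copied from the output on this page.

```shell
# Hypothetical helper (not part of HFST): succeeds if the given line of
# Apertium-stream output contains an analysis marked unknown with "*".
has_unknown() {
  case "$1" in
    *'/*'*) return 0 ;;   # a "/" directly followed by "*" was found
    *)      return 1 ;;   # no unknown-marked analysis in this line
  esac
}

# Sample line copied from the incorrectly analysed form above.
if has_unknown '^erke medvedev/*erke medvedev$'; then
  echo "unknown form found"
fi
```

With the HFST tools installed, the argument could instead be the live output, for example `has_unknown "$(echo "erke medvedev" | hfst-proc test.hfst.ol)"`.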
Expected output
The same input is analysed as expected by a second transducer, test2.hfst.ol, compiled without the "erke me" form in it:
$ echo "erke medvedev" | hfst-proc test2.hfst.ol
^erke/erke$ ^medvedev/medvedev$