Difference between revisions of "LRLM"
Latest revision as of 08:20, 24 February 2023
LRLM is short for Left-to-Right, Longest-Match, the parsing strategy used by lt-proc of lttoolbox in analysis and bilingual modes, as well as by hfst-proc. The input is read from left to right, and at each point the longest sequence that is in the dictionary is matched (like "greedy" matching of regular expressions).
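The left-to-right, longest-match idea can be illustrated with a small sketch. This is not how lt-proc or hfst-proc are implemented (they work with compiled finite-state transducers); the function name `lrlm_segment`, the set-based dictionary, and the single-character fallback for unknown input are all assumptions made for illustration.

```python
def lrlm_segment(text, dictionary):
    """Split text into the longest dictionary entries, scanning left to right.

    Hypothetical helper: `dictionary` is a plain set of strings, standing in
    for a real compiled dictionary.
    """
    max_len = max((len(entry) for entry in dictionary), default=1)
    segments = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first, then shrink until one matches.
        for length in range(min(max_len, len(text) - pos), 0, -1):
            candidate = text[pos:pos + length]
            if candidate in dictionary:
                break
        else:
            # Nothing matched (assumed fallback): emit a single character.
            candidate = text[pos]
        segments.append(candidate)
        pos += len(candidate)
    return segments


toy_dictionary = {"a", "ab", "abc", "cd", "d"}
print(lrlm_segment("abcd", toy_dictionary))  # ['abc', 'd']
```

Note that "ab" followed by "cd" would also cover the whole input, but the longest match at the first position ("abc") is taken and never reconsidered, which is the same behaviour as greedy matching.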
LRLM is also used for structural transfer: if the input is a determiner followed by a noun, and there are rules for "det", "n", and "det n", the "det n" rule will match, because it covers the longest sequence.
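The "det n" example can be sketched in the same way, this time over a sequence of category tags instead of characters. The rule table, the function name `lrlm_match_rules`, and the handling of unmatched tags are hypothetical; real transfer rules are defined in Apertium's rule files and matched by the transfer modules, not by code like this.

```python
def lrlm_match_rules(tags, rules):
    """Match the longest rule pattern at each position, left to right.

    Hypothetical helper: `rules` maps a tuple of category tags to a rule name.
    """
    max_len = max((len(pattern) for pattern in rules), default=1)
    matches = []
    pos = 0
    while pos < len(tags):
        # Longest window first, so "det n" wins over "det" alone.
        for length in range(min(max_len, len(tags) - pos), 0, -1):
            window = tuple(tags[pos:pos + length])
            if window in rules:
                matches.append((rules[window], window))
                pos += length
                break
        else:
            # No rule applies (assumed behaviour): pass the tag through.
            matches.append((None, (tags[pos],)))
            pos += 1
    return matches


toy_rules = {("det",): "det", ("n",): "n", ("det", "n"): "det n"}
print(lrlm_match_rules(["det", "n", "n"], toy_rules))
# [('det n', ('det', 'n')), ('n', ('n',))]
```

The first two tags are consumed by the "det n" rule even though "det" and "n" would each match on their own; the remaining noun is then matched by the "n" rule.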
Another term for longest-match is Maximal Munch.