Difference between revisions of "ATT format"
Revision as of 14:18, 10 March 2014
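ATT format is a tab-separated, four-column text format for describing transducers. Each transition line lists the source state, the target state, the input symbol and the output symbol; a line containing only a state number marks that state as final.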
Both lttoolbox and HFST can read ATT format as input to compile dictionaries (lt-comp, hfst-txt2fst), and print compiled dictionaries to ATT format (lt-print, hfst-fst2txt).
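To make the column layout concrete, here is a minimal Python sketch of a reader for ATT text. It is not part of lttoolbox or HFST; the names `Transition` and `parse_att` are made up for illustration. It assumes tab-separated fields, treats any line with four or more fields as a transition, and ignores an optional trailing weight column.

```python
from typing import NamedTuple


class Transition(NamedTuple):
    """One four-column ATT line (hypothetical helper type)."""
    source: int
    target: int
    input_symbol: str
    output_symbol: str


def parse_att(text):
    """Split ATT text into a list of transitions and a set of final states.

    Assumption: fields are tab-separated, and a fifth (weight) column on a
    transition line, or a second column on a final-state line, is ignored.
    """
    transitions = []
    finals = set()
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) >= 4:
            src, dst, inp, out = fields[:4]
            transitions.append(Transition(int(src), int(dst), inp, out))
        elif len(fields) in (1, 2):
            finals.add(int(fields[0]))
        else:
            raise ValueError(f"unexpected ATT line: {line!r}")
    return transitions, finals
```

Calling `parse_att` on the output of lt-print or hfst-fst2txt should give one `Transition` per arc plus the set of final states, as long as the assumptions above hold for the file.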
Example
Say we want to represent a transducer that maps test to foo.
We can do it as follows, starting from an lttoolbox dictionary:
$ cat test.dix
<dictionary>
  <alphabet>abcdefghijklmnopqrstuvwxyz</alphabet>
  <sdefs>
    <sdef n="n"/>
  </sdefs>
  <section id="main" type="standard">
    <e><p><l>test</l><r>foo</r></p></e>
  </section>
</dictionary>

$ lt-comp lr test.dix test.bin
main@standard 5 4

$ lt-print test.bin
0	1	t	f
1	2	e	o
2	3	s	o
3	4	t	ε
4
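The printed transducer can be checked by hand: following the arcs from state 0 on the input test emits f, o, o and then nothing (ε), ending in the final state 4. The Python sketch below does the same walk mechanically. It is an illustration only, not how lt-proc works; `load` and `translate` are hypothetical names, the start state is assumed to be 0, every symbol is assumed to be a single character, and the transducer is assumed to have no ε-input cycles.

```python
EPSILON = "ε"  # the empty symbol as it appears in the lt-print output above

# The lt-print output from the example, with tab-separated columns.
ATT_TEXT = "0\t1\tt\tf\n1\t2\te\to\n2\t3\ts\to\n3\t4\tt\tε\n4\n"


def load(text):
    """Return (arcs, finals): arcs maps a state to (target, input, output) tuples."""
    arcs = {}
    finals = set()
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 4:
            src, dst, inp, out = fields[:4]
            arcs.setdefault(int(src), []).append((int(dst), inp, out))
        elif fields[0]:
            finals.add(int(fields[0]))
    return arcs, finals


def translate(word, att_text=ATT_TEXT, start=0):
    """Collect every output string the transducer produces for `word`.

    Assumptions: the start state is `start`, symbols are single characters,
    and there are no epsilon-input cycles (the search would not terminate).
    """
    arcs, finals = load(att_text)
    results = []
    stack = [(start, 0, "")]  # (state, position in word, output so far)
    while stack:
        state, pos, output = stack.pop()
        if pos == len(word) and state in finals:
            results.append(output)
        for target, inp, out in arcs.get(state, []):
            emitted = "" if out == EPSILON else out
            if inp == EPSILON:
                stack.append((target, pos, output + emitted))
            elif pos < len(word) and word[pos] == inp:
                stack.append((target, pos + 1, output + emitted))
    return results


print(translate("test"))  # prints ['foo']
```

An input that does not reach the final state, such as tes, yields an empty list.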