Difference between revisions of "User:Khannatanmai/New Apertium stream format"

Here I will provide updates about the development of the new Apertium stream format, which will include an arbitrary amount of optional secondary information.

All discussions on IRC about this can be found in the discussion page of this wiki.


== Apertium stream at each module ==
Formalism:

Deformatter
<pre>
Los perros del chico corren rápido..[][
]
</pre>
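The bracketed pieces at the end of the deformatter output are superblanks: formatting material that travels with the text but is not translated. A minimal sketch of how they could be separated from the translatable text is given below. The function name `split_superblanks` is made up for illustration, and the regular expression assumes that no escaped brackets occur in the input.

```python
import re

# Assumption: a superblank is anything between "[" and the next "]",
# and the text contains no escaped brackets such as "\[".
SUPERBLANK = re.compile(r"\[[^\]]*\]")


def split_superblanks(stream):
    """Return the translatable text and the list of superblanks."""
    blanks = SUPERBLANK.findall(stream)
    text = SUPERBLANK.sub("", stream)
    return text, blanks


text, blanks = split_superblanks("Los perros del chico corren rápido..[][\n]")
```

For the example stream this leaves the sentence (with its doubled full stop) in `text` and two superblanks in `blanks`, the second of which holds the newline.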

Morph Analyser
<pre>
^Los/El<det><def><m><pl>/Prpers<prn><pro><p3><m><pl>$ ^perros/perro<n><m><pl>$ ^del/de<pr>+el<det><def><m><sg>$ ^chico/chico<n><m><sg>$ ^corren/correr<vblex><pri><p3><pl>$ ^rápido/rápido<adj><m><sg>$^./.<sent>$^./.<sent>$[][
]
</pre>
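Each lexical unit in the analyser output has the shape `^surface/analysis1/analysis2$`. The following sketch shows one way to read that shape into Python tuples. The helper name `parse_analyses` is hypothetical, and the code assumes that `^`, `/` and `$` never appear escaped inside a unit.

```python
import re

# Assumption: no escaped "^", "/" or "$" inside a lexical unit.
LU = re.compile(r"\^(.*?)\$")


def parse_analyses(stream):
    """Return a list of (surface form, [analyses]) pairs."""
    units = []
    for body in LU.findall(stream):
        surface, *analyses = body.split("/")
        units.append((surface, analyses))
    return units


units = parse_analyses(
    "^Los/El<det><def><m><pl>/Prpers<prn><pro><p3><m><pl>$ ^perros/perro<n><m><pl>$"
)
```

Here `Los` comes back with two analyses (determiner and pronoun), which is the ambiguity the POS tagger resolves in the next step.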

POS Tagger
<pre>
^El<det><def><m><pl>$ ^perro<n><m><pl>$ ^de<pr>+el<det><def><m><sg>$ ^chico<n><m><sg>$ ^correr<vblex><pri><p3><pl>$ ^rápido<adj><m><sg>$^.<sent>$^.<sent>$[][
]
</pre>
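Comparing the two streams above, the tagger drops the surface form and keeps exactly one analysis per unit. The sketch below reproduces only that change of shape. Taking the first analysis is a placeholder of mine and not how the real tagger decides; the name `keep_first_analysis` is hypothetical.

```python
import re

# Assumption: no escaped "^", "/" or "$" inside a lexical unit.
LU = re.compile(r"\^(.*?)\$")


def keep_first_analysis(stream):
    """Replace each ^surface/a1/a2$ by ^a1$ (placeholder disambiguation)."""
    def choose(match):
        surface, *analyses = match.group(1).split("/")
        if not analyses:
            return match.group(0)  # nothing to choose from, leave untouched
        return "^" + analyses[0] + "$"
    return LU.sub(choose, stream)


tagged = keep_first_analysis(
    "^Los/El<det><def><m><pl>/Prpers<prn><pro><p3><m><pl>$ ^perros/perro<n><m><pl>$"
)
```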

Pretransfer
<pre>
^El<det><def><m><pl>$ ^perro<n><m><pl>$ ^de<pr>$ ^el<det><def><m><sg>$ ^chico<n><m><sg>$ ^correr<vblex><pri><p3><pl>$ ^rápido<adj><m><sg>$^.<sent>$^.<sent>$[][
]
</pre>
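The only change at this step is that `^de<pr>+el<det><def><m><sg>$` becomes two separate units. A rough sketch of that split follows. It handles only the `+` join seen in the example and ignores anything else the real module may do; the function name `pretransfer` is just a label for this sketch.

```python
import re

# Assumption: no escaped "^", "+" or "$" inside a lexical unit.
LU = re.compile(r"\^(.*?)\$")


def pretransfer(stream):
    """Split units joined with "+" into separate, space-separated units."""
    def split_unit(match):
        parts = match.group(1).split("+")
        return " ".join("^" + part + "$" for part in parts)
    return LU.sub(split_unit, stream)


out = pretransfer("^de<pr>+el<det><def><m><sg>$ ^chico<n><m><sg>$")
```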

Bidix Lookup
<pre>
^El<det><def><m><pl>/The<det><def><m><pl>$ ^perro<n><m><pl>/dog<n><m><pl>$ ^de<pr>/of<pr>/from<pr>$ ^el<det><def><m><sg>/the<det><def><m><sg>$ ^chico<n><m><sg>/boy<n><sg>$ ^correr<vblex><pri><p3><pl>/run<vblex><pri><p3><pl>$ ^rápido<adj><m><sg>/fast<adj><sint><m><sg>$^.<sent>/.<sent>$^.<sent>/.<sent>$[][
]
</pre>
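After bidix lookup every unit reads `^source/target1/target2$`. The sketch below imitates this with a two-entry toy dictionary. `TOY_BIDIX`, `lookup` and `bidix_lookup` are invented for illustration. Copying the remaining tags unchanged to the target side is a simplification (the real dictionary drops the gender on `boy`, for example), and unknown units are simply left as they are.

```python
import re

# Assumption: no escaped "^", "/" or "$" inside a lexical unit.
LU = re.compile(r"\^(.*?)\$")

# Hypothetical toy dictionary: source lemma+POS -> target lemma+POS options.
TOY_BIDIX = {
    "perro<n>": ["dog<n>"],
    "de<pr>": ["of<pr>", "from<pr>"],
}


def lookup(unit):
    """Append the translations of one unit, carrying over the remaining tags."""
    for source, targets in TOY_BIDIX.items():
        if unit.startswith(source):
            rest = unit[len(source):]
            return "/".join([unit] + [target + rest for target in targets])
    return unit  # not in the toy dictionary: left unchanged in this sketch


def bidix_lookup(stream):
    return LU.sub(lambda m: "^" + lookup(m.group(1)) + "$", stream)


translated = bidix_lookup("^perro<n><m><pl>$ ^de<pr>$")
```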

Lexical Selection
<pre>
^El<det><def><m><pl>/The<det><def><m><pl>$ ^perro<n><m><pl>/dog<n><m><pl>$ ^de<pr>/of<pr>/from<pr>$ ^el<det><def><m><sg>/the<det><def><m><sg>$ ^chico<n><m><sg>/boy<n><sg>$ ^correr<vblex><pri><p3><pl>/run<vblex><pri><p3><pl>$ ^rápido<adj><m><sg>/fast<adj><sint><m><sg>$^.<sent>/.<sent>$^.<sent>/.<sent>$[][
]
</pre>

Anaphora Resolution
<pre>
^El<det><def><m><pl>/The<det><def><m><pl>/$ ^perro<n><m><pl>/dog<n><m><pl>/$ ^de<pr>/of<pr>/from<pr>/$ ^el<det><def><m><sg>/the<det><def><m><sg>/$ ^chico<n><m><sg>/boy<n><sg>/$ ^correr<vblex><pri><p3><pl>/run<vblex><pri><p3><pl>/$ ^rápido<adj><m><sg>/fast<adj><sint><m><sg>/$^.<sent>/.<sent>/$^.<sent>/.<sent>/$[][
]
</pre>
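In the example above the anaphora module appends one more `/`-separated field to every unit, and that field is empty throughout. A small sketch of this step is given below. The `antecedents` mapping (unit index to antecedent string) and the function name are my own assumptions about how a filled field could be supplied.

```python
import re

# Assumption: no escaped "^", "/" or "$" inside a lexical unit.
LU = re.compile(r"\^(.*?)\$")


def add_antecedent_slot(stream, antecedents=None):
    """Append a trailing field to each unit; it stays empty unless given."""
    antecedents = antecedents or {}
    index = -1

    def extend(match):
        nonlocal index
        index += 1
        return "^" + match.group(1) + "/" + antecedents.get(index, "") + "$"

    return LU.sub(extend, stream)


resolved = add_antecedent_slot(
    "^perro<n><m><pl>/dog<n><m><pl>$ ^de<pr>/of<pr>/from<pr>$"
)
```

Note that the new field is recognisable only by being last: for `de`, which still has two translations, a reader has to know that the final field is not a third translation.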

Chunker (t1x)
<pre>
^Det_nom<SN><m><pl>{^the<det><def><3>$ ^dog<n><3>$}$ ^de<PREP>{^of<pr>$}$ ^det_nom<SN><m><sg>{^the<det><def><3>$ ^boy<n><3>$}$ ^verbcj<SV><vblex><pri><p3><pl>{^run<vblex><pres>$}$ ^adj<SA><m><sg>{^fast<adj><sint>$}$^punt<sent>{^.<sent>$}$^punt<sent>{^.<sent>$}$[][
]
</pre>
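A chunk has the shape `^name<tag>...{^word$ ^word$}$`: a pseudo-lemma, the chunk tags, and the target-language units inside braces. The sketch below pulls these three parts apart. `parse_chunks` is a hypothetical helper, and the regular expressions assume unnested chunks without escaped characters.

```python
import re

# Assumptions: chunks are not nested and contain no escaped characters.
CHUNK = re.compile(r"\^([^<{]+)((?:<[^>]+>)*)\{(.*?)\}\$")
TAG = re.compile(r"<([^>]+)>")
LU = re.compile(r"\^(.*?)\$")


def parse_chunks(stream):
    """Return one dict per chunk with its name, chunk tags and inner units."""
    chunks = []
    for name, tags, body in CHUNK.findall(stream):
        chunks.append({
            "name": name,
            "tags": TAG.findall(tags),
            "words": LU.findall(body),
        })
    return chunks


chunks = parse_chunks(
    "^Det_nom<SN><m><pl>{^the<det><def><3>$ ^dog<n><3>$}$ ^de<PREP>{^of<pr>$}$"
)
```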

Interchunk (t2x)
<pre>
^Det_nom<SN><m><pl>{^the<det><def><3>$ ^dog<n><3>$}$ ^de<PREP>{^of<pr>$}$ ^det_nom<SN><m><sg>{^the<det><def><3>$ ^boy<n><3>$}$ ^verbcj<SV><vblex><pri><p3><pl>{^run<vblex><pres>$}$ ^adj<SA><m><sg>{^fast<adj><sint>$}$^punt<sent>{^.<sent>$}$^punt<sent>{^.<sent>$}$[][
]
</pre>

Postchunk (t3x)
<pre>
^The<det><def><pl>$ ^dog<n><pl>$ ^of<pr>$ ^the<det><def><sg>$ ^boy<n><sg>$ ^run<vblex><pres>$ ^fast<adj><sint>$^.<sent>$^.<sent>$[][
]
</pre>
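Comparing the interchunk and postchunk streams, a numbered tag such as `<3>` inside a chunk is replaced by the chunk tag at that position (`<pl>` or `<sg>` here). The sketch below shows only that substitution for a single chunk. It does not copy the capitalisation of the chunk name onto the first word, which the example output also shows, and `resolve_chunk` is a made-up name.

```python
import re

# Assumptions: one unnested chunk per call, no escaped characters.
CHUNK = re.compile(r"\^([^<{]+)((?:<[^>]+>)*)\{(.*?)\}\$")
TAG = re.compile(r"<([^>]+)>")


def resolve_chunk(chunk):
    """Replace <N> inside the chunk by the N-th chunk tag (1-based)."""
    match = CHUNK.fullmatch(chunk)
    if match is None:
        raise ValueError("not a chunk: " + chunk)
    chunk_tags = TAG.findall(match.group(2))

    def substitute(tag_match):
        tag = tag_match.group(1)
        if tag.isdigit():
            return "<" + chunk_tags[int(tag) - 1] + ">"
        return tag_match.group(0)  # ordinary tag, keep as is

    return TAG.sub(substitute, match.group(3))


words = resolve_chunk("^det_nom<SN><m><sg>{^the<det><def><3>$ ^boy<n><3>$}$")
```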

Generator
<pre>
The dogs of the boy run fast..[][
]
</pre>

Post-generator
<pre>
The dogs of the boy run fast..[][
]
</pre>

Reformatter
<pre>
The dogs of the boy run fast.
</pre>

Revision as of 18:53, 10 April 2020
