Difference between revisions of "Apertium system architecture"
Jump to navigation
Jump to search
Popcorndude (talk | contribs) (→Example translation at each stage: add anaphora) |
Firespeaker (talk | contribs) |
||
| Line 1: | Line 1: | ||
The Apertium RBMT system uses a "pipeline", with individual stages performing specific tasks before that stage's output is passed along to the next stage. The page provides an [[overview of the pipeline|#The pipeline]], information about [[how the data in each stage is stored|The stages]], and an [[example of data being passed through the pipeline|#Example translation at each stage]] to create an output translation. |
|||
== The pipeline == |
== The pipeline == |
||
[[File:Apertium_system_architecture.png|1200px]] |
[[File:Apertium_system_architecture.png|1200px]] |
||
Revision as of 17:53, 11 June 2020
The Apertium RBMT system uses a "pipeline", with individual stages performing specific tasks before that stage's output is passed along to the next stage. The page provides an #The pipeline, information about The stages, and an #Example translation at each stage to create an output translation.
The pipeline
The stages
Linguistic data
| stage | introduced | filenames | mode | documentation | |
|---|---|---|---|---|---|
| morphological tagger | 2004 | — | xxx-yyy-tagger, xxx-tagger
|
— | |
| morphological analysis | 2004 | apertium-xxx.xxx.lexc andapertium-xxx.xxx.twol andapertium-xxx.xxx.twoc,OR apertium-xxx.xxx.dix
|
xxx-yyy-morph, xxx-morph
|
||
| morphological disambiguation | 2004, 2008 | apertium-xxx.xxx.rlx
|
xxx-yyy-disam, xxx-disam
|
Constraint Grammar | |
| discontiguous multiword assembly (optional) | 2017 | apertium-xxx-yyy.xxx-yyy.lsx
|
xxx-yyy-autoseq
|
Apertium separable | |
| lexical transfer | 2004 | apertium-xxx-yyy.xxx-yyy.dix
|
xxx-yyy-biltrans
|
Bilingual dictionary | |
| lexical selection | 2012 | apertium-xxx-yyy.xxx-yyy.lrx
|
xxx-yyy-lex
|
Lexical selection | |
| anaphora resolution (optional) | 2019, in progress | apertium-xxx-yyy.xxx-yyy.arx
|
xxx-yyy-anaphora
|
Anaphora Resolution Module | |
| pre-transfer | — | xxx-yyy-pretransfer
|
— | ||
| shallow structural transfer | chunker | 2006 | apertium-xxx-yyy.xxx-yyy.t1x
|
xxx-yyy-chunker
|
Contributing to an existing pair#Adding structural transfer (grammar) rules |
| interchunk | 2006 | apertium-xxx-yyy.xxx-yyy.t2x
|
xxx-yyy-interchunk
| ||
| postchunk | 2006 | apertium-xxx-yyy.xxx-yyy.t3x
|
xxx-yyy-postchunk
| ||
| recursive structural transfer | 2019, in progress | apertium-xxx-yyy.xxx-yyy.rtx
|
xxx-yyy-rectransfer
|
Apertium-recursive | |
| discontiguous multiword disassembly (optional) | 2017 | apertium-xxx-yyy.yyy-xxx.lsx
|
xxx-yyy-revautoseq
|
Apertium separable | |
| morphological generation | 2004 | apertium-yyy.yyy.lexc andapertium-yyy.yyy.twol andapertium-yyy.yyy.twoc,OR apertium-yyy.yyy.dix
|
xxx-yyy-dgen or xxx-yyy-gener or xxx-yyy-generador
|
||
| post-generation | 2004 | apertium-yyy.post-yyy.dix
|
xxx-yyy-pgen
|
Post-generator | |
Apertium-internal
deformatter, reformatter
Example translation at each stage
Note that this example includes what all modules would do, even though the English-Spanish pair does not currently use all of them.
Input text
John said he took the big plant out to the yard.
Morphological analyzer
^John/John<np><ant><m><sg>$ ^said/say<vblex><past>/say<vblex><pp>$ ^he/prpers<prn><subj><p3><m><sg>$ ^took/take<vblex><past>$ ^the/the<det><def><sp>$ ^big/big<adj><sint>$ ^plant/plant<n><sg>/plant<vblex><inf>/plant<vblex><pres>/plant<vblex><imp>$ ^out/out<adv>/out<pr>$ ^to/to<pr>$ ^the/the<det><def><sp>$ ^yard/yard<n><sg>$^./.<sent>$
Morphological disambiguator
^John<np><ant><m><sg>$ ^say<vblex><past>$ ^prpers<prn><subj><p3><m><sg>$ ^take<vblex><past>$ ^the<det><def><sp>$ ^big<adj><sint>$ ^plant<n><sg>$ ^out<adv>$ ^to<pr>$ ^the<det><def><sp>$ ^yard<n><sg>$^./.<sent>$
Discontiguous multiword processing
^John<np><ant><m><sg>$ ^say<vblex><past>$ ^prpers<prn><subj><p3><m><sg>$ ^take# out<vblex><past>$ ^the<det><def><sp>$ ^big<adj><sint>$ ^plant<n><sg>$^./.<sent>$
Lexical transfer
^John<np><ant><m><sg>/John<np><ant><m><sg>$ ^say<vblex><past>/decir<vblex><past>$ ^prpers<prn><subj><p3><m><sg>/prpers<prn><tn><p3><m><sg>$ ^take# out<vblex><past>/sacar<vblex><past>$ ^the<det><def><sp>/el<det><def><GD><ND>$ ^big<adj><sint>/grande<adj><mf>$ ^plant<n><sg>/planta<n><f><sg>/fábrica<n><f><sg>/maquinaria<n><f><sg>$ ^to<pr>/a<pr>$ ^the<det><def><sp>/el<det><def><GD><ND>$ ^yard<n><sg>/patio<n><m><sg>$^.<sent>/.<sent>$
Lexical selection
^John<np><ant><m><sg>/John<np><ant><m><sg>$ ^say<vblex><past>/decir<vblex><past>$ ^prpers<prn><subj><p3><m><sg>/prpers<prn><tn><p3><m><sg>$ ^take# out<vblex><past>/sacar<vblex><past>$ ^the<det><def><sp>/el<det><def><GD><ND>$ ^big<adj><sint>/grande<adj><mf>$ ^plant<n><sg>/planta<n><f><sg>$ ^to<pr>/a<pr>$ ^the<det><def><sp>/el<det><def><GD><ND>$ ^yard<n><sg>/patio<n><m><sg>$^.<sent>/.<sent>$
Anaphora resolution
^John<np><ant><m><sg>/John<np><ant><m><sg>/$ ^say<vblex><past>/decir<vblex><past>/$ ^prpers<prn><subj><p3><m><sg>/prpers<prn><tn><p3><m><sg>/John<np><ant><m><sg>$ ^take# out<vblex><past>/sacar<vblex><past>/$ ^the<det><def><sp>/el<det><def><GD><ND>/$ ^big<adj><sint>/grande<adj><mf>/$ ^plant<n><sg>/planta<n><f><sg>/$ ^to<pr>/a<pr>/$ ^the<det><def><sp>/el<det><def><GD><ND>/$ ^yard<n><sg>/patio<n><m><sg>/$^.<sent>/.<sent>/$
Structural transfer
^John<np><ant><m><sg>$ ^decir<vblex><ifi><p3><sg>$ ^que<cnjsub>$ ^sacar<vblex><ifi><p3><sg>$ ^el<det><def><f><sg>$ ^planta<n><f><sg>$ ^grande<adj><mf><sg>$ ^a<pr>$ ^el<det><def><m><sg>$ ^patio<n><m><sg>$^.<sent>$
Morphological generator
John dijo que sacó la planta grande ~a el patio.
Post-generator
John dijo que sacó la planta grande al patio.