Difference between revisions of "Apertium system architecture"

(16 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 
== The pipeline ==
 
== The pipeline ==
[[File:Apertium_system_architecture.png|1000px]]
+
[[File:Apertium_system_architecture.png|1200px]]
   
 
== The stages ==
 
== The stages ==
Line 14: Line 14:
 
!colspan="2"| morphological tagger
 
!colspan="2"| morphological tagger
 
| 2004
 
| 2004
|
+
|
 
| <code>xxx-yyy-tagger</code>, <code>xxx-tagger</code>
 
| <code>xxx-yyy-tagger</code>, <code>xxx-tagger</code>
  +
| —
|
 
 
|-
 
|-
 
!colspan="2"| morphological analysis
 
!colspan="2"| morphological analysis
Line 30: Line 30:
 
| [[Constraint Grammar]]
 
| [[Constraint Grammar]]
 
|-
 
|-
!colspan="2"| discontiguous multiword processing
+
!colspan="2"| discontiguous multiword assembly (optional)
| 2017, in progress
+
| 2017
 
| <code>apertium-xxx-yyy.xxx-yyy.lsx</code>
 
| <code>apertium-xxx-yyy.xxx-yyy.lsx</code>
 
| <code>xxx-yyy-autoseq</code>
 
| <code>xxx-yyy-autoseq</code>
Line 48: Line 48:
 
| [[Lexical selection]]
 
| [[Lexical selection]]
 
|-
 
|-
  +
!colspan="2"| anaphora resolution (optional)
!rowspan="3"| structural transfer
 
  +
| 2019, in progress
  +
| <code>apertium-xxx-yyy.xxx-yyy.arx</code>
  +
| <code>xxx-yyy-anaphora</code>
  +
| [[Anaphora Resolution Module]]
 
|-
  +
!colspan="2"| pre-transfer
 
|
  +
| —
  +
| <code>xxx-yyy-pretransfer</code>
  +
| —
 
|-
 
!rowspan="3"| shallow structural transfer
 
! chunker
 
! chunker
  +
| 2006
|
 
 
| <code>apertium-xxx-yyy.xxx-yyy.t1x</code>
 
| <code>apertium-xxx-yyy.xxx-yyy.t1x</code>
  +
| <code>xxx-yyy-chunker</code>
|
 
  +
|rowspan="3" | [[Contributing to an existing pair#Adding structural transfer (grammar) rules]]
|
 
 
|-
 
|-
 
! interchunk
 
! interchunk
  +
| 2006
|
 
 
| <code>apertium-xxx-yyy.xxx-yyy.t2x</code>
 
| <code>apertium-xxx-yyy.xxx-yyy.t2x</code>
  +
| <code>xxx-yyy-interchunk</code>
|
 
|
 
 
|-
 
|-
 
! postchunk
 
! postchunk
  +
| 2006
|
 
 
| <code>apertium-xxx-yyy.xxx-yyy.t3x</code>
 
| <code>apertium-xxx-yyy.xxx-yyy.t3x</code>
  +
| <code>xxx-yyy-postchunk</code>
|
 
|
 
 
|-
 
|-
!colspan="2"| reverse discontiguous multiword processing
+
!colspan="2"| recursive structural transfer
| 2017, in progress
+
| 2019, in progress
  +
| <code>apertium-xxx-yyy.xxx-yyy.rtx</code>
  +
| <code>xxx-yyy-rectransfer</code>
  +
| [[Apertium-recursive]]
 
|-
  +
!colspan="2"| discontiguous multiword disassembly (optional)
  +
| 2017
 
| <code>apertium-xxx-yyy.yyy-xxx.lsx</code>
 
| <code>apertium-xxx-yyy.yyy-xxx.lsx</code>
 
| <code>xxx-yyy-revautoseq</code>
 
| <code>xxx-yyy-revautoseq</code>
Line 76: Line 92:
 
| 2004
 
| 2004
 
| <code>apertium-yyy.yyy.lexc</code> and<br /><code>apertium-yyy.yyy.twol</code> and<br /><code>apertium-yyy.yyy.twoc</code>,<br />OR <code>apertium-yyy.yyy.dix</code>
 
| <code>apertium-yyy.yyy.lexc</code> and<br /><code>apertium-yyy.yyy.twol</code> and<br /><code>apertium-yyy.yyy.twoc</code>,<br />OR <code>apertium-yyy.yyy.dix</code>
  +
| <code>xxx-yyy-dgen</code> or <code>xxx-yyy-gener</code> or <code>xxx-yyy-generador</code>
|
 
 
|
 
|
 
|-
 
|-
Line 82: Line 98:
 
| 2004
 
| 2004
 
| <code>apertium-yyy.post-yyy.dix</code>
 
| <code>apertium-yyy.post-yyy.dix</code>
  +
| <code>xxx-yyy-pgen</code>
|
 
  +
| [[Post-generator]]
|
 
 
|}
 
|}
   
Line 89: Line 105:
 
deformatter, reformatter
 
deformatter, reformatter
   
=== Example translation at each stage ===
+
== Example translation at each stage ==
  +
  +
This example shows the output of every module, even though the English-Spanish pair does not currently use all of them.
  +
  +
=== Input text ===
  +
John said he took the big plant out to the yard.
  +
  +
=== Morphological analyzer ===
  +
  +
^John/John<np><ant><m><sg>$ ^said/say<vblex><past>/say<vblex><pp>$ ^he/prpers<prn><subj><p3><m><sg>$ ^took/take<vblex><past>$ ^the/the<det><def><sp>$ ^big/big<adj><sint>$ ^plant/plant<n><sg>/plant<vblex><inf>/plant<vblex><pres>/plant<vblex><imp>$ ^out/out<adv>/out<pr>$ ^to/to<pr>$ ^the/the<det><def><sp>$ ^yard/yard<n><sg>$^./.<sent>$
  +
  +
=== Morphological disambiguator ===
  +
^John<np><ant><m><sg>$ ^say<vblex><past>$ ^prpers<prn><subj><p3><m><sg>$ ^take<vblex><past>$ ^the<det><def><sp>$ ^big<adj><sint>$ ^plant<n><sg>$ ^out<adv>$ ^to<pr>$ ^the<det><def><sp>$ ^yard<n><sg>$^.<sent>$
  +
  +
=== Discontiguous multiword processing ===
  +
^John<np><ant><m><sg>$ ^say<vblex><past>$ ^prpers<prn><subj><p3><m><sg>$ ^take# out<vblex><past>$ ^the<det><def><sp>$ ^big<adj><sint>$ ^plant<n><sg>$ ^to<pr>$ ^the<det><def><sp>$ ^yard<n><sg>$^.<sent>$
  +
  +
=== Lexical transfer ===
  +
^John<np><ant><m><sg>/John<np><ant><m><sg>$ ^say<vblex><past>/decir<vblex><past>$ ^prpers<prn><subj><p3><m><sg>/prpers<prn><tn><p3><m><sg>$ ^take# out<vblex><past>/sacar<vblex><past>$ ^the<det><def><sp>/el<det><def><GD><ND>$ ^big<adj><sint>/grande<adj><mf>$ ^plant<n><sg>/planta<n><f><sg>/fábrica<n><f><sg>/maquinaria<n><f><sg>$ ^to<pr>/a<pr>$ ^the<det><def><sp>/el<det><def><GD><ND>$ ^yard<n><sg>/patio<n><m><sg>$^.<sent>/.<sent>$
  +
  +
=== Lexical selection ===
  +
^John<np><ant><m><sg>/John<np><ant><m><sg>$ ^say<vblex><past>/decir<vblex><past>$ ^prpers<prn><subj><p3><m><sg>/prpers<prn><tn><p3><m><sg>$ ^take# out<vblex><past>/sacar<vblex><past>$ ^the<det><def><sp>/el<det><def><GD><ND>$ ^big<adj><sint>/grande<adj><mf>$ ^plant<n><sg>/planta<n><f><sg>$ ^to<pr>/a<pr>$ ^the<det><def><sp>/el<det><def><GD><ND>$ ^yard<n><sg>/patio<n><m><sg>$^.<sent>/.<sent>$
  +
  +
=== Anaphora resolution ===
  +
  +
^John<np><ant><m><sg>/John<np><ant><m><sg>/$ ^say<vblex><past>/decir<vblex><past>/$ ^prpers<prn><subj><p3><m><sg>/prpers<prn><tn><p3><m><sg>/John<np><ant><m><sg>$ ^take# out<vblex><past>/sacar<vblex><past>/$ ^the<det><def><sp>/el<det><def><GD><ND>/$ ^big<adj><sint>/grande<adj><mf>/$ ^plant<n><sg>/planta<n><f><sg>/$ ^to<pr>/a<pr>/$ ^the<det><def><sp>/el<det><def><GD><ND>/$ ^yard<n><sg>/patio<n><m><sg>/$^.<sent>/.<sent>/$
  +
  +
=== Structural transfer ===
  +
^John<np><ant><m><sg>$ ^decir<vblex><ifi><p3><sg>$ ^que<cnjsub>$ ^sacar<vblex><ifi><p3><sg>$ ^el<det><def><f><sg>$ ^planta<n><f><sg>$ ^grande<adj><mf><sg>$ ^a<pr>$ ^el<det><def><m><sg>$ ^patio<n><m><sg>$^.<sent>$
  +
  +
=== Morphological generator ===
  +
John dijo que sacó la planta grande ~a el patio.
  +
  +
=== Post-generator ===
  +
  +
John dijo que sacó la planta grande al patio.
   
 
== See also ==
 
== See also ==

Revision as of 23:00, 18 May 2020

The pipeline

Apertium system architecture.png

The stages

Linguistic data

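{|
!colspan="2"| stage
! introduced
! filenames
! mode
! documentation
|-
!colspan="2"| morphological tagger
| 2004
|
| <code>xxx-yyy-tagger</code>, <code>xxx-tagger</code>
|
|-
!colspan="2"| morphological analysis
| 2004
| <code>apertium-xxx.xxx.lexc</code> and<br /><code>apertium-xxx.xxx.twol</code> and<br /><code>apertium-xxx.xxx.twoc</code>,<br />OR <code>apertium-xxx.xxx.dix</code>
| <code>xxx-yyy-morph</code>, <code>xxx-morph</code>
|
|-
!colspan="2"| morphological disambiguation
| 2004, 2008
| <code>apertium-xxx.xxx.rlx</code>
| <code>xxx-yyy-disam</code>, <code>xxx-disam</code>
| [[Constraint Grammar]]
|-
!colspan="2"| discontiguous multiword assembly (optional)
| 2017
| <code>apertium-xxx-yyy.xxx-yyy.lsx</code>
| <code>xxx-yyy-autoseq</code>
| [[Apertium separable]]
|-
!colspan="2"| lexical transfer
| 2004
| <code>apertium-xxx-yyy.xxx-yyy.dix</code>
| <code>xxx-yyy-biltrans</code>
| [[Bilingual dictionary]]
|-
!colspan="2"| lexical selection
| 2012
| <code>apertium-xxx-yyy.xxx-yyy.lrx</code>
| <code>xxx-yyy-lex</code>
| [[Lexical selection]]
|-
!colspan="2"| anaphora resolution (optional)
| 2019, in progress
| <code>apertium-xxx-yyy.xxx-yyy.arx</code>
| <code>xxx-yyy-anaphora</code>
| [[Anaphora Resolution Module]]
|-
!colspan="2"| pre-transfer
|
|
| <code>xxx-yyy-pretransfer</code>
|
|-
!rowspan="3"| shallow structural transfer
! chunker
| 2006
| <code>apertium-xxx-yyy.xxx-yyy.t1x</code>
| <code>xxx-yyy-chunker</code>
|rowspan="3" | [[Contributing to an existing pair#Adding structural transfer (grammar) rules]]
|-
! interchunk
| 2006
| <code>apertium-xxx-yyy.xxx-yyy.t2x</code>
| <code>xxx-yyy-interchunk</code>
|-
! postchunk
| 2006
| <code>apertium-xxx-yyy.xxx-yyy.t3x</code>
| <code>xxx-yyy-postchunk</code>
|-
!colspan="2"| recursive structural transfer
| 2019, in progress
| <code>apertium-xxx-yyy.xxx-yyy.rtx</code>
| <code>xxx-yyy-rectransfer</code>
| [[Apertium-recursive]]
|-
!colspan="2"| discontiguous multiword disassembly (optional)
| 2017
| <code>apertium-xxx-yyy.yyy-xxx.lsx</code>
| <code>xxx-yyy-revautoseq</code>
| [[Apertium separable]]
|-
!colspan="2"| morphological generation
| 2004
| <code>apertium-yyy.yyy.lexc</code> and<br /><code>apertium-yyy.yyy.twol</code> and<br /><code>apertium-yyy.yyy.twoc</code>,<br />OR <code>apertium-yyy.yyy.dix</code>
| <code>xxx-yyy-dgen</code> or <code>xxx-yyy-gener</code> or <code>xxx-yyy-generador</code>
|
|-
!colspan="2"| post-generation
| 2004
| <code>apertium-yyy.post-yyy.dix</code>
| <code>xxx-yyy-pgen</code>
| [[Post-generator]]
|}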
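
The mode names in the table follow a fixed <code>xxx-yyy-suffix</code> pattern, so the per-stage debug modes for a pair can be listed mechanically. The following is a minimal sketch, not part of Apertium itself: the function name <code>mode_names</code> and the codes <code>eng</code> and <code>spa</code> are illustrative assumptions, and a real pair only ships modes for the stages it actually uses.

```python
# Mode suffixes, in the order the stages table lists them.
# For generation only the "dgen" variant is included here.
STAGE_SUFFIXES = [
    "tagger", "morph", "disam", "autoseq", "biltrans", "lex",
    "anaphora", "pretransfer", "chunker", "interchunk", "postchunk",
    "rectransfer", "revautoseq", "dgen", "pgen",
]

def mode_names(source, target):
    """Build xxx-yyy-suffix mode names for a language pair (hypothetical helper)."""
    return [f"{source}-{target}-{suffix}" for suffix in STAGE_SUFFIXES]

for name in mode_names("eng", "spa"):
    print(name)
```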

Apertium-internal

deformatter, reformatter

Example translation at each stage

This example shows the output of every module, even though the English-Spanish pair does not currently use all of them.

Input text

John said he took the big plant out to the yard.

Morphological analyzer

^John/John<np><ant><m><sg>$ ^said/say<vblex><past>/say<vblex><pp>$ ^he/prpers<prn><subj><p3><m><sg>$ ^took/take<vblex><past>$ ^the/the<det><def><sp>$ ^big/big<adj><sint>$ ^plant/plant<n><sg>/plant<vblex><inf>/plant<vblex><pres>/plant<vblex><imp>$ ^out/out<adv>/out<pr>$ ^to/to<pr>$ ^the/the<det><def><sp>$ ^yard/yard<n><sg>$^./.<sent>$
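
The analyzer output above wraps each token in <code>^...$</code> and separates the surface form from its candidate analyses with <code>/</code>. As a rough illustration of that layout, the sketch below splits such a line into surface forms, lemmas and tags. The name <code>parse_analyses</code> is an assumption made for this example, and escaped characters are deliberately not handled.

```python
import re

# A lexical unit is delimited by ^ and $; fields inside are separated by "/".
# Simplification: backslash-escaped characters are not handled.
LEXICAL_UNIT = re.compile(r"\^(.*?)\$")
TAG = re.compile(r"<([^>]+)>")

def parse_analyses(stream):
    """Return a list of (surface, [(lemma, [tags]), ...]) tuples."""
    units = []
    for body in LEXICAL_UNIT.findall(stream):
        surface, *readings = body.split("/")
        parsed = []
        for reading in readings:
            lemma = reading.split("<", 1)[0]  # text before the first tag
            parsed.append((lemma, TAG.findall(reading)))
        units.append((surface, parsed))
    return units

sample = ("^said/say<vblex><past>/say<vblex><pp>$ "
          "^plant/plant<n><sg>/plant<vblex><inf>/plant<vblex><pres>/plant<vblex><imp>$")
for surface, readings in parse_analyses(sample):
    print(surface, len(readings), readings)
```

Counting the readings per unit shows where the disambiguator has work to do: "said" has two candidate analyses and "plant" has four.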

Morphological disambiguator

^John<np><ant><m><sg>$ ^say<vblex><past>$ ^prpers<prn><subj><p3><m><sg>$ ^take<vblex><past>$ ^the<det><def><sp>$ ^big<adj><sint>$ ^plant<n><sg>$ ^out<adv>$ ^to<pr>$ ^the<det><def><sp>$ ^yard<n><sg>$^.<sent>$

Discontiguous multiword processing

^John<np><ant><m><sg>$ ^say<vblex><past>$ ^prpers<prn><subj><p3><m><sg>$ ^take# out<vblex><past>$ ^the<det><def><sp>$ ^big<adj><sint>$ ^plant<n><sg>$ ^to<pr>$ ^the<det><def><sp>$ ^yard<n><sg>$^.<sent>$
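
At this stage the separated verb and particle ("take ... out") are joined into a single unit, <code>take# out</code>. The toy function below only illustrates that effect on a list of lemma-plus-tags strings. It is not how the module works internally, since the real rules live in the <code>.lsx</code> file, and the name <code>assemble</code> is hypothetical.

```python
def assemble(units, verb_lemma, particle):
    """Merge a verb and a later particle into one multiword unit.

    units: strings such as "take<vblex><past>" (lemma followed by tags).
    Only the first matching verb/particle pair is merged.
    """
    particle_lemma = particle.split("<", 1)[0]
    for i, unit in enumerate(units):
        lemma, _, tags = unit.partition("<")
        if lemma != verb_lemma or not tags.startswith("vblex>"):
            continue
        try:
            j = units.index(particle, i + 1)  # particle must follow the verb
        except ValueError:
            continue
        merged = f"{lemma}# {particle_lemma}<{tags}"
        # Keep everything else in place; drop the particle's own unit.
        return units[:i] + [merged] + units[i + 1:j] + units[j + 1:]
    return list(units)

units = ["take<vblex><past>", "the<det><def><sp>", "big<adj><sint>",
         "plant<n><sg>", "out<adv>", "to<pr>"]
print(" ".join(f"^{u}$" for u in assemble(units, "take", "out<adv>")))
```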

Lexical transfer

^John<np><ant><m><sg>/John<np><ant><m><sg>$ ^say<vblex><past>/decir<vblex><past>$ ^prpers<prn><subj><p3><m><sg>/prpers<prn><tn><p3><m><sg>$ ^take# out<vblex><past>/sacar<vblex><past>$ ^the<det><def><sp>/el<det><def><GD><ND>$ ^big<adj><sint>/grande<adj><mf>$ ^plant<n><sg>/planta<n><f><sg>/fábrica<n><f><sg>/maquinaria<n><f><sg>$ ^to<pr>/a<pr>$ ^the<det><def><sp>/el<det><def><GD><ND>$ ^yard<n><sg>/patio<n><m><sg>$^.<sent>/.<sent>$

Lexical selection

^John<np><ant><m><sg>/John<np><ant><m><sg>$ ^say<vblex><past>/decir<vblex><past>$ ^prpers<prn><subj><p3><m><sg>/prpers<prn><tn><p3><m><sg>$ ^take# out<vblex><past>/sacar<vblex><past>$ ^the<det><def><sp>/el<det><def><GD><ND>$ ^big<adj><sint>/grande<adj><mf>$ ^plant<n><sg>/planta<n><f><sg>$ ^to<pr>/a<pr>$ ^the<det><def><sp>/el<det><def><GD><ND>$ ^yard<n><sg>/patio<n><m><sg>$^.<sent>/.<sent>$
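
Compared with the lexical transfer output, the only change here is that "plant" keeps one of its three candidate translations. A much simplified sketch of that pruning follows. It assumes a hypothetical <code>preferred</code> table and otherwise keeps the first translation, whereas the actual module applies context-sensitive rules from the <code>.lrx</code> file.

```python
import re

UNIT = re.compile(r"\^(.*?)\$")

def select(stream, preferred=None):
    """Keep one target reading per unit: the preferred one if listed, else the first."""
    preferred = preferred or {}

    def pick(match):
        source, *targets = match.group(1).split("/")
        if not targets:  # no translation attached; leave the unit untouched
            return match.group(0)
        choice = preferred.get(source)
        if choice not in targets:
            choice = targets[0]
        return f"^{source}/{choice}$"

    return UNIT.sub(pick, stream)

transfer = ("^plant<n><sg>/planta<n><f><sg>/fábrica<n><f><sg>/maquinaria<n><f><sg>$ "
            "^to<pr>/a<pr>$")
print(select(transfer))
print(select(transfer, {"plant<n><sg>": "fábrica<n><f><sg>"}))
```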

Anaphora resolution

^John<np><ant><m><sg>/John<np><ant><m><sg>/$ ^say<vblex><past>/decir<vblex><past>/$ ^prpers<prn><subj><p3><m><sg>/prpers<prn><tn><p3><m><sg>/John<np><ant><m><sg>$ ^take# out<vblex><past>/sacar<vblex><past>/$ ^the<det><def><sp>/el<det><def><GD><ND>/$ ^big<adj><sint>/grande<adj><mf>/$ ^plant<n><sg>/planta<n><f><sg>/$ ^to<pr>/a<pr>/$ ^the<det><def><sp>/el<det><def><GD><ND>/$ ^yard<n><sg>/patio<n><m><sg>/$^.<sent>/.<sent>/$
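
In the output above every unit gains a third field, which is empty except on the pronoun, where it carries the antecedent <code>John<np><ant><m><sg></code>. Assuming that three-field layout, which is inferred from this example only, the resolved antecedents can be pulled out as follows. The name <code>antecedents</code> is hypothetical.

```python
import re

UNIT = re.compile(r"\^(.*?)\$")

def antecedents(stream):
    """Return (source unit, antecedent) pairs for units with a non-empty third field."""
    pairs = []
    for body in UNIT.findall(stream):
        fields = body.split("/")
        if len(fields) == 3 and fields[2]:
            pairs.append((fields[0], fields[2]))
    return pairs

sample = ("^say<vblex><past>/decir<vblex><past>/$ "
          "^prpers<prn><subj><p3><m><sg>/prpers<prn><tn><p3><m><sg>/John<np><ant><m><sg>$")
print(antecedents(sample))
```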

Structural transfer

^John<np><ant><m><sg>$ ^decir<vblex><ifi><p3><sg>$ ^que<cnjsub>$ ^sacar<vblex><ifi><p3><sg>$ ^el<det><def><f><sg>$ ^planta<n><f><sg>$ ^grande<adj><mf><sg>$ ^a<pr>$ ^el<det><def><m><sg>$ ^patio<n><m><sg>$^.<sent>$

Morphological generator

John dijo que sacó la planta grande ~a el patio.

Post-generator

John dijo que sacó la planta grande al patio.
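
The last step turns <code>~a el</code> into <code>al</code>. The sketch below imitates that with a two-entry lookup table, assuming that <code>~</code> marks the word the post-generator should inspect. The table and the name <code>post_generate</code> are made up for illustration, and the real rules come from <code>apertium-yyy.post-yyy.dix</code>.

```python
import re

# Hypothetical contraction table; a real pair defines these in its post-generation dictionary.
CONTRACTIONS = {"a el": "al", "de el": "del"}

def post_generate(text):
    """Apply contractions to ~-marked sequences, then drop any leftover marks."""
    for source, target in CONTRACTIONS.items():
        # \b stops "~a el" from matching inside "~a ella".
        text = re.sub("~" + re.escape(source) + r"\b", target, text)
    return text.replace("~", "")

print(post_generate("John dijo que sacó la planta grande ~a el patio."))
```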

See also