Difference between revisions of "Apertium stream format"


Revision as of 13:51, 15 April 2008

This page describes the stream format used in the Apertium machine translation platform.

Special characters

  • Asterisk '*' -- Unanalysed word.
  • At sign '@' -- Untranslated lemma.
  • Hash sign '#'
    • In morphological generation -- Unable to generate surface form from lexical unit.
    • In morphological analysis -- Marks the start of the invariable part of a multiword.
  • Plus symbol '+' -- Joins morphemes within a single analysis.
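
The word-level markers in this list can be checked with a few lines of Python. This is only a sketch: the function name and the status strings are made up for illustration, and it assumes the marker is the first character of the affected word in the output text.

```python
# Hypothetical helper; the names and status strings are illustrative only.
MARKERS = {
    "*": "unanalysed word",
    "@": "untranslated lemma",
    "#": "surface form could not be generated",
}


def classify(token: str) -> tuple[str, str]:
    """Return (status, word), where status depends on the leading marker."""
    if token and token[0] in MARKERS:
        return MARKERS[token[0]], token[1:]
    return "ok", token


print(classify("*vino"))  # marker found, word returned without it
print(classify("vino"))   # no marker
```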

Analyses

S = surface form, L = lemma.


^vino/vino<n><m><sg>/venir<vblex><ifi><p3><sg>$

   |    | |________|
   S    L    TAGS
      |____________|
         ANALYSIS

|_____________________________________________|
          AMBIGUOUS LEXICAL UNIT

^vino<n><m><sg>$

|______________|
 DISAMBIGUATED
  LEXICAL UNIT
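
The two layouts above can be taken apart with a short Python sketch. The function name is hypothetical, and two simplifying assumptions are made: the unit contains no escaped characters, and a unit without '/' is a disambiguated one (no surface form).

```python
import re

# One tag is any run of characters between '<' and '>'.
TAG = re.compile(r"<([^<>]+)>")


def parse_lexical_unit(unit: str):
    """Split '^S/L<tags>/L<tags>$' into (surface form, [(lemma, tags), ...]).

    Assumption: a unit without '/' is disambiguated, so its surface form
    is reported as None.
    """
    if not (unit.startswith("^") and unit.endswith("$")):
        raise ValueError("a lexical unit must start with '^' and end with '$'")
    parts = unit[1:-1].split("/")
    if len(parts) == 1:
        surface, analyses = None, parts
    else:
        surface, analyses = parts[0], parts[1:]
    parsed = [(a.split("<", 1)[0], TAG.findall(a)) for a in analyses]
    return surface, parsed


print(parse_lexical_unit("^vino/vino<n><m><sg>/venir<vblex><ifi><p3><sg>$"))
print(parse_lexical_unit("^vino<n><m><sg>$"))
```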

^dímelo/decir<vblex><imp><p2><sg>+me<prn><enc><p1><mf><sg>+lo<prn><enc><p3><nt>/decir<vblex><imp><p2><sg>+me<prn><enc><p1><mf><sg>+lo<prn><enc><p3><m><sg>$

                                 |____________________________________________|
                                                JOINED MORPHEMES
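
Splitting one analysis into its joined morphemes is a matter of cutting at each '+'. A minimal Python sketch, with a hypothetical function name and assuming no escaped '+' or '<' inside the analysis:

```python
import re


def split_joined(analysis: str):
    """Return one (lemma, tags) pair per morpheme joined with '+'."""
    morphemes = []
    for part in analysis.split("+"):
        lemma = part.split("<", 1)[0]            # text before the first tag
        tags = re.findall(r"<([^<>]+)>", part)   # tag names without brackets
        morphemes.append((lemma, tags))
    return morphemes


first = "decir<vblex><imp><p2><sg>+me<prn><enc><p1><mf><sg>+lo<prn><enc><p3><nt>"
for lemma, tags in split_joined(first):
    print(lemma, tags)
```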

^take it away/take<vblex><sep><inf>+prpers<prn><obj><p3><nt><sg># away/take<vblex><sep><pres>+prpers<prn><obj><p3><nt><sg># away$
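
In the "take it away" example, '#' separates the tagged part of each analysis from the text that follows it (" away"). A small Python sketch of that split; the function name is hypothetical and escaped '#' characters are not handled:

```python
def split_at_hash(analysis: str):
    """Return (tagged part, text after '#'), or None if there is no '#'."""
    head, marker, tail = analysis.partition("#")
    return head, (tail if marker else None)


analysis = "take<vblex><sep><inf>+prpers<prn><obj><p3><nt><sg># away"
head, tail = split_at_hash(analysis)
print(head)        # the part carrying lemmas and tags
print(repr(tail))  # the text after '#', leading space preserved
```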

Chunks


^Verbcj<SV><vblex><ifi><p3><sg>{^come<vblex><ifi><p3><sg>$}$ ^pr<PREP>{^to<pr>$}$ ^det_nom<SN><f><sg>{^the<det><def><3>$ ^beach<n><3>$}$

   |   |______________________||__________________________|                                                          |
 CHUNK      CHUNK TAGS              LEXICAL UNITS IN                                                               LINKED
  NAME                                  THE CHUNK                                                                   TAG

|__________________________________________________________|
                             |
                           CHUNK
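
The chunk layout can be read back with regular expressions. The following Python sketch uses hypothetical names and assumes chunks are not nested and contain no escaped characters; linked tags such as <3> are left untouched inside the lexical units.

```python
import re

# ^NAME<tag>...{ lexical units }$
CHUNK = re.compile(r"\^([^<{]+)((?:<[^<>]+>)*)\{(.*?)\}\$")
UNIT = re.compile(r"\^(.*?)\$")
TAG = re.compile(r"<([^<>]+)>")


def parse_chunks(stream: str):
    """Return a list of dicts with the name, tags and lexical units of each chunk."""
    chunks = []
    for name, tags, body in CHUNK.findall(stream):
        chunks.append({
            "name": name,
            "tags": TAG.findall(tags),
            "units": UNIT.findall(body),  # raw lexical units, without '^' and '$'
        })
    return chunks


stream = ("^Verbcj<SV><vblex><ifi><p3><sg>{^come<vblex><ifi><p3><sg>$}$ "
          "^pr<PREP>{^to<pr>$}$ "
          "^det_nom<SN><f><sg>{^the<det><def><3>$ ^beach<n><3>$}$")
for chunk in parse_chunks(stream):
    print(chunk["name"], chunk["tags"], chunk["units"])
```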

See also