Difference between revisions of "List of symbols"

From Apertium
(Link to French page)
(41 intermediate revisions by 11 users not shown)
Line 1: Line 1:
[[Liste des symboles|En français]]
+
[[Liste de symboles|En français]] · [[Список символов|по-русски]]
   
 
This page lists the symbols in Apertium used to denote part-of-speech and further morphological features, as well as chunk tags used for more syntactic functions, as well as XML tags.
 
This page lists the symbols in Apertium used to denote part-of-speech and further morphological features, as well as chunk tags used for more syntactic functions, as well as XML tags.
  +
   
 
{{TOCD}}
 
{{TOCD}}
Line 11: Line 12:
   
 
{|class=wikitable
 
{|class=wikitable
! Symbol !! Gloss !! Notes
+
! Symbol !! Gloss !! Notes !! Universal POS
  +
|-
  +
| <code>n</code> || Noun || ''see 'np' for proper noun'' || NOUN
  +
|-
  +
| <code>vblex</code> || Standard ("lexical") verb || ''see also: vbser, vbhaver, vbmod, vaux'' || VERB
 
|-
 
|-
| <code>n</code> || Noun || ''see 'np' for proper noun''
+
| <code>v</code> || Standard verb || shortened form of vblex, often used in agglutinative languages || VERB
 
|-
 
|-
| <code>vblex</code> || Standard verb || ''see also: vbser, vbhaver, vbmod, vaux''
+
| <code>vbmod</code> || Modal verb || || VERB
 
|-
 
|-
| <code>vbmod</code> || Modal verb ||
+
| <code>vbser</code> || Verb "to be" || from ''ser'' (to be) || VERB (or AUX)
 
|-
 
|-
| <code>vbser</code> || Verb "to be" || from ''ser'' (to be)
+
| <code>vbhaver</code> || Verb "to have" || from ''haver'' (to have) || VERB
 
|-
 
|-
| <code>vbhaver</code> || Verb "to have" || from ''haver'' (to have)
+
| <code>vaux</code> || Auxiliary verb || [http://en.wikipedia.org/wiki/Auxilliary_verb wikipedia] || AUX
 
|-
 
|-
| <code>vaux</code> || Auxilliary verb || [http://en.wikipedia.org/wiki/Auxilliary_verb wikipedia]
+
| <code>cop</code> || Copula || [http://en.wikipedia.org/wiki/Copula_(linguistics) wikipedia]; sometimes verb-like, sometimes not || AUX, ...
 
|-
 
|-
| <code>adj</code> || Adjective ||
+
| <code>adj</code> || Adjective || || ADJ
 
|-
 
|-
| <code>post</code> || Postposition ||
+
| <code>adv</code> || Adverb || || ADV
 
|-
 
|-
| <code>adv</code> || Adverb ||
+
| <code>preadv</code> || Pre-adverb || || ADV
 
|-
 
|-
| <code>preadv</code> || Pre-adverb ||
+
| <code>postadv</code> || Post-adverb || || ADV
 
|-
 
|-
| <code>postadv</code> || Post-adverb ||
+
| <code>mod</code> || Modal word || [http://dic.academic.ru/dic.nsf/lingvistic/749] || PART
 
|-
 
|-
| <code>mod</code> || Модальное слово || [http://dic.academic.ru/dic.nsf/lingvistic/749]
+
| <code>det</code> || Determiner || [http://en.wikipedia.org/wiki/Determiner_(class) wikipedia] || DET
 
|-
 
|-
| <code>det</code> || Determiner || [http://en.wikipedia.org/wiki/Determiner_(class) wikipedia]
+
| <code>prn</code> || Pronoun || [http://en.wikipedia.org/wiki/Pronoun wikipedia] || PRON
 
|-
 
|-
| <code>prn</code> || Pronoun || [http://en.wikipedia.org/wiki/Pronoun wikipedia]
+
| <code>pr</code> || Preposition || [http://en.wikipedia.org/wiki/Preposition wikipedia] || ADP
 
|-
 
|-
| <code>pr</code> || Preposition || [http://en.wikipedia.org/wiki/Preposition wikipedia]
+
| <code>post</code> || Postposition || || ADP
 
|-
 
|-
| <code>num</code> || Numeral ||
+
| <code>num</code> || Numeral || || NUM
 
|-
 
|-
| <code>np</code> || Proper noun || From ''nom propi'' [http://en.wikipedia.org/wiki/Proper_noun wikipedia]
+
| <code>np</code> || Proper noun || From ''nom propi'' [http://en.wikipedia.org/wiki/Proper_noun wikipedia] || PROPN
 
|-
 
|-
| <code>ij</code> || Interjection || [http://en.wikipedia.org/wiki/Interjection wikipedia]
+
| <code>ij</code> || Interjection || [http://en.wikipedia.org/wiki/Interjection wikipedia] || INTJ
 
|-
 
|-
| <code>cnjcoo</code> || Co-ordinating conjunction || [http://en.wikipedia.org/wiki/Co-ordinating_conjunction wikipedia]
+
| <code>cnjcoo</code> || Co-ordinating conjunction || [http://en.wikipedia.org/wiki/Co-ordinating_conjunction wikipedia] || CCONJ
 
|-
 
|-
| <code>cnjsub</code> || Sub-ordinating conjunction ||
+
| <code>cnjsub</code> || Sub-ordinating conjunction || || SCONJ
 
|-
 
|-
| <code>cnjadv</code> || Conjunctive adverb || [http://en.wikipedia.org/wiki/Conjunctive_adverb wikipedia]
+
| <code>cnjadv</code> || Conjunctive adverb || [http://en.wikipedia.org/wiki/Conjunctive_adverb wikipedia] || SCONJ, ADV
 
|-
 
|-
| <code>sent</code> || Sentence-ending punctuation || e.g. full stop, question mark
+
| <code>sent</code> || Sentence-ending punctuation || e.g. full stop, question mark || PUNCT
 
|-
 
|-
  +
| <code>cm</code> || Comma punctuation || , || PUNCT
  +
|-
  +
| <code>lquot</code> || Left quote || « || PUNCT
  +
|-
  +
| <code>rquot</code> || Right quote || » || PUNCT
  +
|-
  +
| <code>lpar</code> || Left parenthesis || ( || PUNCT
  +
|-
  +
| <code>rpar</code> || Right parenthesis || ) || PUNCT
  +
|-
 
|}
 
|}
   
Line 62: Line 77:
   
 
===Gender===
 
===Gender===
  +
  +
These tags are usually used with nouns, and things that agree/concord with nouns (like adjectives and verbs).
   
 
{|class=wikitable
 
{|class=wikitable
! Symbol !! Gloss !! Notes
+
! Symbol !! Gloss !! Notes !! Universal features
 
|-
 
|-
| <code>f</code> || Feminine ||
+
| <code>f</code> || Feminine || || Gender=Fem
 
|-
 
|-
| <code>m</code> || Masculine ||
+
| <code>m</code> || Masculine || || Gender=Masc
 
|-
 
|-
| <code>nt</code> || Neuter ||
+
| <code>nt</code> || Neuter || || Gender=Neut
 
|-
 
|-
| <code>ma</code> || Masculine (animate) || Mostly in Slavic languages
+
| <code>ma</code> || Masculine (animate) || Mostly in Slavic languages || Gender=Masc
 
|-
 
|-
| <code>mi</code> || Masculine (inanimate) || Mostly in Slavic languages
+
| <code>mi</code> || Masculine (inanimate) || Mostly in Slavic languages || Gender=Masc
 
|-
 
|-
| <code>mp</code> || Masculine (personal) || in Polish
+
| <code>mp</code> || Masculine (personal) || in Polish || Gender=Masc
 
|-
 
|-
| <code>mn</code> || Masculine and Neuter ||
+
| <code>mn</code> || Masculine or neuter || || Gender=Masc,Neut
 
|-
 
|-
| <code>fn</code> || Feminine and Neuter ||
+
| <code>fn</code> || Feminine or neuter || || Gender=Fem,Neut
 
|-
 
|-
| <code>ut</code> || Common || From ''utrum'', found in Scandinavian languages.
+
| <code>mf</code> || Masculine or feminine || This is used where the gender can be either masculine or feminine || Gender=Masc,Fem
 
|-
 
|-
| <code>mf</code> || Masculine , feminine || This is used where the gender can be either masculine or feminine
+
| <code>mfn</code> || Masculine, feminine or neuter || This is used where the gender can be either masculine, feminine or neuter || Gender=Masc,Fem,Neut
 
|-
 
|-
| <code>mfn</code> || Masculine , feminine , neuter || This is used where the gender can be either masculine, feminine or neuter
+
| <code>ut</code> || Common || From ''utrum'', found in Scandinavian languages. || Gender=Com
 
|-
 
|-
| <code>un</code> || Common, neuter || As above, only common or neuter
+
| <code>un</code> || Common or neuter || As above, only common or neuter || Gender=Com,Neut
 
|-
 
|-
 
| <code>GD</code> || Gender to be determined ||
 
| <code>GD</code> || Gender to be determined ||
Line 95: Line 112:
   
 
===Count/Mass===
 
===Count/Mass===
  +
  +
These tags are usually used with nouns, and things that agree/concord with nouns (like adjectives and verbs).
   
 
{|class=wikitable
 
{|class=wikitable
! Symbol !! Gloss !! Notes
+
! Symbol !! Gloss !! Notes !! Universal feature
 
|-
 
|-
| <code>cnt</code> || Countable ||
+
| <code>cnt</code> || Countable ||
 
|-
 
|-
| <code>unc</code> || Uncountable (mass) ||
+
| <code>unc</code> || Uncountable (mass) ||
 
|-
 
|-
 
|}
 
|}
   
===Number===
+
===Animacy===
  +
  +
These tags are usually used with nouns, and things that agree/concord with nouns (like adjectives and verbs).
   
 
{|class=wikitable
 
{|class=wikitable
! Symbol !! Gloss !! Notes
+
! Symbol !! Gloss !! Notes !! Universal feature
 
|-
 
|-
| <code>sg</code> || Singular ||
+
| <code>aa</code> || Animate ||
 
|-
 
|-
| <code>pl</code> || Plural ||
+
| <code>an</code> || Animate or inanimate ||
 
|-
 
|-
| <code>du</code> || Dual ||
+
| <code>nn</code> || Inanimate ||
 
|-
 
|-
  +
|}
| <code>ct</code> || Count || see mk-bg
 
  +
  +
===Adjectives===
  +
  +
{|class=wikitable
  +
! Symbol !! Gloss !! Notes !! Universal feature
 
|-
 
|-
  +
| <code>sint</code> || Synthetic || "nice, nicer, nicest" is synthetic. "handsome, more handsome, the most handsome" is not. [http://en.wikipedia.org/wiki/Synthetic_language wikipedia]
| <code>coll</code> || Collective ||
 
 
|-
 
|-
| <code>sp</code> || Singular , plural ||
+
| <code>preadj</code> || Pre-adjective || for languages where most adjectives follow the noun (e.g. French in the eo->fr bidix)
 
|-
 
|-
| <code>ND</code> || Number to be determined ||
+
| <code>preadj_nh</code> || Pre-adjective if not human || the adjective goes before or after the noun, depending on the noun
 
|-
 
|-
 
|}
 
|}
   
===Case===
+
===Pronoun types ===
   
{|class=wikitable
+
{| class="wikitable" border="1"
! Symbol !! Gloss !! Notes
+
! Symbol !! Gloss !! Notes !! Universal feature
 
|-
 
|-
| <code>nom</code> || Nominative ||
+
| <code>pers</code> || Personal || || PronType=Prs
 
|-
 
|-
| <code>acc</code> || Accusative ||
+
| <code>tn</code> || Tónico ||
 
|-
 
|-
| <code>dat</code> || Dative ||
+
| <code>detnt</code> || Neuter determiner || POS? || DET
 
|-
 
|-
| <code>gen</code> || Genitive ||
+
| <code>predet</code> || Pre determiner || POS? || DET
 
|-
 
|-
| <code>dg</code> || Dative and Genitive || in [[ro-es]], discouraged in new developments
+
| <code>atn</code> || Atónico ||
 
|-
 
|-
| <code>voc</code> || Vocative ||
+
| <code>qnt</code> || Quantifier || || PronType=Ind
 
|-
 
|-
| <code>abl</code> || Ablative ||
+
| <code>ord</code> || Ordinal || || NumType=Ord
 
|-
 
|-
| <code>ins</code> || Instrumental || [http://en.wikipedia.org/wiki/Instrumental_case wikipedia]
+
| <code>obj</code> || Object ||
 
|-
 
|-
| <code>loc</code> || Locative || [http://en.wikipedia.org/wiki/Locative wikipedia]
+
| <code>subj</code> || Subject ||
 
|-
 
|-
| <code>abl</code> || Ablative || [http://en.wikipedia.org/wiki/Ablative wikipedia]
+
| <code>pro</code> || Proclitic ||
 
|-
 
|-
| <code>prp</code> || Prepositional || [http://en.wikipedia.org/wiki/Prepositional wikipedia]
+
| <code>enc</code> || Enclitic ||
 
|-
 
|-
| <code>tra</code> || Translative ||
+
| <code>acr</code> || Acronym || Not a pronoun? || Abbr=Yes
 
|-
 
|-
| <code>ill</code> || Illative ||
+
| <code>rel</code> || Relative || || PronType=Rel
 
|-
 
|-
| <code>ine</code> || Inessive ||
+
| <code>ind</code> || Indefinite || || PronType=Ind
 
|-
 
|-
| <code>ade</code> || Adessive ||
+
| <code>itg</code> || Interrogative || || PronType=Int
 
|-
 
|-
| <code>all</code> || Allative ||
+
| <code>dem</code> || Demonstrative || || PronType=Dem
 
|-
 
|-
| <code>abe</code> || Abessive ||
+
| <code>def</code> || Definite ||
 
|-
 
|-
| <code>ess</code> || Essive ||
+
| <code>pos</code> || Possessive || || Poss=Yes
 
|-
 
|-
| <code>par</code> || Partitive ||
+
| <code>ref</code> || Reflexive || || Reflex=Yes
 
|-
 
|-
| <code>dis</code> || Distributive ||
+
| <code>prx</code> || Proximate ||
 
|-
 
|-
| <code>com</code> || Comitative ||
+
| <code>dst</code> || Distal ||
  +
|}
  +
  +
=== Transitivity ===
  +
  +
Used for verbs.
  +
  +
{| class="wikitable" border="1"
  +
! Symbol !! Gloss !! Notes !! Universal feature
  +
|-
  +
| <code>tv</code> || Transitive || takes direct object in accusative case (used in Turkic)
 
|-
 
|-
| <code>soc</code> || Sociative ||
+
| <code>iv</code> || Intransitive || does not take direct object in accusative case (used in Turkic)
 
|-
 
|-
| <code>prl</code> || Prolative ||
+
| <code>TD</code> || Transitivity to be determined || if the sub-category is [currently] unknown
 
|}
 
|}
   
  +
== Inflectional morphology ==
===Voice===
 
  +
  +
===Number===
  +
Note: number can be a sub-category tag too, e.g. with pronouns.
   
 
{|class=wikitable
 
{|class=wikitable
! Symbol !! Gloss !! Notes
+
! Symbol !! Gloss !! Notes !! Universal feature
 
|-
 
|-
| <code>actv</code> || Active voice ||
+
| <code>sg</code> || Singular || || Number=Sing
 
|-
 
|-
| <code>pasv</code>,<code>pass</code> || Passive voice || {{tag|pass}} is more used in Turkic, {{tag|pasv}} in Germanic.
+
| <code>pl</code> || Plural || || Number=Plur
 
|-
 
|-
| <code>midv</code> || Middle voice ||
+
| <code>sp</code> || Singular or plural || || Number=Sing,Plur
 
|-
 
|-
| <code>nactv</code> || Non-active voice || See Albanian.
+
| <code>du</code> || Dual || || Number=Dual
  +
|-
  +
| <code>ct</code> || Count || see mk-bg || Number=Count
  +
|-
  +
| <code>coll</code> || Collective || || Number=Coll
  +
|-
  +
| <code>ND</code> || Number to be determined ||
 
|-
 
|-
 
|}
 
|}
   
  +
===Tense and mode===
 
  +
===Case===
   
 
{|class=wikitable
 
{|class=wikitable
! Symbol !! Gloss !! Notes
+
! Symbol !! Gloss !! Notes !! Universal feature
 
|-
 
|-
| <code>pres</code> || Present ||
+
| <code>nom</code> || Nominative || || Case=Nom
 
|-
 
|-
| <code>past</code> || Past ||
+
| <code>acc</code> || Accusative || || Case=Acc
 
|-
 
|-
| <code>imp</code> || Imperative ||
+
| <code>dat</code> || Dative || || Case=Dat
 
|-
 
|-
| <code>inf</code> || Infinitive ||
+
| <code>gen</code> || Genitive || || Case=Gen
 
|-
 
|-
| <code>pp</code> || Past participle || [http://en.wikipedia.org/wiki/Participle wikipedia]
+
| <code>dg</code> || Dative and Genitive || in [[ro-es]], discouraged in new developments || Case=Dat,Gen
 
|-
 
|-
| <code>pp2</code> || Past participle (???) || It's at least used in the Esperanto dictionaries for future active participles, ''ont'' (seems quite odd)
+
| <code>voc</code> || Vocative || || Case=Voc
 
|-
 
|-
| <code>pp3</code> || Past participle (???) || It's at least used in the Esperanto dictionaries for past active participles, ''int'' (seems quite odd)
+
| <code>abl</code> || Ablative || [http://en.wikipedia.org/wiki/Ablative wikipedia] || Case=Abl
 
|-
 
|-
| <code>pprs</code> || Present participle || Also appears as <code>ppres</code> (deprecated)
+
| <code>ins</code> || Instrumental or Instructive || [http://en.wikipedia.org/wiki/Instrumental_case wikipedia] || Case=Ins
 
|-
 
|-
| <code>ger</code> || Gerund || [http://en.wikipedia.org/wiki/Gerund wikipedia]
+
| <code>loc</code> || Locative || [http://en.wikipedia.org/wiki/Locative wikipedia] || Case=Loc
 
|-
 
|-
| <code>pri</code> || Present indicative || ''see also: pres''. [http://en.wikipedia.org/wiki/Present_indicative wikipedia]
+
| <code>prp</code> || Prepositional || [http://en.wikipedia.org/wiki/Prepositional wikipedia]
 
|-
 
|-
| <code>pii</code> || Imperfect || from ''Pretérito imperfecto de indicativo''
+
| <code>tra</code> || Translative || || Case=Tra
 
|-
 
|-
| <code>fti</code> || Future indicative ||
+
| <code>ill</code> || Illative || || Case=Ill
 
|-
 
|-
| <code>fts</code> || Future subjunctive ||
+
| <code>ine</code> || Inessive || || Case=Ine
 
|-
 
|-
| <code>cni</code> || Conditional ||
+
| <code>ade</code> || Adessive || || Case=Ade
 
|-
 
|-
| <code>plu</code> || Pluperfect || In <code>cy-en</code>
+
| <code>all</code> || Allative || || Case=All
 
|-
 
|-
| <code>pmp</code> || Pluperfect || In <code>es-gl</code> (from ''Pluscuamperfecto'')
+
| <code>abe</code> || Abessive || || Case=Abe
 
|-
 
|-
| <code>prs</code> || Present subjunctive || [http://en.wikipedia.org/wiki/Present_subjunctive wikipedia]
+
| <code>ess</code> || Essive || || Case=Ess
 
|-
 
|-
| <code>pis</code> || Imperfect subjunctive ||
+
| <code>par</code> || Partitive || || Case=Par
 
|-
 
|-
| <code>ifi</code> || Past definite || from ''Pretérito perfecto o indefinido''
+
| <code>dis</code> || Distributive || || Case=Dis
 
|-
 
|-
| <code>aff</code> || Affirmative ||
+
| <code>com</code> || Comitative || || Case=Com
 
|-
 
|-
| <code>itg</code> || Interrogative ||
+
| <code>soc</code> || Sociative || ||
 
|-
 
|-
| <code>neg</code> || Negative ||
+
| <code>prl</code> || Prolative || || Case=Pro
 
|-
 
|-
  +
| <code>ses</code> || Superessive || [[Hungarian]] || Case=Sup
|}
 
  +
|-
 
  +
| <code>sub</code> || Sublative || [[Hungarian]] || Case=Sub
===Derivations===
 
{|class=wikitable
 
! Symbol !! Gloss !! Notes
 
 
|-
 
|-
| <code>caus</code> || Causative ||
+
| <code>dela</code> || Delative || [[Hungarian]] || Case=Del
 
|-
 
|-
  +
| <code>term</code> || Terminative || [[Hungarian]], Estonian, ... ||
 
|}
 
|}
   
===Possession===
+
===Voice===
  +
 
{|class=wikitable
 
{|class=wikitable
! Symbol !! Gloss !! Notes
+
! Symbol !! Gloss !! Notes !! Universal feature
 
|-
 
|-
| <code>px1sg</code> || First person singular possessive || e.g. in [[Turkic languages]]
+
| <code>actv</code> || Active voice || || Voice=Act
 
|-
 
|-
| <code>px2sg</code> || Second person singular possessive || e.g. in [[Turkic languages]]
+
| <code>pass</code> || Passive voice || is more used in Turkic. || Voice=Pass
 
|-
 
|-
| <code>px3sg</code> || Third person singular possessive || e.g. in [[Turkic languages]]
+
| <code>pasv</code> || Passive voice || is more used in Germanic. || Voice=Pass
 
|-
 
|-
| <code>px1pl</code> || First person plural possessive || e.g. in [[Turkic languages]]
+
| <code>midv</code> || Middle voice || || Voice=Mid
 
|-
 
|-
| <code>px2pl</code> || Second person plural possessive || e.g. in [[Turkic languages]]
+
| <code>nactv</code> || Non-active voice || See Albanian. ||
 
|-
 
|-
| <code>px3pl</code> || Third person plural possessive || e.g. in [[Turkic languages]]
+
| <code>caus</code> || Causative voice || see also [[#Derivations]] || Voice=Cau
|-
 
| <code>px3sp</code> || Third person possessive singular/plural || e.g. in [[Turkic languages]]
 
 
|-
 
|-
 
|}
 
|}
   
===Proper nouns===
+
===Tense and mode===
   
 
{|class=wikitable
 
{|class=wikitable
! Symbol !! Gloss !! Notes
+
! Symbol !! Gloss !! Notes !! Universal features
 
|-
 
|-
| <code>ant</code> || Anthroponym || [http://en.wikipedia.org/wiki/Anthroponym wikipedia]
+
| <code>pres</code> || Present || || Tense=Pres
 
|-
 
|-
| <code>top</code> || Toponym || In some language pairs without the locative case this may be ''loc'', although this should be changed. [http://en.wikipedia.org/wiki/Toponym wikipedia]
+
| <code>pret</code> || Preterite || [https://en.wikipedia.org/wiki/Preterite Preterite] || Tense=Past
 
|-
 
|-
| <code>hyd</code> || Hydronym || [http://en.wikipedia.org/wiki/Hydronym wikipedia]
+
| <code>past</code> || Past || || Tense=Past
 
|-
 
|-
| <code>cog</code> || Cognomen || In normal use, surnames
+
| <code>imp</code> || Imperative || [http://www.englishlanguageguide.com/grammar/imperative.asp englishlanguageguide] || Mood=Imp
 
|-
 
|-
| <code>org</code> || Organisation ||
+
| <code>inf</code> || Infinitive || [https://en.wikipedia.org/wiki/Infinitive wikipedia] || VerbForm=Inf
  +
|-
  +
| <code>aor</code> || Aorist || [https://en.wikipedia.org/wiki/Aorist wikipedia] A tense in Turkic languages. || Tense=Past
  +
|-
  +
| <code>pp</code> || Past participle || [http://en.wikipedia.org/wiki/Participle wikipedia] || VerbForm=Part
  +
|-
  +
| <code>pp2</code> || Past participle (???) || It's at least used in the Esperanto dictionaries for future active participles, ''ont'' (seems quite odd) ||
  +
|-
  +
| <code>pp3</code> || Past participle (???) || It's at least used in the Esperanto dictionaries for past active participles, ''int'' (seems quite odd) ||
  +
|-
  +
| <code>pprs</code> || Present participle || Also appears as <code>ppres</code> (deprecated) || VerbForm=Part
  +
|-
  +
| <code>ger</code> || Gerund || [http://en.wikipedia.org/wiki/Gerund wikipedia] || VerbForm=Ger
  +
|-
  +
| <code>supn</code> || Supine || [http://en.wikipedia.org/wiki/Supine wikipedia] || VerbForm=Sup
  +
|-
  +
| <code>pri</code> || Present indicative || ''see also: pres''. [http://en.wikipedia.org/wiki/Present_indicative wikipedia] || Tense=Pres Mood=Ind
  +
|-
  +
| <code>pii</code> || Imperfect || from ''Pretérito imperfecto de indicativo'' [https://en.wikipedia.org/wiki/Imperfect wikipedia] || Tense=Past Mood=Ind
  +
|-
  +
| <code>fti</code> || Future indicative || || Tense=Fut Mood=Ind
  +
|-
  +
| <code>fts</code> || Future subjunctive || || Tense=Fut Mood=Sub
  +
|-
  +
| <code>cni</code> || Conditional || A lot of pairs will probably use cnd or cond... || Mood=Cnd
  +
|-
  +
| <code>plu</code> || Pluperfect || In <code>cy-en</code> || Tense=Pqp
  +
|-
  +
| <code>pmp</code> || Pluperfect || In <code>es-gl</code> (from ''Pluscuamperfecto'') || Tense=Pqp
  +
|-
  +
| <code>prs</code> || Present subjunctive || [http://en.wikipedia.org/wiki/Present_subjunctive wikipedia] || Tense=Pres Mood=Sub
  +
|-
  +
| <code>pis</code> || Imperfect subjunctive || || Tense=Past Mood=Sub
  +
|-
  +
| <code>ifi</code> || Past definite || from ''Pretérito perfecto o indefinido'' || Tense=Past Definite=Def
  +
|-
  +
| <code>aff</code> || Affirmative || [https://en.wikipedia.org/wiki/Affirmation_and_negation wikipedia] || Polarity=Pos
  +
|-
  +
| <code>itg</code> || Interrogative || ||
  +
|-
  +
| <code>neg</code> || Negative || || Polarity=Neg
  +
|-
  +
| <code>lp</code> || L-participle ||
 
|-
 
|-
| <code>al</code> || Altres || Other, misc.
 
 
|}
 
|}
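The Universal features column of the tense and mode table can be read as a lookup from one Apertium tag to one or more <code>Name=Value</code> pairs. The following is a minimal Python sketch under that reading. The names <code>TENSE_MOOD</code>, <code>features_for</code> and <code>format_feats</code> are hypothetical, the dictionary holds only an excerpt of the rows, and joining the pairs with <code>|</code> sorted by feature name follows the usual Universal Dependencies convention rather than anything stated on this page.

```python
# Hypothetical excerpt of the tense and mode table: Apertium tag -> feature string.
TENSE_MOOD = {
    "pres": "Tense=Pres",
    "past": "Tense=Past",
    "imp": "Mood=Imp",
    "inf": "VerbForm=Inf",
    "pri": "Tense=Pres Mood=Ind",
    "pii": "Tense=Past Mood=Ind",
    "fti": "Tense=Fut Mood=Ind",
    "prs": "Tense=Pres Mood=Sub",
    "pis": "Tense=Past Mood=Sub",
    "cni": "Mood=Cnd",
    "neg": "Polarity=Neg",
}


def features_for(tags):
    """Collect the features of every tag in `tags` that the excerpt knows about."""
    feats = {}
    for tag in tags:
        # Unknown tags map to an empty string, which splits into no pairs.
        for pair in TENSE_MOOD.get(tag, "").split():
            name, value = pair.split("=", 1)
            feats[name] = value
    return feats


def format_feats(feats):
    """Join features as Name=Value pairs separated by '|', sorted by name."""
    return "|".join(f"{name}={value}" for name, value in sorted(feats.items()))
```

Tags that the excerpt does not cover (for example a part-of-speech tag such as <code>vblex</code>) are simply skipped, so the function can be handed the full tag list of an analysis.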
   
 
===Person===
 
===Person===
  +
Note: person can be a sub-category tag, e.g. with pronouns.
   
 
{|class=wikitable
 
{|class=wikitable
! Symbol !! Gloss !! Notes
+
! Symbol !! Gloss !! Notes !! Universal feature
 
|-
 
|-
| <code>p1</code> || First person ||
+
| <code>p1</code> || First person || || Person=1
 
|-
 
|-
| <code>p2</code> || Second person ||
+
| <code>p2</code> || Second person || || Person=2
 
|-
 
|-
| <code>p3</code> || Third person ||
+
| <code>p3</code> || Third person || || Person=3
 
|-
 
|-
| <code>impers</code> || Impersonal || Sometimes called 'autonomous'
+
| <code>impers</code> || Impersonal || Sometimes called 'autonomous' || Person=0
 
|-
 
|-
 
|}
 
|}
   
===Animacy===
+
===Derivations===
 
 
{|class=wikitable
 
{|class=wikitable
 
! Symbol !! Gloss !! Notes
 
! Symbol !! Gloss !! Notes
 
|-
 
|-
| <code>aa</code> || Animate ||
+
| <code>caus</code> || Causative ||
|-
 
| <code>an</code> || Animate / inanimate ||
 
|-
 
| <code>nn</code> || Inanimate ||
 
 
|-
 
|-
  +
| <code>ingr</code> || Ingressive || https://nn.wikipedia.org/w/index.php?title=Ingressiv
 
|}
 
|}
   
===Adjectives===
+
===Possession===
 
 
{|class=wikitable
 
{|class=wikitable
! Symbol !! Gloss !! Notes
+
! Symbol !! Gloss !! Notes !! Universal feature
 
|-
 
|-
  +
| <code>px1sg</code> || First person singular possessive || e.g. in [[Turkic languages]] || Person[psor]=1 Number[psor]=Sing
| <code>sint</code> || Synthetic || "nice, nicer, nicest" is synthetic. "handsome, more handsome, the most handsome" is not. [http://en.wikipedia.org/wiki/Synthetic_language wikipedia]
 
 
|-
 
|-
| <code>pst</code> || Positive ||
+
| <code>px2sg</code> || Second person singular possessive || e.g. in [[Turkic languages]] || Person[psor]=2 Number[psor]=Sing
 
|-
 
|-
  +
| <code>px3sg</code> || Third person singular possessive || e.g. in [[Turkic languages]] || Person[psor]=3 Number[psor]=Sing
| <code>comp</code> || Comparative || [http://en.wikipedia.org/wiki/Adjective#Attributive.2C_predicative.2C_absolute.2C_and_substantive_adjectives wikipedia]
 
 
|-
 
|-
  +
| <code>px1pl</code> || First person plural possessive || e.g. in [[Turkic languages]] || Person[psor]=1 Number[psor]=Plur
| <code>sup</code> || Superlative || [http://en.wikipedia.org/wiki/Adjective#Attributive.2C_predicative.2C_absolute.2C_and_substantive_adjectives wikipedia]
 
 
|-
 
|-
  +
| <code>px2pl</code> || Second person plural possessive || e.g. in [[Turkic languages]] || Person[psor]=2 Number[psor]=Plur
| <code>attr</code> || Attributive || [http://en.wikipedia.org/wiki/Adjective#Attributive.2C_predicative.2C_absolute.2C_and_substantive_adjectives wikipedia]
 
 
|-
 
|-
  +
| <code>px3pl</code> || Third person plural possessive || e.g. in [[Turkic languages]] || Person[psor]=3 Number[psor]=Plur
| <code>pred</code> || Predicative || [http://en.wikipedia.org/wiki/Adjective#Attributive.2C_predicative.2C_absolute.2C_and_substantive_adjectives wikipedia]
 
  +
|-
  +
| <code>px3sp</code> || Third person possessive singular or plural || e.g. in [[Turkic languages]] || Person[psor]=3
 
|-
 
|-
 
|}
 
|}
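The possessive tags in the table above follow a regular pattern: <code>px</code>, then the person, then the number. A short Python sketch of that pattern is given here, assuming the tag is always spelled exactly this way. The function name <code>possessor_features</code> is hypothetical, and the pattern also accepts combinations such as <code>px1sp</code> that the table does not list.

```python
import re

# px + person digit + number code, as in px1sg, px3pl, px3sp.
_PX = re.compile(r"^px([123])(sg|pl|sp)$")

# "sp" (singular or plural) adds no Number[psor] feature, as in the px3sp row.
_NUMBER = {"sg": "Sing", "pl": "Plur"}


def possessor_features(tag):
    """Return the possessor features for a px* tag, or None for any other tag."""
    match = _PX.match(tag)
    if match is None:
        return None
    person, number = match.groups()
    feats = {"Person[psor]": person}
    if number in _NUMBER:
        feats["Number[psor]"] = _NUMBER[number]
    return feats
```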
   
  +
===Object marking===
{| class="wikitable" border="1"
 
  +
! Symbol !! Gloss !! Notes
 
  +
e.g. in verbs with both
  +
  +
{|class=wikitable
  +
! Symbol !! Gloss !! Notes !! Universal features
 
|-
 
|-
| <code>tn</code> || Tónico
+
| <code>o_sg1</code> || First person singular object ||
 
|-
 
|-
| <code>detnt</code> || Neuter determiner
+
| <code>o_sg2</code> || Second person singular object ||
 
|-
 
|-
| <code>predet</code> || Pre determiner
+
| <code>o_sg3</code> || Third person singular object ||
 
|-
 
|-
| <code>atn</code> || Atónico
+
| <code>o_pl1</code> || First person plural object ||
 
|-
 
|-
| <code>qnt</code> || Quantifier
+
| <code>o_pl2</code> || Second person plural object ||
 
|-
 
|-
| <code>ord</code> || Ordinal
+
| <code>o_pl3</code> || Third person plural object ||
 
|-
 
|-
  +
|}
| <code>obj</code> || Object
 
  +
  +
===Proper nouns===
  +
  +
{|class=wikitable
  +
! Symbol !! Gloss !! Notes !! Universal features
 
|-
 
|-
  +
| <code>ant</code> || Anthroponym || [http://en.wikipedia.org/wiki/Anthroponym wikipedia]; ant is very commonly used together with f and m for traditionally gender-specific names
| <code>subj</code> || Subject
 
 
|-
 
|-
  +
| <code>top</code> || Toponym || In some language pairs without the locative case this may be ''loc'', although this should be changed. [http://en.wikipedia.org/wiki/Toponym wikipedia]
| <code>pro</code> || Proclitic
 
 
|-
 
|-
  +
| <code>hyd</code> || Hydronym || [http://en.wikipedia.org/wiki/Hydronym wikipedia]
| <code>enc</code> || Enclitic
 
 
|-
 
|-
| <code>acr</code> || Acronym
+
| <code>cog</code> || Cognomen || In normal use, surnames
 
|-
 
|-
| <code>rel</code> || Relative
+
| <code>org</code> || Organisation ||
 
|-
 
|-
| <code>ind</code> || Indefinite
+
| <code>al</code> || Altres || Other, misc.
  +
|}
  +
  +
===Adjectives===
  +
  +
{|class=wikitable
  +
! Symbol !! Gloss !! Notes !! Universal features
 
|-
 
|-
| <code>itg</code> || Interrogative
+
| <code>pst</code> || Positive || || Degree=Pos
 
|-
 
|-
  +
| <code>comp</code> || Comparative || [http://en.wikipedia.org/wiki/Adjective#Attributive.2C_predicative.2C_absolute.2C_and_substantive_adjectives wikipedia] || Degree=Comp
| <code>dem</code> || Demonstrative
 
 
|-
 
|-
  +
| <code>sup</code> || Superlative || [http://en.wikipedia.org/wiki/Adjective#Attributive.2C_predicative.2C_absolute.2C_and_substantive_adjectives wikipedia] || Degree=Sup
| <code>def</code> || Definite
 
 
|-
 
|-
  +
| <code>attr</code> || Attributive || [http://en.wikipedia.org/wiki/Adjective#Attributive.2C_predicative.2C_absolute.2C_and_substantive_adjectives wikipedia]
| <code>pos</code> || Possesive
 
 
|-
 
|-
  +
| <code>pred</code> || Predicative || [http://en.wikipedia.org/wiki/Adjective#Attributive.2C_predicative.2C_absolute.2C_and_substantive_adjectives wikipedia]
| <code>ref</code> || Reflexive
 
  +
|}
  +
  +
  +
===Others===
  +
{|class=wikitable
  +
! Symbol !! Gloss !! Notes
 
|-
 
|-
| <code>prx</code> || Proximate
+
| <code>web</code> || Links and Emails ||
 
|-
 
|-
| <code>dst</code> || Distal
 
 
|}
 
|}
   
<!-- There's no such section –Flam
 
 
===See also===
 
===See also===
  +
* [[Turkic lexicon|Guidelines for tag assignment (etc.) in Turkic]]
[[Turkic_languages#Tagset|Tag set for Turkic languages]]
 
  +
* [[Tagging guidelines for Portuguese]]
-->
 
  +
 
==Chunk tags==
 
==Chunk tags==
   
Line 399: Line 502:
   
 
==XML tags==
 
==XML tags==
Note: All XML tags are explained in depth in the PDF [[documentation]], see also the dix.dtd/dix.rng files in [http://apertium.svn.sourceforge.net/viewvc/apertium/trunk/lttoolbox/lttoolbox/ lttoolbox (svn)].
+
Note: All XML tags are explained in depth in the PDF [[documentation]], see also the [https://github.com/apertium/lttoolbox/blob/master/lttoolbox/dix.dtd dix.dtd] and [https://github.com/apertium/lttoolbox/blob/master/lttoolbox/dix.rng dix.rng] files in the GitHub repository.
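To show how the dictionary elements in the table that follows nest inside each other, here is a small Python sketch that parses an invented dictionary fragment with the standard library. The fragment is illustrative only and is not taken from any real language pair. The elements <code>&lt;l></code>, <code>&lt;r></code>, <code>&lt;s></code> and <code>&lt;par></code>, the <code>lm</code> attribute and the paradigm name <code>beer__n</code> are assumptions that go beyond the rows visible in this excerpt.

```python
import xml.etree.ElementTree as ET

# Invented monolingual dictionary fragment, for illustration only.
DIX = """<dictionary>
  <alphabet>abcdefghijklmnopqrstuvwxyz</alphabet>
  <sdefs>
    <sdef n="n"/>
    <sdef n="sg"/>
  </sdefs>
  <pardefs>
    <pardef n="beer__n">
      <e><p><l></l><r><s n="n"/><s n="sg"/></r></p></e>
    </pardef>
  </pardefs>
  <section id="main" type="standard">
    <e lm="beer"><i>beer</i><par n="beer__n"/></e>
  </section>
</dictionary>"""

root = ET.fromstring(DIX)

# Symbols declared in <sdefs>, in document order.
symbols = [sdef.get("n") for sdef in root.iter("sdef")]

# Lemmas of the entries in the main section (entries inside <pardef> are excluded).
lemmas = [entry.get("lm") for entry in root.find("section").iter("e")]
```

Here <code>symbols</code> holds the names declared with <code>&lt;sdef></code>, and <code>lemmas</code> holds one item per <code>&lt;e></code> in the section.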
   
 
{|class=wikitable
 
{|class=wikitable
! XML tag !! Means !! Appears in XML tags / notes / examples
+
! XML tag !! Means !! Appears in XML tags / notes / examples
 
|-
 
|-
| <code><dictionary></code> || Mono- or bilingual dictionary || In files apertium-eo-en.en.dix, apertium-eo-en.eo-en.dix, apertium-eo-en.post-en.dix, apertium-eo-en.post-eo.dix
+
| <code><dictionary></code> || Mono- or bilingual dictionary || In files apertium-eo-en.en.dix, apertium-eo-en.eo-en.dix, apertium-eo-en.post-en.dix, apertium-eo-en.post-eo.dix
 
|-
 
|-
| <code><alphabet></code> || Set of characters in the language|| In <code><dictionary></code>
+
| <code><alphabet></code> || Set of characters in the language|| In <code><dictionary></code>
 
|-
 
|-
| <code><sdefs></code> || Symbol definitions || In <code>&lt;dictionary></code>
+
| <code><sdefs></code> || Symbol definitions || In <code>&lt;dictionary></code>
 
|-
 
|-
 
| <code><sdef></code> || Symbol definition || In <code>&lt;sdefs></code>. Ex: <code>&lt;sdef n="noun"/></code>
 
| <code><sdef></code> || Symbol definition || In <code>&lt;sdefs></code>. Ex: <code>&lt;sdef n="noun"/></code>
 
|-
 
|-
| <code><pardefs></code> || Paradigm definitions || In <code>&lt;dictionary></code>.
+
| <code><pardefs></code> || Paradigm definitions || In <code>&lt;dictionary></code>.
 
|-
 
|-
| <code><pardef></code> || Paradigm definition || In <code>&lt;pardefs></code>.
+
| <code><pardef></code> || Paradigm definition || In <code>&lt;pardefs></code>.
 
|-
 
|-
| <code><section> </code> || A section of the dictionary || In <code>&lt;dictionary></code>. Ex: <code>&lt;section id="main" type="standard"></code>
+
| <code>&lt;section></code> || A section of the dictionary || In <code>&lt;dictionary></code>. Ex: <code>&lt;section id="main" type="standard"></code>
 
|-
 
|-
| <code>&lt;e></code> || A dictionary entry (a word) || In <code><section></code> and in <code>&lt;pardef></code>.
+
| <code>&lt;e></code> || A dictionary entry (a word) || In <code>&lt;section></code> and in <code>&lt;pardef></code>.
 
|-
 
|-
| <code>&lt;i></code> || Invariant (left and right side) || In <code>&lt;e></code>. Ex.: <code>&lt;i>beer&lt;/i></code>
+
| <code>&lt;i></code> || Invariant (left and right side) || In <code>&lt;e></code>. Ex.: <code>&lt;i>beer&lt;/i></code>
 
|-
 
|-
 
| <code>&lt;p></code> || A pair || In <code><e></code>.
 
| <code>&lt;p></code> || A pair || In <code><e></code>.
Line 457: Line 560:
   
 
{|class=wikitable
 
{|class=wikitable
! XML attribute value !! Means !! Appears in attribute || Notes
+
! XML attribute value !! Means !! Appears in attribute || Notes
 
|-
 
|-
| <code>whole</code> || lemma and grammatical symbols || part
+
| <code>whole</code> || lemma and grammatical symbols || part
 
|-
 
|-
| <code>lem</code> || lemma || part
+
| <code>lem</code> || lemma || part
 
|-
 
|-
| <code>lemh</code> || (inflected) head word of [[Chunking:_A_full_example#Handling_of_multiwords_with_inner_inflection|multiword]] || part
+
| <code>lemh</code> || (inflected) head word of [[Chunking:_A_full_example#Handling_of_multiwords_with_inner_inflection|multiword]] || part
 
|-
 
|-
| <code>lemq</code> || following queue of [[Chunking:_A_full_example#Handling_of_multiwords_with_inner_inflection|multiword]] || part
+
| <code>lemq</code> || following queue of [[Chunking:_A_full_example#Handling_of_multiwords_with_inner_inflection|multiword]] || part
 
|-
 
|-
 
|}
 
|}
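The four attribute values above select different parts of a lexical unit. The Python sketch below illustrates the distinction under two assumptions that this page does not state: that the head and the queue of a multiword lemma are separated at the first <code>#</code>, and that the queue keeps that <code>#</code>. The function name <code>clip_part</code> and the example lemma are hypothetical.

```python
def clip_part(lemma, tags, part):
    """Return the piece of a lexical unit selected by a `part` value.

    lemma: lemma string, e.g. "take# out" for a multiword with inner inflection
    tags:  list of grammatical symbols, e.g. ["vblex", "pres"]
    part:  one of "whole", "lem", "lemh", "lemq"
    """
    # Assumption: the head ends at the first '#', and the queue starts with it.
    head, sep, queue = lemma.partition("#")
    if part == "whole":
        return lemma + "".join(f"<{tag}>" for tag in tags)
    if part == "lem":
        return lemma
    if part == "lemh":
        return head
    if part == "lemq":
        return sep + queue
    raise ValueError(f"unknown part value: {part!r}")
```

For a lemma without <code>#</code>, <code>lemh</code> is the whole lemma and <code>lemq</code> is empty.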

Revision as of 14:07, 5 October 2019

En français · по-русски

This page lists the symbols in Apertium used to denote part-of-speech and further morphological features, as well as chunk tags used for more syntactic functions, as well as XML tags.


This is meant to be a glossary of symbol names in alphabetical order with notes. Some of these names are specific to particular packages or language pairs, as not all languages have the same grammatical features (most don't have spatial distinction in articles for example).

If you were wondering what the symbols #, /, @, +, ~ or * mean, read Apertium stream format.

Part-of-speech Categories

Symbol Gloss Notes Universal POS
n Noun see 'np' for proper noun NOUN
vblex Standard ("lexical") verb see also: vbser, vbhaver, vbmod, vaux VERB
v Standard verb shortened form of vblex, often used in agglutinative languages VERB
vbmod Modal verb VERB
vbser Verb "to be" from ser (to be) VERB (or AUX)
vbhaver Verb "to have" from haver (to have)  VERB
vaux Auxiliary verb wikipedia  AUX
cop Copula wikipedia; sometimes verb-like, sometimes not  AUX, ...
adj Adjective  ADJ
adv Adverb  ADV
preadv Pre-adverb  ADV
postadv Post-adverb  ADV
mod Modal word [1] PART
det Determiner wikipedia  DET
prn Pronoun wikipedia  PRON
pr Preposition wikipedia ADP
post Postposition ADP
num Numeral NUM
np Proper noun From nom propi wikipedia  PROPN
ij Interjection wikipedia INTJ
cnjcoo Co-ordinating conjunction wikipedia CCONJ
cnjsub Sub-ordinating conjunction SCONJ
cnjadv Conjunctive adverb wikipedia  SCONJ, ADV
sent Sentence-ending punctuation e.g. full stop, question mark PUNCT
cm Comma punctuation ,  PUNCT
lquot Left quote « PUNCT
rquot Right quote » PUNCT
lpar Left parenthesis (  PUNCT
rpar Right parenthesis )  PUNCT
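
The "Universal POS" column above can be read as a lookup table. A minimal Python sketch follows, assuming only a subset of the rows with a single unambiguous mapping. The names `APERTIUM_TO_UPOS` and `to_upos` are hypothetical, and the fallback to `X` (the Universal POS tag for "other") is an assumption, not something the table prescribes.

```python
# Illustrative subset of the table above; not an official Apertium resource.
# Rows with more than one candidate (e.g. vbser, cop, cnjadv) are left out on purpose.
APERTIUM_TO_UPOS = {
    "n": "NOUN",
    "vblex": "VERB",
    "vaux": "AUX",
    "adj": "ADJ",
    "adv": "ADV",
    "det": "DET",
    "prn": "PRON",
    "pr": "ADP",
    "post": "ADP",
    "num": "NUM",
    "np": "PROPN",
    "ij": "INTJ",
    "cnjcoo": "CCONJ",
    "cnjsub": "SCONJ",
    "sent": "PUNCT",
    "cm": "PUNCT",
}


def to_upos(tag, default="X"):
    """Return the Universal POS for an Apertium POS tag, or `default` if unlisted."""
    return APERTIUM_TO_UPOS.get(tag, default)


print(to_upos("np"))
print(to_upos("pr"))
```

Ambiguous rows would need context (or a per-language decision) rather than a flat dictionary.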

Part-of-speech Sub-categories

Gender

These tags are usually used with nouns, and things that agree/concord with nouns (like adjectives and verbs).

Symbol Gloss Notes Universal features
f Feminine Gender=Fem
m Masculine Gender=Masc
nt Neuter Gender=Neut
ma Masculine (animate) Mostly in Slavic languages Gender=Masc
mi Masculine (inanimate) Mostly in Slavic languages Gender=Masc
mp Masculine (personal) in Polish Gender=Masc
mn Masculine or neuter  Gender=Masc,Neut
fn Feminine or neuter Gender=Fem,Neut
mf Masculine or feminine This is used where the gender can be either masculine or feminine Gender=Masc,Fem
mfn Masculine, feminine or neuter This is used where the gender can be masculine, feminine or neuter Gender=Masc,Fem,Neut
ut Common From utrum, found in Scandinavian languages. Gender=Com
un Common or neuter As above, only common or neuter Gender=Com,Neut
GD Gender to be determined

Count/Mass

These tags are usually used with nouns, and things that agree/concord with nouns (like adjectives and verbs).

Symbol Gloss Notes Universal feature
cnt Countable
unc Uncountable (mass)

Animacy

These tags are usually used with nouns, and things that agree/concord with nouns (like adjectives and verbs).

Symbol Gloss Notes Universal feature
aa Animate
an Animate or inanimate
nn Inanimate

Adjectives

Symbol Gloss Notes Universal feature
sint Synthetic "nice, nicer, nicest" is synthetic. "handsome, more handsome, the most handsome" is not. wikipedia
preadj Pre-adjective for languages where most adjectives follow the noun (e.g. French in the eo->fr bidix)
preadj_nh Pre-adjective if not human depending on the noun, the adjective comes before or after it

Pronoun types

Symbol Gloss Notes Universal feature
pers Personal  PronType=Prs
tn Tónico
detnt Neuter determiner POS?  DET
predet Pre determiner POS?  DET
atn Atónico
qnt Quantifier  PronType=Ind
ord Ordinal  NumType=Ord
obj Object
subj Subject
pro Proclitic
enc Enclitic
acr Acronym Not a pronoun?  Abbr=Yes
rel Relative  PronType=Rel
ind Indefinite  PronType=Ind
itg Interrogative  PronType=Int
dem Demonstrative PronType=Dem
def Definite
pos Possessive Poss=Yes
ref Reflexive Reflex=Yes
prx Proximate
dst Distal

Transitivity

Used for verbs.

Symbol Gloss Notes Universal feature
tv Transitive takes direct object in accusative case (used in Turkic)
iv Intransitive does not take direct object in accusative case (used in Turkic)
TD Transitivity to be determined if the sub-category is [currently] unknown

Inflectional morphology

Number

Note: number can be a sub-category tag too, e.g. with pronouns.

Symbol Gloss Notes Universal feature
sg Singular Number=Sing
pl Plural Number=Plur
sp Singular or plural Number=Sing,Plur
du Dual  Number=Dual
ct Count see mk-bg Number=Count
coll Collective Number=Coll
ND Number to be determined


Case

Symbol Gloss Notes Universal feature
nom Nominative  Case=Nom
acc Accusative  Case=Acc
dat Dative Case=Dat
gen Genitive  Case=Gen
dg Dative and Genitive in ro-es, discouraged in new developments Case=Dat,Gen
voc Vocative  Case=Voc
abl Ablative wikipedia Case=Abl
ins Instrumental or Instructive wikipedia Case=Ins
loc Locative wikipedia Case=Loc
prp Prepositional wikipedia
tra Translative Case=Tra
ill Illative  Case=Ill
ine Inessive Case=Ine
ade Adessive Case=Ade
all Allative Case=All
abe Abessive  Case=Abe
ess Essive Case=Ess
par Partitive  Case=Par
dis Distributive Case=Dis
com Comitative  Case=Com
soc Sociative
prl Prolative  Case=Pro
ses Superessive Hungarian Case=Sup
sub Sublative Hungarian Case=Sub
dela Delative Hungarian Case=Del
term Terminative Hungarian, Estonian, ...

Voice

Symbol Gloss Notes Universal feature
actv Active voice  Voice=Act
pass Passive voice used more in Turkic. Voice=Pass
pasv Passive voice used more in Germanic. Voice=Pass
midv Middle voice  Voice=Mid
nactv Non-active voice See Albanian.
caus Causative voice see also #Derivations Voice=Cau

Tense and mode

Symbol Gloss Notes Universal features
pres Present  Tense=Pres
pret Preterite Preterite Tense=Past
past Past Tense=Past
imp Imperative englishlanguageguide Mood=Imp
inf Infinitive wikipedia  VerbForm=Inf
aor Aorist wikipedia A tense in Turkic languages.  Tense=Past
pp Past participle wikipedia VerbForm=Part
pp2 Past participle (???) It's at least used in the Esperanto dictionaries for future active participles, ont (seems quite odd)
pp3 Past participle (???) It's at least used in the Esperanto dictionaries for past active participles, int (seems quite odd)
pprs Present participle Also appears as ppres (deprecated) VerbForm=Part
ger Gerund wikipedia VerbForm=Ger
supn Supine wikipedia VerbForm=Sup
pri Present indicative see also: pres. wikipedia Tense=Pres Mood=Ind
pii Imperfect from Pretérito imperfecto de indicativo wikipedia  Tense=Past Mood=Ind
fti Future indicative  Tense=Fut Mood=Ind
fts Future subjunctive Tense=Fut Mood=Sub
cni Conditional Many pairs will probably use cnd or cond... Mood=Cnd
plu Pluperfect In cy-en Tense=Pqp
pmp Pluperfect In es-gl (from Pluscuamperfecto) Tense=Pqp
prs Present subjunctive wikipedia Tense=Pres Mood=Sub
pis Imperfect subjunctive  Tense=Past Mood=Sub
ifi Past definite from Pretérito perfecto o indefinido  Tense=Past Definite=Def
aff Affirmative wikipedia Polarity=Pos
itg Interrogative
neg Negative  Polarity=Neg
lp L-participle

Person

Note: person can be a sub-category tag, e.g. with pronouns.

Symbol Gloss Notes Universal feature
p1 First person Person=1
p2 Second person  Person=2
p3 Third person  Person=3
impers Impersonal Sometimes called 'autonomous'  Person=0
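
The "Universal feature" columns in the Number, Case and Person tables can be combined into a single feature string. The sketch below is illustrative only: it covers a handful of rows, the names `FEATURES` and `to_ud_feats` are hypothetical, and the output convention (features sorted by name, joined with `|`, `_` when empty) is the one used in CoNLL-U files, which this page itself does not mention.

```python
# Subset of the Number, Case and Person rows above: tag -> (feature name, value).
FEATURES = {
    "sg": ("Number", "Sing"),
    "pl": ("Number", "Plur"),
    "du": ("Number", "Dual"),
    "nom": ("Case", "Nom"),
    "acc": ("Case", "Acc"),
    "dat": ("Case", "Dat"),
    "gen": ("Case", "Gen"),
    "p1": ("Person", "1"),
    "p2": ("Person", "2"),
    "p3": ("Person", "3"),
}


def to_ud_feats(tags):
    """Map a sequence of Apertium tags to a CoNLL-U style feature string.

    Tags without an entry in FEATURES (e.g. part-of-speech tags) are skipped.
    """
    pairs = sorted(FEATURES[t] for t in tags if t in FEATURES)
    return "|".join(f"{name}={value}" for name, value in pairs) or "_"


print(to_ud_feats(["n", "pl", "acc"]))
print(to_ud_feats(["vblex", "pri", "p3", "sg"]))
```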

Derivations

Symbol Gloss Notes
caus Causative
ingr Ingressive https://nn.wikipedia.org/w/index.php?title=Ingressiv

Possession

Symbol Gloss Notes Universal feature
px1sg First person singular possessive e.g. in Turkic languages  Person[psor]=1 Number[psor]=Sing
px2sg Second person singular possessive e.g. in Turkic languages  Person[psor]=2 Number[psor]=Sing
px3sg Third person singular possessive e.g. in Turkic languages  Person[psor]=3 Number[psor]=Sing
px1pl First person plural possessive e.g. in Turkic languages  Person[psor]=1 Number[psor]=Plur
px2pl Second person plural possessive e.g. in Turkic languages  Person[psor]=2 Number[psor]=Plur
px3pl Third person plural possessive e.g. in Turkic languages  Person[psor]=3 Number[psor]=Plur
px3sp Third person possessive singular or plural e.g. in Turkic languages  Person[psor]=3
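
The possessive tags above follow a regular pattern: `px`, then the person, then the number. A short Python sketch of that decomposition is given below. The function name `possessor_feats` is hypothetical, and restricting `sp` to the third person is an assumption based on `px3sp` being the only such row in the table.

```python
import re

# px + person (1, 2 or 3) + number (sg, pl or sp), as in the table above.
PX_TAG = re.compile(r"px([123])(sg|pl|sp)")
NUMBER = {"sg": "Sing", "pl": "Plur"}


def possessor_feats(tag):
    """Return the possessor features for a px* tag, or None if the tag is not one."""
    match = PX_TAG.fullmatch(tag)
    if match is None:
        return None
    person, number = match.groups()
    if number == "sp" and person != "3":
        return None  # only px3sp is listed in the table
    feats = {"Person[psor]": person}
    if number in NUMBER:
        # px3sp leaves the possessor's number unspecified, as in the table.
        feats["Number[psor]"] = NUMBER[number]
    return feats


print(possessor_feats("px1sg"))
print(possessor_feats("px3sp"))
```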

Object marking

e.g. in verbs with both

Symbol Gloss Notes Universal features
o_sg1 First person singular object
o_sg2 Second person singular object
o_sg3 Third person singular object
o_pl1 First person plural object
o_pl2 Second person plural object
o_pl3 Third person plural object

Proper nouns

Symbol Gloss Notes Universal features
ant Anthroponym wikipedia, it's very common to use ant together with f and m for traditionally gender-specific names
top Toponym In some language pairs without the locative case this may be loc, although this should be changed. wikipedia
hyd Hydronym wikipedia
cog Cognomen In normal use, surnames
org Organisation
al Altres Other, misc.

Adjectives

Symbol Gloss Notes Universal features
pst Positive  Degree=Pos
comp Comparative wikipedia Degree=Comp
sup Superlative wikipedia Degree=Sup
attr Attributive wikipedia
pred Predicative wikipedia


Others

Symbol Gloss Notes
web Links and Emails

See also

Chunk tags

Tag Description
<SN> Noun phrase / noun group (sintagma nominal)
<SA> Adjective phrase / adjective group
<SV> Verb phrase / verb group (sintagma verbal)

XML tags

Note: All XML tags are explained in depth in the PDF documentation, see also the dix.dtd and dix.rng files in the GitHub repository.

XML tag Means Appears in XML tags / notes / examples
<dictionary> Mono- or bilingual dictionary In files apertium-eo-en.en.dix, apertium-eo-en.eo-en.dix, apertium-eo-en.post-en.dix, apertium-eo-en.post-eo.dix
<alphabet> Set of characters in the language In <dictionary>
<sdefs> Symbol definitions In <dictionary>
<sdef> Symbol definition In <sdefs>. Ex: <sdef n="noun"/>
<pardefs> Paradigm definitions In <dictionary>.
<pardef> Paradigm definition In <pardefs>.
<section> A section of the dictionary In <dictionary>. Ex: <section id="main" type="standard">
<e> A dictionary entry (a word) In <section> and in <pardef>.
<i> Invariant (left and right side) In <e>. Ex.: <i>beer</i>
<p> A pair In <e>.
<l> Left side (surface form) In <p>. Ex.: <l>beer</l>
<r> Right side (lexical unit) In <p>. Ex.: <r>beer<s n="noun"/><s n="singular"/></r>
<s> A lexical symbol (noun, adj..) In <r>, <l> and <i>. Ex.: <s n="noun"/>
<a> Post-generator wake-up mark In <r>, <l> and <i>. Ex.: <l><a/>a<s ... (for the a/an rule in English)
<b> Blank space In <r>, <l> and <i>. Ex.: <l>you're<b/>welcome<s ...
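
To show how the elements in the table nest, here is a minimal Python sketch that builds one `<e>` entry with a `<p>` pair, reusing the "beer" example from the rows above. It uses the standard library's `xml.etree.ElementTree`. The helper name `make_entry` is hypothetical, and the symbol names `noun` and `singular` are simply copied from the table's examples, so real dictionaries will define their own in `<sdefs>`.

```python
import xml.etree.ElementTree as ET


def make_entry(surface, lemma, symbols):
    """Build <e><p><l>surface</l><r>lemma<s n="..."/>...</r></p></e>."""
    e = ET.Element("e")
    p = ET.SubElement(e, "p")
    ET.SubElement(p, "l").text = surface  # left side: surface form
    r = ET.SubElement(p, "r")             # right side: lexical unit
    r.text = lemma
    for name in symbols:
        ET.SubElement(r, "s", n=name)     # one <s> per lexical symbol
    return e


entry = make_entry("beer", "beer", ["noun", "singular"])
print(ET.tostring(entry, encoding="unicode"))
```

A complete dictionary would also need the enclosing `<dictionary>`, `<sdefs>` and `<section>` elements listed in the table; those are omitted here for brevity.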

TODO: Probably there are more. --Jacob Nordfalk 14:47, 25 August 2008 (UTC)

Other tags:

<j/> (in stream format #) is to mark multiwords

<t/> and <v/> are only in crossdix
t = template, v = variable
t matches any single tag, v is like + in regexes (0 or more)

<sa/> and <prm/> are only used in metadixes.
'sa' lets you add an optional extra tag; 'prm' is an extra string for the paradigm

Transfer

<clip> tag

See the documentation (pdf), p.144 for more information.

XML attribute value Means Appears in attribute Notes
whole lemma and grammatical symbols part
lem lemma part
lemh (inflected) head word of multiword part
lemq following queue of multiword part
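
As a rough illustration of what the four `part` values select, the Python sketch below splits a lexical unit written as lemma followed by tags, with `#` separating the head word from the queue of a multiword. It is a simplification under stated assumptions: the function `clip_parts` is hypothetical, escaped characters are ignored, the input form `take# out<vblex><pres>` is assumed, and whether `lemq` includes the `#` itself is an assumption that should be checked against the documentation.

```python
def clip_parts(lexical_unit):
    """Split a lexical unit such as 'take# out<vblex><pres>' into clip parts.

    Simplified sketch: no handling of escaped '<' or '#'.
    """
    # Everything before the first '<' is the lemma; the rest is the tags.
    lemma, _, _ = lexical_unit.partition("<")
    # Within the lemma, '#' starts the queue of a multiword.
    head, marker, queue = lemma.partition("#")
    return {
        "whole": lexical_unit,    # lemma and grammatical symbols
        "lem": lemma,             # lemma
        "lemh": head,             # head word of the multiword
        "lemq": marker + queue,   # following queue (assumed to keep the '#')
    }


parts = clip_parts("take# out<vblex><pres>")
for name, value in parts.items():
    print(name, repr(value))
```

For a lexical unit without a queue, `lemh` equals `lem` and `lemq` is empty in this sketch.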

See also