Difference between revisions of "Syntactic labels"

From Apertium
(Link to French page)
 
(24 intermediate revisions by 3 users not shown)
Line 1: Line 1:
  +
[[Étiquettes syntaxiques|En français]]
In some language pairs, shallow syntax tags are used to improve disambiguation, or allow tighter rules to be written. For example, disambiguating verb phrase co-ordinators from noun phrase co-ordinators lets you write rules to merge two co-ordinated NPs.
 
  +
  +
{{TOCD}}
 
In some language pairs, syntactic function labels are used to improve disambiguation, or allow tighter transfer rules to be written. For example, disambiguating verb phrase co-ordinators from noun phrase co-ordinators lets you write transfer rules to merge two co-ordinated NPs.
  +
  +
Apertium processes [[LRLM|left-to-right longest match]], so if we have the following sentence,
  +
  +
* John kicked the ball and Mary caught it.
  +
  +
And we have a rule for {{sc|noun cc noun}}, then we will get the following analysis,
  +
  +
* John kicked the [ball] and [Mary]
  +
  +
But if we can tag the conjunction as being a global conjunction, then we can avoid lumping the subject of the second sentence with the object of the first sentence.
  +
  +
* [John kicked the ball] and [Mary caught it]
  +
  +
==Example==
  +
  +
<pre>
  +
$ echo "Gud talaði øll hesi orð og segði Hann:" | lt-proc fo-is.automorf.bin | cg-proc fo-is.rlx.bin
  +
^Gud/Gud<np><al><m><sg><acc><@OBJ→>/Gud<np><al><m><sg><nom><@SUBJ→>$
  +
^talaði/tala<vblex><past><p2><sg><@+FMAINV>/tala<vblex><past><p3><sg><@+FMAINV>$
  +
^øll/allur<prn><qnt><nt><pl><acc><@←OBJ>$
  +
^hesi/hesin<prn><dem><nt><pl><acc><@←OBJ>$
  +
^orð/orð<n><nt><sg><acc><ind><@←OBJ>/orð<n><nt><pl><acc><ind><@←OBJ>$
  +
^og/og<cnjcoo><@CNP>/og<cnjsub><@CVP>$
  +
^segði/siga<vblex><past><p3><sg><@+FMAINV>$
  +
^Hann/Prnpers<prn><p3><m><sg><nom><@←SUBJ>$^:/:<sent>$
  +
</pre>
  +
  +
Here we could, for example, write a rule that moves a subject standing to the right of its finite main verb to the left of it, e.g. <code>@+FMAINV @←SUBJ</code> becomes <code>@SUBJ→ @+FMAINV</code>, which is the order in English.
   
 
==Standard syntax tags==
 
==Standard syntax tags==
   
These are the uniform tags used in many Giellatekno projects.
+
These are the uniform tags used in many [http://giellatekno.uit.no/english.html Giellatekno] projects. It isn't necessary to implement all of the analysis, even implementing part of it can prove useful in writing transfer or lexical selection rules.
  +
  +
===Direction===
  +
  +
* If the tag indicates that the current word serves that function, then:
  +
** If the word on which the current word depends is to the left, then the arrow is on the left of the tag,
  +
*** e.g. <code>@←SUBJ</code>: A subject which depends on a verb to the left.
  +
** If the word on which the current word depends is to the right, then the arrow is on the right of the tag,
  +
*** e.g. <code>@OBJ→</code>: A direct object which depends on a verb to the right.
  +
* If the tag indicates that the current word depends on a word of that function, then:
  +
** If the word on which it depends is to the right, the arrow is on the left of the tag:
  +
*** e.g. <code>@→N</code>: A noun modifier with its head to the right
  +
** If the word on which it depends is to the left, the arrow is on the right of the tag:
  +
*** e.g. <code>@P←</code>: A word depending on a preposition to the left.
  +
  +
A full example:
  +
<pre>
  +
We<@SUBJ→> sang<@+FMAINV> songs<@←OBJ> about<@←ADVL> a<@→N> submarine<@P←>
  +
</pre>
  +
  +
===Table===
   
 
{|class=wikitable
 
{|class=wikitable
! Tag !! Description
+
! Tag !! Description !! Examples
 
|-
  +
| <code>@←SUBJ</code> || Subject, head verb to the left || Í upphafi skapaði '''Guð''' himinn og jörð.
 
|-
 
|-
| <code>@←SUBJ</code> || Subject, head verb to the left
+
| <code>@SUBJ→</code> || Subject, head verb to the right || '''Ég''' tala við hann.
 
|-
 
|-
| <code>@SUBJ→</code> || Subject, head verb to the right
+
| <code>@←OBJ</code> || Direct object, head verb to the left || Ég sendi þér '''bréfið'''.
 
|-
 
|-
| <code>@←OBJ</code> || Direct object, head verb to the left
+
| <code>@OBJ→</code> || Direct object, head verb to the right ||
 
|-
 
|-
| <code>@OBJ→</code> || Direct object, head verb to the right
+
| <code>@←IOBJ</code> || Indirect object, head verb to the left || Ég sendi '''þér''' bréfið.
 
|-
 
|-
| <code>@←IOBJ</code> || Indirect object, head verb to the left
+
| <code>@IOBJ→</code> || Indirect object, head verb to the right ||
 
|-
 
|-
| <code>@IOBJ→</code> || Indirect object, head verb to the right
+
| <code>@→N</code> || Noun modifier, head noun to the right || Um '''1.2 milljónir''' manna eru heimilislausar.
 
|-
 
|-
| <code>@→N</code> || Noun modifier, head noun to the right
+
| <code>@N←</code> || Noun modifier, head noun to the left || Samskipti landanna '''tveggja''' eru góð.
 
|-
 
|-
| <code>@N←</code> || Noun modifier, head noun to the left
+
| <code>@→A</code> || Adjective modifier, head noun to the right ||
 
|-
 
|-
| <code>@→A</code> || Adjective modifier, head noun to the right
+
| <code>@A←</code> || Adjective modifier, head noun to the left ||
 
|-
 
|-
| <code>@A←</code> || Adjective modifier, head noun to the left
+
| <code>@IM</code> || ||
 
|-
 
|-
| <code>@IM</code> ||
+
| <code>@SPRED</code> || Subject predicate, ||
 
|-
 
|-
| <code>@SPRED</code> || Subject predicate
+
| <code>@←SPRED</code> || Subject predicate, || She is my '''sister'''
 
|-
 
|-
| <code>@←SPRED</code> || Subject predicate, head verb
+
| <code>@SPRED→</code> || Subject predicate, || '''Blár''' er himinninn.
 
|-
 
|-
| <code>@SPRED→</code> ||
+
| <code>@OPRED</code> || Object predicate, ||
 
|-
 
|-
| <code>@OPRED</code> ||
+
| <code>@←OPRED</code> || Object predicate, || I will make you my personal '''slave'''
 
|-
 
|-
| <code>@←OPRED</code> ||
+
| <code>@OPRED→</code> || Object predicate, ||
 
|-
 
|-
| <code>@OPRED→</code> ||
+
| <code>@+FAUXV</code> || Finite auxiliary verb ||
 
|-
 
|-
| <code>@+FAUXV</code> || Finite auxiliary verb
+
| <code>@-FAUXV</code> || Non-finite auxiliary verb ||
 
|-
 
|-
| <code>@-FAUXV</code> || Non-finite auxiliary verb
+
| <code>@+FMAINV</code> || Finite main verb ||
 
|-
 
|-
| <code>@+FMAINV</code> || Finite main verb
+
| <code>@-FMAINV</code> || Non-finite main verb ||
 
|-
 
|-
| <code>@-FMAINV</code> || Non-finite main verb
 
 
|-
 
|-
  +
| <code>@-FSUBJ→</code> || Subject of a non-finite verb ||
 
|-
 
|-
| <code>@-FSUBJ→</code> ||
+
| <code>@-F←OBJ</code> || Object of a non-finite verb ||
 
|-
 
|-
| <code>@-F←OBJ</code> ||
+
| <code>@-FOBJ→</code> || Object of a non-finite verb ||
 
|-
 
|-
| <code>@-FOBJ→</code> ||
+
| <code>@SPRED←OBJ</code> || ||
 
|-
  +
| <code>@-FADVL</code> || ||
 
|-
 
|-
| <code>@SPRED←OBJ</code> ||
 
 
|-
 
|-
| <code>@-FADVL</code> ||
+
| <code>@←ADVL</code> || Adverbial modifier, head to the left ||
 
|-
 
|-
  +
| <code>@ADVL→</code> || Adverbial modifier, head to the right ||
 
|-
 
|-
| <code>@←ADVL</code> || Adverbial modifier, head to the left
+
| <code>@ADVL</code> || Adverbial modifier ||
 
|-
 
|-
| <code>@ADVL→</code> || Adverbial modifier, head to the right
+
| <code>@P←</code> || Complement of a preposition ||
 
|-
 
|-
| <code>@ADVL</code> || Adverbial modifier
+
| <code>@CNP</code> || Local conjunction or subjunction ||
 
|-
 
|-
| <code>@P←</code> || Complement of a preposition
+
| <code>@CVP</code> || Conjunction or subjunction that joins finite-verb phrases ||
 
|-
 
|-
| <code>@CNP</code> || Local conjunction or subjunction
+
| <code>@→CS</code> || ||
 
|-
 
|-
| <code>@CVP</code> || Conjunction or subjunction that joins finite-verb phrases
+
| <code>@CNP-VP</code> || Ambiguous co-ordinator ||
 
|-
 
|-
| <code>@→CS</code> ||
+
| <code>@APP</code> || Apposition ||
 
|-
 
|-
 
|-
| <code>@CNP-VP</code> || Ambiguous co-ordinator
 
 
| <code>@ICL-ADVL</code> || Non-finite subclause ... ||
 
|-
  +
| <code>@ICL-AUX←</code> || "right" argument of auxiliary (?) ||
 
|-
 
|-
| <code>@APP</code> ||
+
| <code>@ICL-OBJ</code> || Non-finite subclause ... ||
 
|-
 
|-
 
| <code>@ICL-STA</code> || Non-finite subclause ... ||
 
|-
 
|-
| <code>@IMV</code> || Infinite main verb
+
| <code>@HNOUN</code> || Noun phrase fragment ||
 
|-
 
|-
| <code>@ICL-ADVL</code> || Non-finite subclause ...
 
 
|-
 
|-
| <code>@ICL-AUX←</code> || "right" argument of auxiliary (?)
+
| <code>@X</code> || No analysis ||
|-
 
| <code>@ICL-OBJ</code> || Non-finite subclause ...
 
|-
 
| <code>@ICL-STA</code> || Non-finite subclause ...
 
|-
 
| <code>@HNOUN</code> || Noun phrase fragment
 
|-
 
|-
 
| <code>@X</code> || No analysis
 
 
|-
 
|-
 
|}
 
|}
   
===External links===
+
==See also==
  +
* [[List of symbols]] (Morphology/POS tags)
* [http://giellatekno.uit.no/doc/lang/sme/docu-sme-syntaxtags.html Syntax tags used in Sámi] at giellatekno.uit.no
 
  +
  +
==External links==
 
* [http://giellatekno.uit.no/doc/lang/common/docu-sme-syntaxtags.html Syntax tags used in Sámi] at giellatekno.uit.no
   
 
[[Category:Documentation]]
 
[[Category:Documentation]]
  +
[[Category:Documentation in English]]

Latest revision as of 08:16, 8 October 2014

En français

In some language pairs, syntactic function labels are used to improve disambiguation, or allow tighter transfer rules to be written. For example, disambiguating verb phrase co-ordinators from noun phrase co-ordinators lets you write transfer rules to merge two co-ordinated NPs.

Apertium applies rules left to right, preferring the longest match (LRLM). Suppose we have the following sentence:

  • John kicked the ball and Mary caught it.

If we also have a rule for the pattern noun cc noun, the rule matches across the clause boundary and we get the following analysis:

  • John kicked the [ball] and [Mary]

But if we can tag the conjunction as a global conjunction (one that joins clauses rather than noun phrases), we avoid lumping the subject of the second clause together with the object of the first:

  • [John kicked the ball] and [Mary caught it]
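The effect of the label can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the token tuples, the hand-assigned tags and the function `match_noun_cc_noun` are hypothetical, and the real matching is done by Apertium's transfer modules, not by code like this. The sketch scans left to right for a single noun cc noun pattern and optionally requires a syntactic label on the conjunction.

```python
# Each token is (surface form, part of speech, syntactic label or None).
# The tags are hand-assigned for illustration, not real analyser output.
SENTENCE = [
    ("John", "np", None),
    ("kicked", "vblex", None),
    ("the", "det", None),
    ("ball", "n", None),
    ("and", "cnjcoo", "@CVP"),
    ("Mary", "np", None),
    ("caught", "vblex", None),
    ("it", "prn", None),
]

NOMINAL = {"n", "np"}


def match_noun_cc_noun(tokens, require_label=None):
    """Return index triples where 'noun cc noun' matches, scanning left to right.

    If require_label is given, the conjunction must carry that label.
    """
    matches = []
    i = 0
    while i + 2 < len(tokens):
        left, conj, right = tokens[i], tokens[i + 1], tokens[i + 2]
        ok = (
            left[1] in NOMINAL
            and conj[1] == "cnjcoo"
            and right[1] in NOMINAL
            and (require_label is None or conj[2] == require_label)
        )
        if ok:
            matches.append((i, i + 1, i + 2))
            i += 3  # tokens consumed by a match are not matched again
        else:
            i += 1
    return matches
```

Without a label requirement the pattern matches "ball and Mary"; requiring `@CNP` on the conjunction rejects that match, because the conjunction here is tagged `@CVP`.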

Example

$ echo "Gud talaði øll hesi orð og segði Hann:" | lt-proc fo-is.automorf.bin | cg-proc fo-is.rlx.bin 
  ^Gud/Gud<np><al><m><sg><acc><@OBJ→>/Gud<np><al><m><sg><nom><@SUBJ→>$ 
  ^talaði/tala<vblex><past><p2><sg><@+FMAINV>/tala<vblex><past><p3><sg><@+FMAINV>$ 
  ^øll/allur<prn><qnt><nt><pl><acc><@←OBJ>$ 
  ^hesi/hesin<prn><dem><nt><pl><acc><@←OBJ>$ 
  ^orð/orð<n><nt><sg><acc><ind><@←OBJ>/orð<n><nt><pl><acc><ind><@←OBJ>$ 
  ^og/og<cnjcoo><@CNP>/og<cnjsub><@CVP>$ 
  ^segði/siga<vblex><past><p3><sg><@+FMAINV>$ 
  ^Hann/Prnpers<prn><p3><m><sg><nom><@←SUBJ>$^:/:<sent>$

Here we could, for example, write a rule that moves a subject standing to the right of its finite main verb to the left of it, e.g. @+FMAINV @←SUBJ becomes @SUBJ→ @+FMAINV, which is the order in English.
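As a rough sketch of such a reordering, the Python below reads lexical units of the form `^surface/analysis$`, picks out the syntactic label, and swaps an adjacent finite main verb and right-hand subject. Real Apertium transfer rules are written in the transfer rule format, not Python; the helper names `parse_units` and `front_subjects` are hypothetical, and the sketch assumes the input is already disambiguated, since only the first analysis of each unit is read.

```python
import re

# One lexical unit: ^surface/analysis$ ; any further analyses are skipped.
UNIT = re.compile(r"\^([^/$]+)/([^/$]+)[^$]*\$")
# A syntactic label is a tag whose name starts with @.
LABEL = re.compile(r"<(@[^>]+)>")


def parse_units(stream):
    """Return a list of (surface, analysis, label); label is None if absent."""
    units = []
    for surface, analysis in UNIT.findall(stream):
        found = LABEL.search(analysis)
        units.append((surface, analysis, found.group(1) if found else None))
    return units


def front_subjects(units):
    """Swap each adjacent (@+FMAINV, @←SUBJ) pair and relabel the subject."""
    out = list(units)
    i = 0
    while i + 1 < len(out):
        verb, subj = out[i], out[i + 1]
        if verb[2] == "@+FMAINV" and subj[2] == "@←SUBJ":
            surface, analysis, _ = subj
            # The subject now precedes its verb, so its head is to the right.
            moved = (surface, analysis.replace("<@←SUBJ>", "<@SUBJ→>"), "@SUBJ→")
            out[i], out[i + 1] = moved, verb
            i += 2
        else:
            i += 1
    return out


STREAM = (
    "^segði/siga<vblex><past><p3><sg><@+FMAINV>$ "
    "^Hann/Prnpers<prn><p3><m><sg><nom><@←SUBJ>$^:/:<sent>$"
)
REORDERED = front_subjects(parse_units(STREAM))
```

Applied to the last three units of the output above, the subject "Hann" ends up in front of "segði" and carries the label `@SUBJ→`.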

Standard syntax tags

These are the uniform tags used in many Giellatekno projects. It isn't necessary to implement all of the analysis; even implementing part of it can prove useful in writing transfer or lexical selection rules.

Direction

  • If the tag indicates that the current word serves that function, then:
    • If the word on which the current word depends is to the left, then the arrow is on the left of the tag,
      • e.g. @←SUBJ: A subject which depends on a verb to the left.
    • If the word on which the current word depends is to the right, then the arrow is on the right of the tag,
      • e.g. @OBJ→: A direct object which depends on a verb to the right.
  • If the tag indicates that the current word depends on a word of that function, then:
    • If the word on which it depends is to the right, the arrow is on the left of the tag:
      • e.g. @→N: A noun modifier with its head to the right.
    • If the word on which it depends is to the left, the arrow is on the right of the tag:
      • e.g. @P←: A word depending on a preposition to the left.
  • In every case the arrow points towards the word on which the current word depends.

A full example:

We<@SUBJ→> sang<@+FMAINV> songs<@←OBJ> about<@←ADVL> a<@→N> submarine<@P←>
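Since the arrow always points towards the head, the direction can be read straight off the label. The following Python is a minimal sketch of that reading; the function names `head_direction` and `parse_annotated` are made up for this illustration, and the `word<@LABEL>` input format is just the notation of the example above, not the Apertium stream format.

```python
LEFT, RIGHT = "←", "→"


def head_direction(label):
    """Return 'left', 'right' or None for a label such as '@←SUBJ' or '@P←'."""
    if LEFT in label:
        return "left"
    if RIGHT in label:
        return "right"
    return None  # no arrow, e.g. @+FMAINV


def parse_annotated(text):
    """Split 'word<@LABEL>' tokens into (word, label, head direction) triples."""
    triples = []
    for token in text.split():
        word, _, rest = token.partition("<")
        label = rest.rstrip(">")
        triples.append((word, label, head_direction(label)))
    return triples


EXAMPLE = "We<@SUBJ→> sang<@+FMAINV> songs<@←OBJ> about<@←ADVL> a<@→N> submarine<@P←>"
TRIPLES = parse_annotated(EXAMPLE)
```

For the example sentence, "We" and "a" look to the right for their heads, "songs", "about" and "submarine" look to the left, and the finite main verb "sang" has no arrow.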

Table

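{|class=wikitable
! Tag !! Description !! Examples
|-
| <code>@←SUBJ</code> || Subject, head verb to the left || Í upphafi skapaði '''Guð''' himinn og jörð.
|-
| <code>@SUBJ→</code> || Subject, head verb to the right || '''Ég''' tala við hann.
|-
| <code>@←OBJ</code> || Direct object, head verb to the left || Ég sendi þér '''bréfið'''.
|-
| <code>@OBJ→</code> || Direct object, head verb to the right ||
|-
| <code>@←IOBJ</code> || Indirect object, head verb to the left || Ég sendi '''þér''' bréfið.
|-
| <code>@IOBJ→</code> || Indirect object, head verb to the right ||
|-
| <code>@→N</code> || Noun modifier, head noun to the right || Um '''1.2 milljónir''' manna eru heimilislausar.
|-
| <code>@N←</code> || Noun modifier, head noun to the left || Samskipti landanna '''tveggja''' eru góð.
|-
| <code>@→A</code> || Adjective modifier, head adjective to the right ||
|-
| <code>@A←</code> || Adjective modifier, head adjective to the left ||
|-
| <code>@IM</code> || ||
|-
| <code>@SPRED</code> || Subject predicate ||
|-
| <code>@←SPRED</code> || Subject predicate, head verb to the left || She is my '''sister'''
|-
| <code>@SPRED→</code> || Subject predicate, head verb to the right || '''Blár''' er himinninn.
|-
| <code>@OPRED</code> || Object predicate ||
|-
| <code>@←OPRED</code> || Object predicate, head verb to the left || I will make you my personal '''slave'''
|-
| <code>@OPRED→</code> || Object predicate, head verb to the right ||
|-
| <code>@+FAUXV</code> || Finite auxiliary verb ||
|-
| <code>@-FAUXV</code> || Non-finite auxiliary verb ||
|-
| <code>@+FMAINV</code> || Finite main verb ||
|-
| <code>@-FMAINV</code> || Non-finite main verb ||
|-
| <code>@-FSUBJ→</code> || Subject of a non-finite verb ||
|-
| <code>@-F←OBJ</code> || Object of a non-finite verb ||
|-
| <code>@-FOBJ→</code> || Object of a non-finite verb ||
|-
| <code>@SPRED←OBJ</code> || ||
|-
| <code>@-FADVL</code> || ||
|-
| <code>@←ADVL</code> || Adverbial modifier, head to the left ||
|-
| <code>@ADVL→</code> || Adverbial modifier, head to the right ||
|-
| <code>@ADVL</code> || Adverbial modifier ||
|-
| <code>@P←</code> || Complement of a preposition ||
|-
| <code>@CNP</code> || Local conjunction or subjunction ||
|-
| <code>@CVP</code> || Conjunction or subjunction that joins finite-verb phrases ||
|-
| <code>@→CS</code> || ||
|-
| <code>@CNP-VP</code> || Ambiguous co-ordinator ||
|-
| <code>@APP</code> || Apposition ||
|-
| <code>@ICL-ADVL</code> || Non-finite subclause ... ||
|-
| <code>@ICL-AUX←</code> || "right" argument of auxiliary (?) ||
|-
| <code>@ICL-OBJ</code> || Non-finite subclause ... ||
|-
| <code>@ICL-STA</code> || Non-finite subclause ... ||
|-
| <code>@HNOUN</code> || Noun phrase fragment ||
|-
| <code>@X</code> || No analysis ||
|}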

See also

  • List of symbols (Morphology/POS tags)

External links

  • Syntax tags used in Sámi at giellatekno.uit.no