Difference between revisions of "Tagging guidelines for English"

From Apertium
Jump to navigation Jump to search
Line 1: Line 1:
 
{{TOCD}}
 
{{TOCD}}
   
=="this"==
+
==About tagging==
  +
  +
You can think of part-of-speech tagging a bit like answering a series of multiple-choice questions. The word is the question, and the possible analyses are the answers. Unknown words can be thought of as questions we don't know what the possible answers are yet. To "tag" the text, you need to answer all of the questions by deleting the "incorrect" answers.
  +
  +
==Guidelines==
  +
  +
==="this"===
   
 
The word "this" (along with its plural "these") can be either a determiner, modifying a noun phrase, or a pronoun, replacing a noun phrase.
 
The word "this" (along with its plural "these") can be either a determiner, modifying a noun phrase, or a pronoun, replacing a noun phrase.
Line 12: Line 18:
 
** '''These''' are the ones.
 
** '''These''' are the ones.
   
=="that"==
+
==="that"===
   
 
The word "that" can be either a determiner, which modifies a noun phrase, a demonstrative pronoun which substitutes a noun phrase, a subordinating conjunction or a relative pronoun.
 
The word "that" can be either a determiner, which modifies a noun phrase, a demonstrative pronoun which substitutes a noun phrase, a subordinating conjunction or a relative pronoun.
Line 32: Line 38:
 
* '''not ok''': I think ''which'' you like cats.
 
* '''not ok''': I think ''which'' you like cats.
   
=="no"==
+
==="no"===
   
 
The word "no" in English can be a determiner, modifying a noun phrase or an adverb (or interjection).
 
The word "no" in English can be a determiner, modifying a noun phrase or an adverb (or interjection).
Line 41: Line 47:
 
** '''No'''! Don't do that!
 
** '''No'''! Don't do that!
   
== Verbs with "-ing" ==
+
=== Verbs with "-ing" ===
   
 
The ending <code>-ing</code> in English can be a gerund (adverbial), a substantive (like a noun) or a present participle (like an adjective).
 
The ending <code>-ing</code> in English can be a gerund (adverbial), a substantive (like a noun) or a present participle (like an adjective).
Line 54: Line 60:
 
*** "He is '''singing''' → "He sings"
 
*** "He is '''singing''' → "He sings"
   
== Adverb or adjective ==
+
=== Adverb or adjective ===
   
 
A word like "first" can be either an adverb, or an ordinal adjective. An adverb modifies a verb phrase, an ordinal adjective modifies a noun phrase.
 
A word like "first" can be either an adverb, or an ordinal adjective. An adverb modifies a verb phrase, an ordinal adjective modifies a noun phrase.

Revision as of 16:07, 21 November 2013

About tagging

You can think of part-of-speech tagging a bit like answering a series of multiple-choice questions. The word is the question, and the possible analyses are the answers. Unknown words can be thought of as questions we don't know what the possible answers are yet. To "tag" the text, you need to answer all of the questions by deleting the "incorrect" answers.

Guidelines

"this"

The word "this" (along with its plural "these") can be either a determiner, modifying a noun phrase, or a pronoun, replacing a noun phrase.

  • det.dem
    • I don't like this cat.
    • I don't like these cats.
  • prn
    • This is the reason.
    • These are the ones.

"that"

The word "that" can be either a determiner, which modifies a noun phrase, a demonstrative pronoun which substitutes a noun phrase, a subordinating conjunction or a relative pronoun.

  • det.dem
    • I don't like that cat.
    • I don't like those cats.
  • prn
    • That is the reason.
    • Those are the ones.
  • rel
    • These are the ones that I like.
  • cnjsub
    • I think that you like cats.

Here is a tip for distinguishing rel and cnjsub. Try substituting the word "that" for the word "which" and see how it sounds. If it sounds ok, then your "that" is probably a relative pronoun, if it sounds bad, it's probably a conjunction.

  • ok: These are the ones which I like.
  • not ok: I think which you like cats.

"no"

The word "no" in English can be a determiner, modifying a noun phrase or an adverb (or interjection).

  • det.ind
    • There are no cats in my attic.
  • adv
    • No! Don't do that!

Verbs with "-ing"

The ending -ing in English can be a gerund (adverbial), a substantive (like a noun) or a present participle (like an adjective).

  • vblex.subs:
    • Roughly, when you can substitute it with a noun: "Flying is hard" → "Flight is hard"
  • vblex.pprs:
    • Roughly, when you can substitute it with a relative clause: "The flying circus" → "The circus that flies"
  • vblex.ger
    • When it follows to be in continuous tenses, or when it can be replaced by a prepositional phrase or a different verbal phrase:
      • "He came singing" → "He came with a song"
      • "He is singing → "He sings"

Adverb or adjective

A word like "first" can be either an adverb, or an ordinal adjective. An adverb modifies a verb phrase, an ordinal adjective modifies a noun phrase.

  • adj
    • This is my first computer.
  • adv
    • First I'm going to buy a computer.