Tagging guidelines for English

From Apertium

About tagging

You can think of part-of-speech tagging as a bit like answering a series of multiple-choice questions. The word is the question, and the possible analyses are the answers. Unknown words are questions whose possible answers we do not know yet. To "tag" the text, you answer every question by deleting the "incorrect" answers.
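
The multiple-choice picture can be sketched in a few lines of Python. This is only an illustration, assuming the analyses are written in the Apertium stream format (`^surface/reading/reading$`, with unknown words marked by `*`) and that the surface form contains no escaped slashes. The function names `parse_unit`, `is_unknown` and `choose` are hypothetical, and the readings listed for "both" are simplified for the example rather than taken from a particular dictionary.

```python
def parse_unit(unit):
    """Split '^surface/reading1/reading2$' into (surface, [readings])."""
    if not (unit.startswith("^") and unit.endswith("$")):
        raise ValueError(f"not a lexical unit: {unit!r}")
    surface, *readings = unit[1:-1].split("/")
    return surface, readings


def is_unknown(readings):
    """An unknown word has a single reading that starts with '*'."""
    return len(readings) == 1 and readings[0].startswith("*")


def choose(unit, tag):
    """Answer the 'question' by deleting every reading that lacks `tag`."""
    surface, readings = parse_unit(unit)
    kept = [r for r in readings if f"<{tag}>" in r]
    if not kept:
        raise ValueError(f"no reading of {surface!r} has tag {tag!r}")
    return "^" + "/".join([surface] + kept) + "$"


# Illustrative input: "both" in "Both children like playing in the garden."
ambiguous = "^both/both<cnjcoo>/both<det>/both<prn>$"
print(choose(ambiguous, "det"))  # prints ^both/both<det>$
```

A word such as `^blarg/*blarg$` has no candidate answers to delete, which is what makes it an unknown word in this picture.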

Guidelines

"both"

The word "both" can be a conjunction joining two noun phrases, a determiner modifying a noun phrase, or a pronoun substituting for a noun phrase.

  • cnjcoo
    • I like both cats and dogs.
  • det
    • Both children like playing in the garden.
  • prn
    • Both thought it a good idea.
    • They both like playing in the garden.
    • Both of them like playing in the garden.

"this"

The word "this" (along with its plural "these") can be either a determiner, modifying a noun phrase, or a pronoun, replacing a noun phrase.

  • det.dem
    • I don't like this cat.
    • I don't like these cats.
  • prn
    • This is the reason.
    • These are the ones.

"that"

The word "that" can be a determiner, which modifies a noun phrase; a demonstrative pronoun, which substitutes for a noun phrase; a relative pronoun; or a subordinating conjunction. Its plural "those" can only be a determiner or a demonstrative pronoun.

  • det.dem
    • I don't like that cat.
    • I don't like those cats.
  • prn
    • That is the reason.
    • Those are the ones.
  • rel
    • These are the ones that I like.
  • cnjsub
    • I think that you like cats.

Here is a tip for distinguishing rel from cnjsub: replace the word "that" with the word "which" and see how the sentence sounds. If it sounds ok, your "that" is probably a relative pronoun; if it sounds bad, it is probably a conjunction.

  • ok: These are the ones which I like.
  • not ok: I think which you like cats.
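
The substitution test can be prepared mechanically, although judging the result still needs a human ear. The sketch below is a minimal helper, not part of any Apertium tool; the name `which_test` is hypothetical. It swaps only the first lowercase "that" and deliberately leaves a sentence-initial "That" alone, since that position is normally a determiner or demonstrative pronoun anyway.

```python
import re


def which_test(sentence):
    """Return the sentence with the first standalone 'that' swapped for 'which'."""
    return re.sub(r"\bthat\b", "which", sentence, count=1)


for s in ["These are the ones that I like.", "I think that you like cats."]:
    # Read each output aloud: if it sounds ok, tag "that" as rel, otherwise cnjsub.
    print(which_test(s))
```

Run on the two example sentences, this prints the "ok" and "not ok" variants shown in the list above.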

"no"

The word "no" in English can be a determiner, modifying a noun phrase, or an adverb (or interjection).

  • det.ind
    • There are no cats in my attic.
  • adv
    • No! Don't do that!

Verbs with "-ing"

A verb form ending in -ing in English can be a gerund (adverbial), a substantive (like a noun) or a present participle (like an adjective).

  • vblex.subs:
    • Roughly, when you can substitute it with a noun: "Flying is hard" → "Flight is hard"
  • vblex.pprs:
    • Roughly, when you can substitute it with a relative clause: "The flying circus" → "The circus that flies"
  • vblex.ger
    • When it follows a form of "to be" in a continuous tense, or when it can be replaced by a prepositional phrase or a different verbal phrase:
      • "He came singing" → "He came with a song"
      • "He is singing" → "He sings"

Adverb or adjective

A word like "first" can be either an adverb or an ordinal adjective. An adverb modifies a verb phrase; an ordinal adjective modifies a noun phrase.

  • adj
    • This is my first computer.
  • adv
    • First I'm going to buy a computer.

Past tense or past participle

Many verb forms ending in -ed (worked) can be either past tense or past participle. A trick: replace the verb with a form of go or drink. If the sentence needs went or drank, the verb is past tense (past); if it needs gone or drunk, it is a past participle (pp).
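
The go/drink trick can also be set up mechanically, leaving only the judgement to the annotator. The sketch below is an assumption-laden illustration: the helper name `probe_sentences` and the example sentence are made up for this page, and the code does nothing more than generate the four candidate sentences to compare.

```python
import re

# Irregular verbs whose past tense and past participle differ.
PROBES = {"past": ("went", "drank"), "pp": ("gone", "drunk")}


def probe_sentences(sentence, verb):
    """Return {tag: [variants]} with the first `verb` replaced by each probe form."""
    pattern = r"\b" + re.escape(verb) + r"\b"
    if not re.search(pattern, sentence):
        raise ValueError(f"{verb!r} does not occur in the sentence")
    return {
        tag: [re.sub(pattern, form, sentence, count=1) for form in forms]
        for tag, forms in PROBES.items()
    }


# Illustrative sentence (not from the guidelines).
for tag, variants in probe_sentences("She had walked home.", "walked").items():
    print(tag, variants)
```

Here "She had gone home." sounds right and "She had went home." does not, so "walked" in this sentence would be tagged pp.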