User:TommiPirinen/English tagset
RFC for English tags, eh?
Verbs (google pos: Verb)
Regular English verbs inflect in four forms: _accept_, _accepts_, _accepted_, _accepting_. Some irregular verbs have five: _forget_, _forgets_, _forgot_, _forgotten_, _forgetting_. The verb _be_ has eight: _be_, _am_, _are_, _is_, _was_, _were_, _been_, _being_.
The tags we are using to classify English verbs are:
* vblex: lexical verbs (regular and irregular)
* vaux: auxiliary verbs, which take a verb complement
* vbser: verb _be_
* vbdo: verb _do_
* vbhaver: verb _have_
The morphs that follow (or their absence) are classified with:
* inf: infinitive (as in: to _do_, to _walk_)
* pri: present indicative (as in: I _do_, he _walks_)
* prs: present subjunctive (as in: Let there _be_ light; At other times it is important that we _be_ quiet.)
* imp: imperative
* past: common past (as in: I _did_, he _walked_)
* pis: imperfect subjunctive (as in: If I _were_ you, ...)
* pp: past participle (as in: I've _done_, he has _walked_)
* pprs: present participle
* ger: gerund
* subs: substantive
and potentially
* +not.adv.neg: (as in _can't_, _didn't_)
Likely in the future:
* transitivity
The tag sequences are as follows:
Regular verbs:
walk:walk<vblex><inf>
walk:walk<vblex><pri>
walk:walk<vblex><prs>
walk:walk<vblex><imp>
walks:walk<vblex><pri><p3><sg>
walked:walk<vblex><pis>
walked:walk<vblex><past>
walked:walk<vblex><pp>
walking:walk<vblex><subs>
walking:walk<vblex><pprs>
walking:walk<vblex><ger>
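Each entry above pairs a surface form with a lemma and a tag sequence. The following is a minimal Python sketch of splitting such a line, assuming the simple `surface:lemma<tag>...` shape with no `+` compounds and no direction markers. The name `parse_entry` is made up for illustration and is not part of any tool.

```python
import re

# Matches "surface:lemma<tag1><tag2>..." for simple entries only.
ENTRY_RE = re.compile(
    r"^(?P<surface>[^:<>]+):(?P<lemma>[^:<>+]+)(?P<tags>(?:<[^<>]+>)+)$"
)

def parse_entry(line):
    """Split one entry into (surface, lemma, [tags]); hypothetical helper."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a simple entry: {line!r}")
    tags = re.findall(r"<([^<>]+)>", match.group("tags"))
    return match.group("surface"), match.group("lemma"), tags

print(parse_entry("walks:walk<vblex><pri><p3><sg>"))
# ('walks', 'walk', ['vblex', 'pri', 'p3', 'sg'])
```

The first tag is the word class and the rest are the morphological tags, in the order listed above.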
Irregulars:
forget:forget<vblex><inf>
forget:forget<vblex><pri>
forgets:forget<vblex><pri><p3><sg>
forgot:forget<vblex><past>
forgotten:forget<vblex><pp>
forgetting:forget<vblex><ger>
Auxiliaries:
can:can<vaux><pri>
could:can<vaux><past>
can't:can<vaux><pri>+not<adv>
cannot:can<vaux><pri>+not<adv>
couldn't:can<vaux><past>+not<adv>
may:may<vaux><pri>
may:may<vaux><past>
might:might<vaux><pri>
might:might<vaux><past>
must:must<vaux><pri>
must:must<vaux><past>
ought:ought<vaux><pri>
ought:ought<vaux><past>
shall:shall<vaux><pri>
should:shall<vaux><past>
shan't:shall<vaux><pri>+not<adv>
shouldn't:shall<vaux><past>+not<adv>
will:will<vaux><pri>
would:will<vaux><past>
won't:will<vaux><pri>+not<adv>
wouldn't:will<vaux><past>+not<adv>
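Negated auxiliaries such as _can't_ carry two analyses joined by `+`. As a rough sketch, and assuming that `+` never occurs inside a lemma, the analysis side can be split into its parts like this (`split_compound` is a hypothetical helper name):

```python
import re

def split_compound(analysis):
    """Split 'can<vaux><pri>+not<adv>' into [(lemma, [tags]), ...]."""
    parts = []
    for chunk in analysis.split("+"):
        # Everything before the first '<' is the lemma, the rest are tags.
        lemma, _, rest = chunk.partition("<")
        tags = re.findall(r"<([^<>]+)>", "<" + rest) if rest else []
        parts.append((lemma, tags))
    return parts

surface, _, analysis = "can't:can<vaux><pri>+not<adv>".partition(":")
print(surface, split_compound(analysis))
# can't [('can', ['vaux', 'pri']), ('not', ['adv'])]
```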
Verb have:
have:have<vbhaver><inf>
have:have<vbhaver><pri>
has:have<vbhaver><pri><p3><sg>
had:have<vbhaver><past>
having:have<vbhaver><ger>
Verb do:
do:do<vbdo><inf>
do:do<vbdo><imp>
do:do<vbdo><pri>
does:do<vbdo><pri><p3><sg>
did:do<vbdo><past>
did:do<vbdo><pis>
doing:do<vbdo><subs>
doing:do<vbdo><pprs>
doing:do<vbdo><ger>
done:do<vbdo><pp>
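Several surface forms in the verb lists receive more than one analysis (_did_ is both past and imperfect subjunctive, _doing_ has three readings). A small Python sketch, using only the _do_ entries above, shows one way to group analyses by surface form. The helper name `analyses_by_surface` is an assumption for illustration.

```python
from collections import defaultdict

ENTRIES = [
    "do:do<vbdo><inf>", "do:do<vbdo><imp>", "do:do<vbdo><pri>",
    "does:do<vbdo><pri><p3><sg>",
    "did:do<vbdo><past>", "did:do<vbdo><pis>",
    "doing:do<vbdo><subs>", "doing:do<vbdo><pprs>", "doing:do<vbdo><ger>",
    "done:do<vbdo><pp>",
]

def analyses_by_surface(entries):
    """Map each surface form to the list of its analyses."""
    table = defaultdict(list)
    for entry in entries:
        surface, _, analysis = entry.partition(":")
        table[surface].append(analysis)
    return dict(table)

table = analyses_by_surface(ENTRIES)
for surface, analyses in table.items():
    if len(analyses) > 1:
        print(surface, analyses)  # ambiguous forms only
```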
Nouns (google pos: Noun)
Nouns commonly have two forms, singular and plural, each with a possessive: _beer_, _beers_, _beer's_, _beers'_. Some don't: ?
The tags used to classify nouns are:
* n: regular noun
* np: proper noun
* m: male
* f: female
* mf: both female and male
* nt: neuter, neither female nor male
* top: place name (toponym)
* ant: name of a human (anthroponym)
And also:
* cnt: countable
* unc: uncountable
The suffixes are:
* sg: singular
* pl: plural
* gen: genitive (possessive)
Regular nouns go like:
beer:beer<n><sg>
beers:beer<n><pl>
beer's:beer<n><sg><gen>
beers':beer<n><pl><gen>
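The four-way pattern above is regular enough to generate. The following Python sketch does so under a deliberately naive assumption: the plural is always the lemma plus _s_, which is wrong for many English nouns (_box_, _city_, _child_). The function name `regular_noun_entries` is hypothetical.

```python
def regular_noun_entries(lemma):
    """Return the four entries of a regular noun (naive: plural = lemma + 's')."""
    plural = lemma + "s"
    return [
        f"{lemma}:{lemma}<n><sg>",
        f"{plural}:{lemma}<n><pl>",
        f"{lemma}'s:{lemma}<n><sg><gen>",
        f"{plural}':{lemma}<n><pl><gen>",
    ]

for entry in regular_noun_entries("beer"):
    print(entry)
```

For _beer_ this reproduces the four entries listed above.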
Proper nouns:
Aaron:Aaron<np><ant><m><sg>
Aarons:Aaron<np><ant><m><pl>
Aarons':Aaron<np><ant><m><pl><gen>
Aaron's:Aaron<np><ant><m><sg><gen>
Amsterdam:Amsterdam<np><top><sg>
Amsterdams:Amsterdam<np><top><pl>
Amsterdam's:Amsterdam<np><top><sg><gen>
Amsterdams':Amsterdam<np><top><pl><gen>
Adjectives (google pos: Adj)
Most adjectives do not inflect, like _hairy_, but some have three forms, like: _small_, _smaller_, _smallest_. The tags used for classifying are:
* adj: adjective (on its own, for non-inflecting ones)
* sint: added for those with three forms
The suffixes are marked with:
* comp: comparative
* sup: superlative
Like so:
small:small<adj><sint>
smaller:small<adj><sint><comp>
smallest:small<adj><sint><sup>
hairy:hairy<adj>
Adverbs (google pos: Adv)
Adverbs are adverbs. They use the tag adv:
aboard:aboard<adv>
drunk:drunk<adv>
no:no<adv><neg>
where:where<adv><itg>
when:when<adv><itg>
why:why<adv><itg>
Some carry additional tags, such as neg and itg in the entries above.
Pronouns (google pos: Pron)
There are many different pronouns:
anybody:anybody<prn><sg>
anyone:anyone<prn><sg>
anything:anything<prn><sg>
both:both<prn><pl>
everybody:everybody<prn><sg>
everyone:everyone<prn><sg>
everything:everything<prn><sg>
few:few<prn><pl>
he:he<prn><pers><p3><m><sg>
his:he<prn><pers><p3><m><sg><poss>
his:he<prn><pers><p3><m><sg><gen>
him:he<prn><pers><p3><m><sg><acc>
herself:herself<prn><ref><p3><f><sg>
himself:himself<prn><ref><p3><m><sg>
hisself:himself<prn><ref><p3><m><sg>
I:I<prn><pers><p1><mf><sg>
me:I<prn><pers><p1><mf><sg><acc>
my:I<prn><pers><p1><mf><sg><gen>
mine:I<prn><pers><p1><mf><sg><poss>
it:it<prn><dem><sg>
its:it<prn><dem><sg><poss>
itself:itself<prn><ref><p3><nt><sg>
myself:myself<prn><ref><p1><mf><sg>
oneself:oneself<prn><ref><p1><mf><sg>
oneself:oneself<prn><ref><p3><mf><sg>
one's self:oneself<prn><ref><p1><mf><sg>
one's self:oneself<prn><ref><p3><mf><sg>
ourself:ourselves<prn><ref><p1><mf><pl>
ourselves:ourselves<prn><ref><p1><mf><pl>
several:several<prn><sg>
she:she<prn><pers><p3><f><sg>
hers:she<prn><pers><p3><f><sg><poss>
her:she<prn><pers><p3><f><sg><gen>
her:she<prn><pers><p3><f><sg><acc>
something:something<prn><sg>
that:that<prn><rel>
that:that<prn><sg>
those:that<prn><pl>
theirselves:themselves<prn><ref><p3><mf><pl>
themself:themself<prn><ref><p3><mf><sg>
themselves:themselves<prn><ref><p3><mf><sg>
themselves:themselves<prn><ref><p3><mf><pl>
they:they<prn><pers><p3><mf><pl>
their:they<prn><pers><p3><mf><pl><gen>
theirs:they<prn><pers><p3><mf><pl><poss>
them:they<prn><pers><p3><mf><pl><acc>
this:this<prn><sg>
these:this<prn><pl>
thyself:thyself<prn><ref><p2><mf><sg>
we:we<prn><pers><p1><mf><pl>
us:we<prn><pers><p1><mf><pl><acc>
our:we<prn><pers><p1><mf><pl><gen>
ours:we<prn><pers><p1><mf><pl><poss>
which:which<prn><itg>
which:which<prn><rel>
who:who<prn><itg>
whose:who<prn><poss>
whom:who<prn><itg><acc>
you:you<prn><pers><p2><mf><sp>
yours:you<prn><pers><p2><mf><sp><poss>
your:you<prn><pers><p2><mf><sp><gen>
you:you<prn><pers><p2><mf><sp><acc>
yourself:yourself<prn><ref><p2><mf><sg>
yourselves:yourselves<prn><ref><p2><mf><pl>
Determiners (Det)
There are a couple of determiners:
a:>:a<det><ind><sg>
an:>:a<det><ind><sg>
~a:<:a<det><ind><sg>
both:both<det><qnt>
many:many<det><qnt>
no:no<det><ind><neg>
several:several<det><dem>
that:th<det><dem><sg>
those:th<det><dem><pl>
the:the<det><def><sp>
this:th<det><dem><sg>
these:th<det><dem><pl>
which:which<det><itg><sp>
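Three of the determiner entries have an extra `>` or `<` between the surface form and the lemma. This page does not say what they mean. The sketch below assumes they are direction-restriction markers of the kind printed by Apertium's lt-expand, and only extracts the marker without interpreting it. `parse_directed` is a hypothetical helper name.

```python
import re

# surface, optional direction marker (> or <), lemma, then one or more tags.
DIRECTED_RE = re.compile(
    r"^(?P<surface>.+?):(?:(?P<direction>[<>]):)?"
    r"(?P<lemma>[^:<>]+)(?P<tags>(?:<[^<>]+>)+)$"
)

def parse_directed(entry):
    """Return (surface, direction or None, lemma, [tags])."""
    match = DIRECTED_RE.match(entry)
    if match is None:
        raise ValueError(f"unrecognised entry: {entry!r}")
    tags = re.findall(r"<([^<>]+)>", match.group("tags"))
    return (match.group("surface"), match.group("direction"),
            match.group("lemma"), tags)

for entry in ["a:>:a<det><ind><sg>", "~a:<:a<det><ind><sg>", "the:the<det><def><sp>"]:
    print(parse_directed(entry))
```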
Prepositions (Adp)
above:above<pr>
according to:according to<pr>
across:across<pr>
after:after<pr>
against:against<pr>
along:along<pr>
alongside:alongside<pr>
along with:along with<pr>
amid:amid<pr>
among:among<pr>
amongst:amongst<pr>
around:around<pr>
as:as<pr>
as of:as of<pr>
at:at<pr>
atop:atop<pr>
because of:because of<pr>
before:before<pr>
behind:behind<pr>
below:below<pr>
between:between<pr>
but:but<pr>
by:by<pr>
by means of:by means of<pr>
despite:despite<pr>
due to:due to<pr>
during:during<pr>
except for:except for<pr>
except:except<pr>
for:for<pr>
from:from<pr>
in contrast to:in contrast to<pr>
in front of:in front of<pr>
in:in<pr>
in order to:in order to<pr>
inside:inside<pr>
into:into<pr>
near:near<pr>
off:off<pr>
of:of<pr>
on:on<pr>
onto:onto<pr>
out:out<pr>
out of:out of<pr>
outside:outside<pr>
over:over<pr>
per:per<pr>
prior to:prior to<pr>
since:since<pr>
through:through<pr>
throughout:throughout<pr>
to:to<pr>
towards:towards<pr>
under:under<pr>
until:until<pr>
up:up<pr>
upon:upon<pr>
up to:up to<pr>
via:via<pr>
within:within<pr>
with:with<pr>
without:without<pr>
Numerals (Num)
There are quite a few number words:
one:one<num><sg>
one's:one<num><sg><gen>
two:two<num><pl>
two's:two<num><pl><gen>
three:three<num><pl>
three's:three<num><pl><gen>
first:first<num><pl>
first's:first<num><pl><gen>
second:second<num><pl>
second's:second<num><pl><gen>
third:third<num><pl>
third's:third<num><pl><gen>
Conjunctions (Conj)
Some classes for conjunctions:
albeit:albeit<cnjadv>
albeit:albeit<cnjsub>
although:although<cnjadv>
and:and<cnjcoo>
an if:an if<cnjadv>
because:because<cnjadv>
because:because<cnjsub>
both:both<cnjcoo>
but:but<cnjcoo>
either:either<cnjadv>
however:however<cnjadv>
if:if<cnjadv>
if:if<cnjsub>
lest:lest<cnjadv>
neither:neither<cnjcoo>
nor:nor<cnjcoo>
or:or<cnjcoo>
since:since<cnjadv>
than:than<cnjadv>
than:than<cnjsub>
that:that<cnjsub>
then:then<cnjadv>
though:though<cnjadv>
til:til<cnjadv>
till:till<cnjadv>
unless:unless<cnjadv>
until:until<cnjadv>
unto:unto<cnjadv>
what:what<cnjsub>
whenas:whenas<cnjadv>
whence:whence<cnjadv>
when:when<cnjadv>
wherealong:wherealong<cnjadv>
whereas:whereas<cnjadv>
whereat:whereat<cnjadv>
wherefore:wherefore<cnjadv>
whereinbefore:whereinbefore<cnjadv>
wherein:wherein<cnjadv>
whereof:whereof<cnjadv>
whereout:whereout<cnjadv>
whereover:whereover<cnjadv>
wheresoever:wheresoever<cnjadv>
whether:whether<cnjadv>
which:which<cnjsub>
while:while<cnjadv>
whilst:whilst<cnjadv>
Interjections (Prt)
Punctuations (Google pos: .)
These are more or less the same everywhere, apart from directionality and some orthographic variation.
':'<apos>
,:,<cm>
-:-<guio>
--:–<guio>
–:-<guio>
—:—<guio>
(:(<lpar>
[:[<lpar>
":"<lquot>
“:“<lquot>
«:«<lquot>
»:«<lquot>
):)<rpar>
]:]<rpar>
":"<rquot>
”:”<rquot>
»:»<rquot>
(:(<lpar>
):)<rpar>
_:_<sent>
:::<sent>
;:;<sent>
!:!<sent>
?:?<sent>
.:.<sent>
#:#<sent>
%:%<sent>
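Punctuation entries are awkward to split because the surface form can itself be the `:` separator, as in `:::<sent>`. One possible approach, sketched here in Python, is to strip the trailing tag first and then look for the separator from the second character onward, so that a leading colon is read as the surface form. It assumes exactly one trailing tag and a non-empty surface form. `parse_punct` is a hypothetical name.

```python
import re

def parse_punct(entry):
    """Split e.g. ':::<sent>' into (surface, lemma, tag)."""
    match = re.search(r"<([a-z]+)>$", entry)
    if match is None:
        raise ValueError(f"no trailing tag in {entry!r}")
    body = entry[:match.start()]
    # Start at index 1 so a leading ':' counts as the surface form,
    # not as the separator. Raises ValueError if no separator is found.
    sep = body.index(":", 1)
    return body[:sep], body[sep + 1:], match.group(1)

for entry in [":::<sent>", "--:–<guio>", "(:(<lpar>"]:
    print(parse_punct(entry))
```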