Difference between revisions of "User:TommiPirinen/English tagset"
TommiPirinen (talk | contribs) (updated xamples) |
|||
Line 3: | Line 3: | ||
== Verbs (google pos: <span style="font-variant: small-caps">Verb</span>)== |
== Verbs (google pos: <span style="font-variant: small-caps">Verb</span>)== |
||
+ | Regular English verbs have four forms: _accept_, _accepts_, _accepted_, _accepting_. Some irregular verbs have five: _forget_, _forgets_, _forgot_, _forgotten_, _forgetting_. The verb _be_ has eight: _be_, _am_, _are_, _is_, _was_, _were_, _been_, _being_. |
||
− | <nowiki>> be |
||
− | be be<vblex><actv><pres> 0,000000 |
||
− | be be<vblex><inf> 0,000000 |
||
− | be be<vblex><inf> 0,000000 |
||
+ | The tags we are using to classify English verbs are: |
||
− | > am |
||
− | am be<vblex><actv><pres><p1><sg> 0,000000 |
||
+ | * vblex: lexical (ordinary) verbs, regular or irregular |
||
− | > is |
||
+ | * vaux: auxiliary verbs, which take a verb complement |
||
− | is be<vblex><actv><pres><p3><sg> 0,000000 |
||
+ | * vbser: verb _be_ |
||
+ | * vbdo: verb _do_ |
||
+ | * vbhaver: verb _have_ |
||
+ | The morphs that follow the stem (or their absence) are classified with: |
||
− | > are |
||
− | are are<n><sg><nom> 0,000000 |
||
− | are be<vblex><actv><pres><p1><pl> 0,000000 |
||
− | are be<vblex><actv><pres><p2><pl> 0,000000 |
||
− | are be<vblex><actv><pres><p2><sg> 0,000000 |
||
− | are be<vblex><actv><pres><p3><pl> 0,000000 |
||
+ | * inf: infinitive (as in: to _do_, to _walk_) |
||
− | > was |
||
+ | * pri: present indicative (as in: I _do_, he _walks_) |
||
− | was be<vblex><actv><past><p1><sg> 0,000000 |
||
+ | * prs: present subjunctive |
||
− | was be<vblex><actv><past><p3><sg> 0,000000 |
||
+ | * past: common past (as in: I _did_, he _walked_) |
||
+ | * pis: imperfect subjunctive |
||
+ | * pp: past participle (as in: I've _done_, he has _walked_) |
||
+ | * pprs: present participle |
||
+ | * ger: gerund |
||
+ | * subs: substantive |
||
+ | and potentially |
||
− | > were |
||
− | were be<vblex><actv><past><p1><pl> 0,000000 |
||
− | were be<vblex><actv><past><p2><pl> 0,000000 |
||
− | were be<vblex><actv><past><p2><sg> 0,000000 |
||
− | were be<vblex><actv><past><p3><pl> 0,000000 |
||
+ | * +not.adv.neg: (as in _can't_, _didn't_) |
||
− | > being |
||
− | being be<vblex><actv><ger> 0,000000 |
||
− | being be<vblex><actv><ger> 0,000000 |
||
− | being be<vblex><subst><sg><nom> 0,000000 |
||
− | [...] |
||
+ | In future likely: |
||
− | > been |
||
− | been be<vblex><pp> 0,000000 |
||
+ | * transitivity |
||
− | > walk |
||
− | [...] |
||
− | walk walk<vblex><actv><pres> 0,000000 |
||
− | walk walk<vblex><inf> 0,000000 |
||
+ | The tag sequences are as follows: |
||
− | > walks |
||
− | [...] |
||
− | walks walk<vblex><actv><pres><p3><sg> 0,000000 |
||
+ | Regular verbs: |
||
− | > walked |
||
− | walked walk<vblex><actv><past> 0,000000 |
||
+ | <nowiki> |
||
− | > walking |
||
− | + | walk:walk<vblex><inf> |
|
− | + | walk:walk<vblex><pri> |
|
+ | walk:walk<vblex><prs> |
||
− | [...] |
||
+ | walk:walk<vblex><imp> |
||
+ | walks:walk<vblex><pri><p3><sg> |
||
+ | walked:walk<vblex><pis> |
||
+ | walked:walk<vblex><past> |
||
+ | walked:walk<vblex><pp> |
||
+ | walking:walk<vblex><subs> |
||
+ | walking:walk<vblex><pprs> |
||
+ | walking:walk<vblex><ger> |
||
+ | </nowiki> |
||
+ | Irregulars: |
||
− | > must |
||
− | must must<vaux> 0,000000 |
||
− | must must<vaux><actv><pres> 0,000000 |
||
+ | <nowiki> |
||
− | > shall |
||
+ | forget:forget<vblex><inf> |
||
− | shall shall<vaux><actv><pres> 0,000000 |
||
+ | forget:forget<vblex><pri> |
||
+ | forgets:forget<vblex><pri><p3><sg> |
||
+ | forgot:forget<vblex><past> |
||
+ | forgotten:forget<vblex><pp> |
||
+ | forgetting:forget<vblex><ger> |
||
+ | </nowiki> |
||
+ | Auxiliaries: |
||
− | > should |
||
− | should should<vaux><actv><pres> 0,000000 |
||
+ | <nowiki> |
||
− | > can |
||
− | can |
+ | can:can<vaux><pri> |
− | + | could:can<vaux><past> |
|
− | can |
+ | can't:can<vaux><pri>+not<adv> |
+ | cannot:can<vaux><pri>+not<adv> |
||
− | can can<vblex><inf> 0,000000 |
||
+ | couldn't:can<vaux><past>+not<adv> |
||
+ | may:may<vaux><pri> |
||
+ | may:may<vaux><past> |
||
+ | might:might<vaux><pri> |
||
+ | might:might<vaux><past> |
||
+ | must:must<vaux><pri> |
||
+ | must:must<vaux><past> |
||
+ | ought:ought<vaux><pri> |
||
+ | ought:ought<vaux><past> |
||
+ | shall:shall<vaux><pri> |
||
+ | should:shall<vaux><past> |
||
+ | shan't:shall<vaux><pri>+not<adv> |
||
+ | shouldn't:shall<vaux><past>+not<adv> |
||
+ | will:will<vaux><pri> |
||
+ | would:will<vaux><past> |
||
+ | won't:will<vaux><pri>+not<adv> |
||
+ | wouldn't:will<vaux><past>+not<adv> |
||
+ | </nowiki> |
||
+ | Verb have: |
||
− | > can't |
||
− | can't can<vaux><actv><pres>+not<adv> 0,000000</nowiki> |
||
+ | <nowiki> |
||
+ | have:have<vbhaver><inf> |
||
+ | have:have<vbhaver><pri> |
||
+ | has:have<vbhaver><pri><p3><sg> |
||
+ | had:have<vbhaver><past> |
||
+ | having:have<vbhaver><ger> |
||
+ | </nowiki> |
||
+ | Verb do: |
||
− | Alternatively: |
||
− | < |
+ | <nowiki> |
+ | do:do<vbdo><inf> |
||
+ | do:do<vbdo><imp> |
||
+ | do:do<vbdo><pri> |
||
+ | does:do<vbdo><pri><p3><sg> |
||
+ | did:do<vbdo><past> |
||
+ | did:do<vbdo><pis> |
||
+ | doing:do<vbdo><subs> |
||
+ | doing:do<vbdo><pprs> |
||
+ | doing:do<vbdo><ger> |
||
+ | done:do<vbdo><pp> |
||
+ | </nowiki> |
||
− | accept:accept<vblex><inf> |
||
− | accept:accept<vblex><pri> |
||
− | accepts:accept<vblex><pri><p3><sg> |
||
− | accept:accept<vblex><prs> |
||
− | accepted:accept<vblex><past> |
||
− | accepted:accept<vblex><pis> |
||
− | accepting:accept<vblex><subs> |
||
− | accepting:accept<vblex><pprs> |
||
− | accepting:accept<vblex><ger> |
||
− | accepted:accept<vblex><pp> |
||
− | accept:accept<vblex><imp> |
||
+ | == Nouns (google pos: <span style="font-variant: small-caps">Noun</span>) == |
||
− | </pre> |
||
+ | Nouns commonly have two forms, singular and plural, each with a possessive: _beer_, _beers_, _beer's_, _beers'_. |
||
− | <pre> |
||
+ | Some don't: ? |
||
− | be:be<vbser><inf> |
||
− | am:be<vbser><pri><p1><sg> |
||
− | are:be<vbser><pri> |
||
− | is:be<vbser><pri><p3><sg> |
||
− | be:be<vbser><prs> |
||
− | was:be<vbser><past><p1><sg> |
||
− | were:be<vbser><past> |
||
− | were:be<vbser><pis> |
||
− | was:be<vbser><past><p3><sg> |
||
− | being:be<vbser><subs> |
||
− | being:be<vbser><pprs> |
||
− | being:be<vbser><ger> |
||
− | been:be<vbser><pp> |
||
− | be:be<vbser><imp> |
||
− | </pre> |
||
+ | The tags used to classify nouns are: |
||
− | == Nouns (google pos: <span style="font-variant: small-caps">Noun</span>) == |
||
+ | * n: regular noun |
||
− | <nowiki>> beer |
||
+ | * np: proper noun |
||
− | beer beer<n><sg><nom> 0,000000 |
||
+ | * m: male |
||
+ | * f: female |
||
+ | * mf: both female and male |
||
+ | * nt: neuter, neither female nor male |
||
+ | * top: place |
||
+ | * ant: human |
||
+ | And also: |
||
− | > beers |
||
− | beers beer<n><pl><nom> 0,000000 |
||
+ | * cnt |
||
− | > beer's |
||
+ | * unc |
||
− | beer's beer<n><sg><gen> 0,000000 |
||
+ | The suffixes are: |
||
− | > beers' |
||
− | beers' beer<n><pl><gen> 0,000000</nowiki> |
||
+ | * sg: singular |
||
− | == Adjectives (google pos: <span style="font-variant: small-caps">Adj</span>) == |
||
+ | * pl: plural |
||
+ | * gen: genitive (possessive) |
||
− | <nowiki>> small |
||
− | small small<adj><sint> 0,000000 |
||
− | > smaller |
||
− | smaller small<adj><sint><comp> 0,000000 |
||
+ | Regular nouns go like: |
||
− | > smallest |
||
− | smallest small<adj><sint><sup> 0,000000 |
||
+ | <nowiki> |
||
− | > hairy |
||
+ | beer:beer<n><sg> |
||
− | hairy hairy<adj> 0,000000 |
||
+ | beers:beer<n><pl> |
||
+ | beer's:beer<n><sg><gen> |
||
+ | beers':beer<n><pl><gen> |
||
+ | </nowiki> |
||
+ | Proper nouns: |
||
− | > hairier |
||
− | hairier hairier+? inf |
||
+ | <nowiki>Aaron:Aaron<np><ant><m><sg> |
||
− | > hairiest |
||
+ | Aarons:Aaron<np><ant><m><pl> |
||
− | hairiest hairiest+? inf</nowiki> |
||
+ | Aarons':Aaron<np><ant><m><pl><gen> |
||
+ | Aaron's:Aaron<np><ant><m><sg><gen> |
||
+ | Amsterdam:Amsterdam<np><top><sg> |
||
+ | Amsterdams:Amsterdam<np><top><pl> |
||
+ | Amsterdam's:Amsterdam<np><top><sg><gen> |
||
+ | Amsterdams':Amsterdam<np><top><pl><gen> |
||
+ | </nowiki> |
||
− | == |
+ | == Adjectives (google pos: <span style="font-variant: small-caps">Adj</span>) == |
+ | Most adjectives do not inflect, like _hairy_, but some have three forms, like: |
||
− | <nowiki> smoothly |
||
+ | _small_, _smaller_, _smallest_. The tags used for classifying are: |
||
− | smoothly smoothly<adv> 0,000000 |
||
+ | * adj: adjective, used alone for non-inflecting ones |
||
− | > aboard |
||
+ | * sint: added for those with three forms |
||
− | aboard aboard<adv> 0,000000 |
||
+ | The suffixes are marked with: |
||
− | > drunk |
||
− | drunk drunk<adj> 0,000000 |
||
− | drunk drunk<adv> 0,000000 |
||
− | drunk drunk<n><sg><nom> 0,000000</nowiki> |
||
+ | * comp: for comparative |
||
− | Like why are these three in anyy imaginable way in same class? |
||
+ | * sup: for superlative |
||
+ | Like so: |
||
− | == Pronouns (google pos: <span style="font-variant: small-caps">Pron</span>) == |
||
− | <nowiki>> |
+ | <nowiki>small:small<adj><sint> |
+ | smaller:small<adj><sint><comp> |
||
− | I I<prn><pers><p1><mf><sg><nom> 0,000000 |
||
+ | smallest:small<adj><sint><sup> |
||
+ | hairy:hairy<adj> |
||
+ | </nowiki> |
||
+ | == Adverbs (google pos: <span style="font-variant: small-caps">Adv</span>) == |
||
− | > me |
||
− | me I<prn><pers><p1><mf><sg><acc> 0,000000 |
||
+ | Adverbs are adverbs. They use the tag adv: |
||
− | > my |
||
− | my I<prn><pers><p1><mf><sg><gen> 0,000000 |
||
+ | <nowiki>aboard:aboard<adv> |
||
− | > mine |
||
+ | drunk:drunk<adv> |
||
− | mine I<prn><pers><p1><mf><sg><acc> 0,000000 |
||
+ | no:no<adv><neg> |
||
+ | where:where<adv><itg> |
||
+ | when:when<adv><itg> |
||
+ | why:why<adv><itg> |
||
+ | </nowiki> |
||
+ | Some have other tags too, such as neg and itg. |
||
− | > those |
||
− | those those<prn><dem><pl><acc> 0,000000 |
||
+ | == Pronouns (google pos: <span style="font-variant: small-caps">Pron</span>) == |
||
− | > something |
||
− | something something<prn><sg><nom> 0,000000 |
||
+ | There are a lot of different pronouns. |
||
− | > both |
||
− | both both<prn><ind><mf><pl><nom> 0,000000</nowiki> |
||
+ | <nowiki>anybody:anybody<prn><sg> |
||
− | There's lots of stuff -_- |
||
+ | anyone:anyone<prn><sg> |
||
+ | anything:anything<prn><sg> |
||
+ | both:both<prn><pl> |
||
+ | everybody:everybody<prn><sg> |
||
+ | everyone:everyone<prn><sg> |
||
+ | everything:everything<prn><sg> |
||
+ | few:few<prn><pl> |
||
+ | he:he<prn><pers><p3><m><sg> |
||
+ | his:he<prn><pers><p3><m><sg><poss> |
||
+ | his:he<prn><pers><p3><m><sg><gen> |
||
+ | him:he<prn><pers><p3><m><sg><acc> |
||
+ | herself:herself<prn><ref><p3><f><sg> |
||
+ | himself:himself<prn><ref><p3><m><sg> |
||
+ | hisself:himself<prn><ref><p3><m><sg> |
||
+ | I:I<prn><pers><p1><mf><sg> |
||
+ | me:I<prn><pers><p1><mf><sg><acc> |
||
+ | my:I<prn><pers><p1><mf><sg><gen> |
||
+ | mine:I<prn><pers><p1><mf><sg><poss> |
||
+ | it:it<prn><dem><sg> |
||
+ | its:it<prn><dem><sg><poss> |
||
+ | itself:itself<prn><ref><p3><nt><sg> |
||
+ | myself:myself<prn><ref><p1><mf><sg> |
||
+ | oneself:oneself<prn><ref><p1><mf><sg> |
||
+ | oneself:oneself<prn><ref><p3><mf><sg> |
||
+ | one's self:oneself<prn><ref><p1><mf><sg> |
||
+ | one's self:oneself<prn><ref><p3><mf><sg> |
||
+ | ourself:ourselves<prn><ref><p1><mf><pl> |
||
+ | ourselves:ourselves<prn><ref><p1><mf><pl> |
||
+ | several:several<prn><sg> |
||
+ | she:she<prn><pers><p3><f><sg> |
||
+ | hers:she<prn><pers><p3><f><sg><poss> |
||
+ | her:she<prn><pers><p3><f><sg><gen> |
||
+ | her:she<prn><pers><p3><f><sg><acc> |
||
+ | something:something<prn><sg> |
||
+ | that:that<prn><rel> |
||
+ | that:that<prn><sg> |
||
+ | those:that<prn><pl> |
||
+ | theirselves:themselves<prn><ref><p3><mf><pl> |
||
+ | themself:themself<prn><ref><p3><mf><sg> |
||
+ | themselves:themselves<prn><ref><p3><mf><sg> |
||
+ | themselves:themselves<prn><ref><p3><mf><pl> |
||
+ | they:they<prn><pers><p3><mf><pl> |
||
+ | their:they<prn><pers><p3><mf><pl><gen> |
||
+ | theirs:they<prn><pers><p3><mf><pl><poss> |
||
+ | them:they<prn><pers><p3><mf><pl><acc> |
||
+ | this:this<prn><sg> |
||
+ | these:this<prn><pl> |
||
+ | thyself:thyself<prn><ref><p2><mf><sg> |
||
+ | we:we<prn><pers><p1><mf><pl> |
||
+ | us:we<prn><pers><p1><mf><pl><acc> |
||
+ | our:we<prn><pers><p1><mf><pl><gen> |
||
+ | ours:we<prn><pers><p1><mf><pl><poss> |
||
+ | which:which<prn><itg> |
||
+ | which:which<prn><rel> |
||
+ | who:who<prn><itg> |
||
+ | whose:who<prn><poss> |
||
+ | whom:who<prn><itg><acc> |
||
+ | you:you<prn><pers><p2><mf><sp> |
||
+ | yours:you<prn><pers><p2><mf><sp><poss> |
||
+ | your:you<prn><pers><p2><mf><sp><gen> |
||
+ | you:you<prn><pers><p2><mf><sp><acc> |
||
+ | yourself:yourself<prn><ref><p2><mf><sg> |
||
+ | yourselves:yourselves<prn><ref><p2><mf><pl> |
||
+ | </nowiki> |
||
== Determiners (<span style="font-variant: small-caps">Det</span>) == |
== Determiners (<span style="font-variant: small-caps">Det</span>) == |
||
+ | There are a couple of determiners: |
||
− | <nowiki>> |
+ | <nowiki>a:>:a<det><ind><sg> |
− | + | an:>:a<det><ind><sg> |
|
+ | ~a:<:a<det><ind><sg> |
||
+ | both:both<det><qnt> |
||
+ | many:many<det><qnt> |
||
+ | no:no<det><ind><neg> |
||
+ | several:several<det><dem> |
||
+ | that:th<det><dem><sg> |
||
+ | those:th<det><dem><pl> |
||
+ | the:the<det><def><sp> |
||
+ | this:th<det><dem><sg> |
||
+ | these:th<det><dem><pl> |
||
+ | which:which<det><itg><sp> |
||
+ | </nowiki> |
||
− | > the |
||
− | the the<det><def> 0,000000</nowiki> |
||
− | |||
− | Maybe few more? Definition? |
||
== Prepositions (<span style="font-variant: small-caps">Adp</span>) == |
== Prepositions (<span style="font-variant: small-caps">Adp</span>) == |
||
− | <nowiki>> |
+ | <nowiki>above:above<pr> |
+ | according to:according to<pr> |
||
− | in in<pr> 0,000000</nowiki> |
||
+ | across:across<pr> |
||
+ | after:after<pr> |
||
+ | against:against<pr> |
||
+ | along:along<pr> |
||
+ | alongside:alongside<pr> |
||
+ | along with:along with<pr> |
||
+ | amid:amid<pr> |
||
+ | among:among<pr> |
||
+ | amongst:amongst<pr> |
||
+ | around:around<pr> |
||
+ | as:as<pr> |
||
+ | as of:as of<pr> |
||
+ | at:at<pr> |
||
+ | atop:atop<pr> |
||
+ | because of:because of<pr> |
||
+ | before:before<pr> |
||
+ | behind:behind<pr> |
||
+ | below:below<pr> |
||
+ | between:between<pr> |
||
+ | but:but<pr> |
||
+ | by:by<pr> |
||
+ | by means of:by means of<pr> |
||
+ | despite:despite<pr> |
||
+ | due to:due to<pr> |
||
+ | during:during<pr> |
||
+ | except for:except for<pr> |
||
+ | except:except<pr> |
||
+ | for:for<pr> |
||
+ | from:from<pr> |
||
+ | in contrast to:in contrast to<pr> |
||
+ | in front of:in front of<pr> |
||
+ | in:in<pr> |
||
+ | in order to:in order to<pr> |
||
+ | inside:inside<pr> |
||
+ | into:into<pr> |
||
+ | near:near<pr> |
||
+ | off:off<pr> |
||
+ | of:of<pr> |
||
+ | on:on<pr> |
||
+ | onto:onto<pr> |
||
+ | out:out<pr> |
||
+ | out of:out of<pr> |
||
+ | outside:outside<pr> |
||
+ | over:over<pr> |
||
+ | per:per<pr> |
||
+ | prior to:prior to<pr> |
||
+ | since:since<pr> |
||
+ | through:through<pr> |
||
+ | throughout:throughout<pr> |
||
+ | to:to<pr> |
||
+ | towards:towards<pr> |
||
+ | under:under<pr> |
||
+ | until:until<pr> |
||
+ | up:up<pr> |
||
+ | upon:upon<pr> |
||
+ | up to:up to<pr> |
||
+ | via:via<pr> |
||
+ | within:within<pr> |
||
+ | with:with<pr> |
||
+ | without:without<pr></nowiki> |
||
== Numerals (<span style="font-variant: small-caps">Num</span>) == |
== Numerals (<span style="font-variant: small-caps">Num</span>) == |
||
+ | There are quite a few number words: |
||
− | <nowiki>> one |
||
− | one |
+ | <nowiki>one:one<num><sg> |
+ | one's:one<num><sg><gen> |
||
+ | two:two<num><pl> |
||
+ | two's:two<num><pl><gen> |
||
+ | three:three<num><pl> |
||
+ | three's:three<num><pl><gen> |
||
+ | first:first<num><pl> |
||
+ | first's:first<num><pl><gen> |
||
+ | second:second<num><pl> |
||
+ | second's:second<num><pl><gen> |
||
+ | third:third<num><pl> |
||
+ | third's:third<num><pl><gen> |
||
+ | </nowiki> |
||
+ | == Conjunctions (<span style="font-variant: small-caps">Conj</span>) == |
||
− | > first |
||
− | first first<num> 0,000000 |
||
+ | Some classes for conjunctions: |
||
− | > 1 |
||
− | 1 1<num> 0,000000 |
||
+ | <nowiki>albeit:albeit<cnjadv> |
||
− | > 1. |
||
+ | albeit:albeit<cnjsub> |
||
− | 1. 1.<num> 0,000000</nowiki> |
||
+ | although:although<cnjadv> |
||
+ | and:and<cnjcoo> |
||
+ | an if:an if<cnjadv> |
||
+ | because:because<cnjadv> |
||
+ | because:because<cnjsub> |
||
+ | both:both<cnjcoo> |
||
+ | but:but<cnjcoo> |
||
+ | either:either<cnjadv> |
||
+ | however:however<cnjadv> |
||
+ | if:if<cnjadv> |
||
+ | if:if<cnjsub> |
||
+ | lest:lest<cnjadv> |
||
+ | neither:neither<cnjcoo> |
||
+ | nor:nor<cnjcoo> |
||
+ | or:or<cnjcoo> |
||
+ | since:since<cnjadv> |
||
+ | than:than<cnjadv> |
||
+ | than:than<cnjsub> |
||
+ | that:that<cnjsub> |
||
+ | then:then<cnjadv> |
||
+ | though:though<cnjadv> |
||
+ | til:til<cnjadv> |
||
+ | till:till<cnjadv> |
||
+ | unless:unless<cnjadv> |
||
+ | until:until<cnjadv> |
||
+ | unto:unto<cnjadv> |
||
+ | what:what<cnjsub> |
||
+ | whenas:whenas<cnjadv> |
||
+ | whence:whence<cnjadv> |
||
+ | when:when<cnjadv> |
||
+ | wherealong:wherealong<cnjadv> |
||
+ | whereas:whereas<cnjadv> |
||
+ | whereat:whereat<cnjadv> |
||
+ | wherefore:wherefore<cnjadv> |
||
+ | whereinbefore:whereinbefore<cnjadv> |
||
+ | wherein:wherein<cnjadv> |
||
+ | whereof:whereof<cnjadv> |
||
+ | whereout:whereout<cnjadv> |
||
+ | whereover:whereover<cnjadv> |
||
+ | wheresoever:wheresoever<cnjadv> |
||
+ | whether:whether<cnjadv> |
||
+ | which:which<cnjsub> |
||
+ | while:while<cnjadv> |
||
+ | whilst:whilst<cnjadv></nowiki> |
||
− | == Conjunctions (<span style="font-variant: small-caps">Conj</span>== |
||
+ | == Interjections (<span style="font-variant: small-caps">Prt</span>)== |
||
− | <nowiki>> and |
||
− | and and<cnjcoo> 0,000000 |
||
− | > unless |
||
− | unless unless<cnjsub> 0,000000</nowiki> |
||
− | No cnjadv? |
||
− | |||
− | == Interjections (<span style="font-variant: small-caps">Prt</span>== |
||
− | |||
− | <nowiki>> crappy |
||
− | crappy crappy<ij> 0,000000 |
||
− | |||
− | > hi |
||
− | hi hi<ij> 0,000000</nowiki> |
||
== Punctuations (Google pos: .)== |
== Punctuations (Google pos: .)== |
||
+ | These are more or less the same everywhere, apart from directionality and some orthographic variation. |
||
− | <nowiki>> . |
||
− | . .<sent> 0,000000 |
||
− | |||
− | > " |
||
− | " "<lquot> 0,000000 |
||
− | " "<rquot> 0,000000 |
||
− | " "<sent> 0,000000 |
||
− | |||
− | > ) |
||
− | ) )<rpar> 0,000000 |
||
− | |||
− | > ( |
||
− | ( (<lpar> 0,000000 |
||
− | |||
− | > , |
||
− | , ,<cm> 0,000000 |
||
+ | <nowiki>':'<apos> |
||
− | > - |
||
+ | ,:,<cm> |
||
− | - -<guio> 0,000000</nowiki> |
||
+ | -:-<guio> |
||
+ | --:–<guio> |
||
+ | –:-<guio> |
||
+ | —:—<guio> |
||
+ | (:(<lpar> |
||
+ | [:[<lpar> |
||
+ | ":"<lquot> |
||
+ | “:“<lquot> |
||
+ | «:«<lquot> |
||
+ | »:«<lquot> |
||
+ | ):)<rpar> |
||
+ | ]:]<rpar> |
||
+ | ":"<rquot> |
||
+ | ”:”<rquot> |
||
+ | »:»<rquot> |
||
+ | (:(<lpar> |
||
+ | ):)<rpar> |
||
+ | _:_<sent> |
||
+ | :::<sent> |
||
+ | ;:;<sent> |
||
+ | !:!<sent> |
||
+ | ?:?<sent> |
||
+ | .:.<sent> |
||
+ | #:#<sent> |
||
+ | %:%<sent></nowiki> |
Revision as of 02:14, 8 August 2014
RFC for English tags, eh?
Verbs (google pos: Verb)
Regular English verbs have four forms: _accept_, _accepts_, _accepted_, _accepting_. Some irregular verbs have five: _forget_, _forgets_, _forgot_, _forgotten_, _forgetting_. The verb _be_ has eight: _be_, _am_, _are_, _is_, _was_, _were_, _been_, _being_.
The tags we are using to classify English verbs are:
* vblex: lexical (ordinary) verbs, regular or irregular
* vaux: auxiliary verbs, which take a verb complement
* vbser: verb _be_
* vbdo: verb _do_
* vbhaver: verb _have_
The morphs that follow the stem (or their absence) are classified with:
* inf: infinitive (as in: to _do_, to _walk_)
* pri: present indicative (as in: I _do_, he _walks_)
* prs: present subjunctive
* past: common past (as in: I _did_, he _walked_)
* pis: imperfect subjunctive
* pp: past participle (as in: I've _done_, he has _walked_)
* pprs: present participle
* ger: gerund
* subs: substantive
and potentially
* +not.adv.neg: (as in _can't_, _didn't_)
In future likely:
* transitivity
The tag sequences are as follows:
Regular verbs:
walk:walk<vblex><inf>
walk:walk<vblex><pri>
walk:walk<vblex><prs>
walk:walk<vblex><imp>
walks:walk<vblex><pri><p3><sg>
walked:walk<vblex><pis>
walked:walk<vblex><past>
walked:walk<vblex><pp>
walking:walk<vblex><subs>
walking:walk<vblex><pprs>
walking:walk<vblex><ger>
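Each entry above pairs a surface form with a lemma and a tag sequence. A minimal Python sketch of reading one such entry, assuming the `surface:lemma<tag>...` layout used on this page; the names `Entry` and `parse_entry` are made up for illustration and do not come from any existing tool:

```python
import re
from typing import NamedTuple, Tuple

# Matches one <tag> and captures the name between the angle brackets.
TAG_RE = re.compile(r"<([^<>]+)>")


class Entry(NamedTuple):
    surface: str
    lemma: str
    tags: Tuple[str, ...]


def parse_entry(line: str) -> Entry:
    """Split 'surface:lemma<t1><t2>' into surface, lemma and tags.

    Assumes the surface form contains no colon, which holds for the
    verb examples but not for every entry on this page.
    """
    surface, sep, analysis = line.strip().partition(":")
    if not sep:
        raise ValueError(f"no ':' separator in {line!r}")
    lemma = analysis.split("<", 1)[0]
    return Entry(surface, lemma, tuple(TAG_RE.findall(analysis)))


if __name__ == "__main__":
    print(parse_entry("walks:walk<vblex><pri><p3><sg>"))
```

Keeping the tags as an ordered tuple preserves the sequence, which matters here because the word class tag always comes first.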
Irregulars:
forget:forget<vblex><inf>
forget:forget<vblex><pri>
forgets:forget<vblex><pri><p3><sg>
forgot:forget<vblex><past>
forgotten:forget<vblex><pp>
forgetting:forget<vblex><ger>
Auxiliaries:
can:can<vaux><pri>
could:can<vaux><past>
can't:can<vaux><pri>+not<adv>
cannot:can<vaux><pri>+not<adv>
couldn't:can<vaux><past>+not<adv>
may:may<vaux><pri>
may:may<vaux><past>
might:might<vaux><pri>
might:might<vaux><past>
must:must<vaux><pri>
must:must<vaux><past>
ought:ought<vaux><pri>
ought:ought<vaux><past>
shall:shall<vaux><pri>
should:shall<vaux><past>
shan't:shall<vaux><pri>+not<adv>
shouldn't:shall<vaux><past>+not<adv>
will:will<vaux><pri>
would:will<vaux><past>
won't:will<vaux><pri>+not<adv>
wouldn't:will<vaux><past>+not<adv>
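The negated auxiliaries above carry two analyses joined by `+`: the verb and the adverb _not_. A small Python sketch of splitting the analysis side into its parts, assuming `+` only ever joins sub-analyses and never occurs inside a lemma; `split_analysis` is a hypothetical helper name:

```python
import re
from typing import List, Tuple

# Matches one <tag> and captures the name between the angle brackets.
TAG_RE = re.compile(r"<([^<>]+)>")


def split_analysis(analysis: str) -> List[Tuple[str, Tuple[str, ...]]]:
    """Split 'can<vaux><pri>+not<adv>' into (lemma, tags) pairs.

    Assumption: '+' is used only as the joiner between sub-analyses.
    """
    parts = []
    for chunk in analysis.split("+"):
        lemma = chunk.split("<", 1)[0]
        parts.append((lemma, tuple(TAG_RE.findall(chunk))))
    return parts


if __name__ == "__main__":
    # Analysis side of the entry for "can't".
    for lemma, tags in split_analysis("can<vaux><pri>+not<adv>"):
        print(lemma, tags)
```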
Verb have:
have:have<vbhaver><inf>
have:have<vbhaver><pri>
has:have<vbhaver><pri><p3><sg>
had:have<vbhaver><past>
having:have<vbhaver><ger>
Verb do:
do:do<vbdo><inf>
do:do<vbdo><imp>
do:do<vbdo><pri>
does:do<vbdo><pri><p3><sg>
did:do<vbdo><past>
did:do<vbdo><pis>
doing:do<vbdo><subs>
doing:do<vbdo><pprs>
doing:do<vbdo><ger>
done:do<vbdo><pp>
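Several surface forms in these verb lists have more than one analysis (for example _did_ and _doing_). A short Python sketch that groups the analyses per surface form, using the _do_ entries from this page as input; `readings_by_surface` and `DO_ENTRIES` are illustrative names only:

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# The entries for the verb "do", copied from the list on this page.
DO_ENTRIES = [
    "do:do<vbdo><inf>",
    "do:do<vbdo><imp>",
    "do:do<vbdo><pri>",
    "does:do<vbdo><pri><p3><sg>",
    "did:do<vbdo><past>",
    "did:do<vbdo><pis>",
    "doing:do<vbdo><subs>",
    "doing:do<vbdo><pprs>",
    "doing:do<vbdo><ger>",
    "done:do<vbdo><pp>",
]


def readings_by_surface(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each surface form to its analyses, in input order.

    Assumes the first ':' separates the surface form from the analysis.
    """
    table: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        surface, _, analysis = line.strip().partition(":")
        table[surface].append(analysis)
    return dict(table)


if __name__ == "__main__":
    for surface, analyses in readings_by_surface(DO_ENTRIES).items():
        print(surface, len(analyses), analyses)
```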
Nouns (google pos: Noun)
Nouns commonly have two forms, singular and plural, each with a possessive: _beer_, _beers_, _beer's_, _beers'_. Some don't: ?
The tags used to classify nouns are:
* n: regular noun
* np: proper noun
* m: male
* f: female
* mf: both female and male
* nt: neuter, neither female nor male
* top: place
* ant: human
And also:
* cnt
* unc
The suffixes are:
* sg: singular
* pl: plural
* gen: genitive (possessive)
Regular nouns go like:
beer:beer<n><sg>
beers:beer<n><pl>
beer's:beer<n><sg><gen>
beers':beer<n><pl><gen>
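The four regular noun entries follow a fixed pattern, so they can be generated from the lemma. A minimal Python sketch, assuming the plural is simply the lemma plus `s` (no `-es`, `-ies` or irregular plurals); `regular_noun_entries` is a hypothetical name:

```python
from typing import List


def regular_noun_entries(lemma: str) -> List[str]:
    """Build the four entries for a noun whose plural is lemma + 's'.

    Assumption: no spelling changes are needed; nouns such as 'box'
    or 'city' would need extra rules that this sketch does not have.
    """
    plural = lemma + "s"
    return [
        f"{lemma}:{lemma}<n><sg>",
        f"{plural}:{lemma}<n><pl>",
        f"{lemma}'s:{lemma}<n><sg><gen>",
        f"{plural}':{lemma}<n><pl><gen>",
    ]


if __name__ == "__main__":
    print("\n".join(regular_noun_entries("beer")))
```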
Proper nouns:
Aaron:Aaron<np><ant><m><sg>
Aarons:Aaron<np><ant><m><pl>
Aarons':Aaron<np><ant><m><pl><gen>
Aaron's:Aaron<np><ant><m><sg><gen>
Amsterdam:Amsterdam<np><top><sg>
Amsterdams:Amsterdam<np><top><pl>
Amsterdam's:Amsterdam<np><top><sg><gen>
Amsterdams':Amsterdam<np><top><pl><gen>
Adjectives (google pos: Adj)
Most adjectives do not inflect, like _hairy_, but some have three forms, like: _small_, _smaller_, _smallest_. The tags used for classifying are:
* adj: adjective, used alone for non-inflecting ones
* sint: added for those with three forms
The suffixes are marked with:
* comp: for comparative
* sup: for superlative
Like so:
small:small<adj><sint>
smaller:small<adj><sint><comp>
smallest:small<adj><sint><sup>
hairy:hairy<adj>
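The three-form adjectives can be expanded the same way as regular nouns. A rough Python sketch, assuming the comparative and superlative are formed by appending `er` and `est` with no spelling change; `sint_adjective_entries` is an illustrative name:

```python
from typing import List


def sint_adjective_entries(lemma: str) -> List[str]:
    """Build the positive, comparative and superlative entries.

    Assumption: plain suffixation only. Adjectives that double a
    consonant or change 'y' to 'i' are not handled by this sketch.
    """
    return [
        f"{lemma}:{lemma}<adj><sint>",
        f"{lemma}er:{lemma}<adj><sint><comp>",
        f"{lemma}est:{lemma}<adj><sint><sup>",
    ]


if __name__ == "__main__":
    print("\n".join(sint_adjective_entries("small")))
```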
Adverbs (google pos: Adv)
Adverbs are adverbs. They use the tag adv:
aboard:aboard<adv>
drunk:drunk<adv>
no:no<adv><neg>
where:where<adv><itg>
when:when<adv><itg>
why:why<adv><itg>
Some have other tags too, such as neg and itg.
Pronouns (google pos: Pron)
There are a lot of different pronouns.
anybody:anybody<prn><sg>
anyone:anyone<prn><sg>
anything:anything<prn><sg>
both:both<prn><pl>
everybody:everybody<prn><sg>
everyone:everyone<prn><sg>
everything:everything<prn><sg>
few:few<prn><pl>
he:he<prn><pers><p3><m><sg>
his:he<prn><pers><p3><m><sg><poss>
his:he<prn><pers><p3><m><sg><gen>
him:he<prn><pers><p3><m><sg><acc>
herself:herself<prn><ref><p3><f><sg>
himself:himself<prn><ref><p3><m><sg>
hisself:himself<prn><ref><p3><m><sg>
I:I<prn><pers><p1><mf><sg>
me:I<prn><pers><p1><mf><sg><acc>
my:I<prn><pers><p1><mf><sg><gen>
mine:I<prn><pers><p1><mf><sg><poss>
it:it<prn><dem><sg>
its:it<prn><dem><sg><poss>
itself:itself<prn><ref><p3><nt><sg>
myself:myself<prn><ref><p1><mf><sg>
oneself:oneself<prn><ref><p1><mf><sg>
oneself:oneself<prn><ref><p3><mf><sg>
one's self:oneself<prn><ref><p1><mf><sg>
one's self:oneself<prn><ref><p3><mf><sg>
ourself:ourselves<prn><ref><p1><mf><pl>
ourselves:ourselves<prn><ref><p1><mf><pl>
several:several<prn><sg>
she:she<prn><pers><p3><f><sg>
hers:she<prn><pers><p3><f><sg><poss>
her:she<prn><pers><p3><f><sg><gen>
her:she<prn><pers><p3><f><sg><acc>
something:something<prn><sg>
that:that<prn><rel>
that:that<prn><sg>
those:that<prn><pl>
theirselves:themselves<prn><ref><p3><mf><pl>
themself:themself<prn><ref><p3><mf><sg>
themselves:themselves<prn><ref><p3><mf><sg>
themselves:themselves<prn><ref><p3><mf><pl>
they:they<prn><pers><p3><mf><pl>
their:they<prn><pers><p3><mf><pl><gen>
theirs:they<prn><pers><p3><mf><pl><poss>
them:they<prn><pers><p3><mf><pl><acc>
this:this<prn><sg>
these:this<prn><pl>
thyself:thyself<prn><ref><p2><mf><sg>
we:we<prn><pers><p1><mf><pl>
us:we<prn><pers><p1><mf><pl><acc>
our:we<prn><pers><p1><mf><pl><gen>
ours:we<prn><pers><p1><mf><pl><poss>
which:which<prn><itg>
which:which<prn><rel>
who:who<prn><itg>
whose:who<prn><poss>
whom:who<prn><itg><acc>
you:you<prn><pers><p2><mf><sp>
yours:you<prn><pers><p2><mf><sp><poss>
your:you<prn><pers><p2><mf><sp><gen>
you:you<prn><pers><p2><mf><sp><acc>
yourself:yourself<prn><ref><p2><mf><sg>
yourselves:yourselves<prn><ref><p2><mf><pl>
Determiners (Det)
There are a couple of determiners:
a:>:a<det><ind><sg>
an:>:a<det><ind><sg>
~a:<:a<det><ind><sg>
both:both<det><qnt>
many:many<det><qnt>
no:no<det><ind><neg>
several:several<det><dem>
that:th<det><dem><sg>
those:th<det><dem><pl>
the:the<det><def><sp>
this:th<det><dem><sg>
these:th<det><dem><pl>
which:which<det><itg><sp>
Prepositions (Adp)
above:above<pr>
according to:according to<pr>
across:across<pr>
after:after<pr>
against:against<pr>
along:along<pr>
alongside:alongside<pr>
along with:along with<pr>
amid:amid<pr>
among:among<pr>
amongst:amongst<pr>
around:around<pr>
as:as<pr>
as of:as of<pr>
at:at<pr>
atop:atop<pr>
because of:because of<pr>
before:before<pr>
behind:behind<pr>
below:below<pr>
between:between<pr>
but:but<pr>
by:by<pr>
by means of:by means of<pr>
despite:despite<pr>
due to:due to<pr>
during:during<pr>
except for:except for<pr>
except:except<pr>
for:for<pr>
from:from<pr>
in contrast to:in contrast to<pr>
in front of:in front of<pr>
in:in<pr>
in order to:in order to<pr>
inside:inside<pr>
into:into<pr>
near:near<pr>
off:off<pr>
of:of<pr>
on:on<pr>
onto:onto<pr>
out:out<pr>
out of:out of<pr>
outside:outside<pr>
over:over<pr>
per:per<pr>
prior to:prior to<pr>
since:since<pr>
through:through<pr>
throughout:throughout<pr>
to:to<pr>
towards:towards<pr>
under:under<pr>
until:until<pr>
up:up<pr>
upon:upon<pr>
up to:up to<pr>
via:via<pr>
within:within<pr>
with:with<pr>
without:without<pr>
Numerals (Num)
There are quite a few number words:
one:one<num><sg>
one's:one<num><sg><gen>
two:two<num><pl>
two's:two<num><pl><gen>
three:three<num><pl>
three's:three<num><pl><gen>
first:first<num><pl>
first's:first<num><pl><gen>
second:second<num><pl>
second's:second<num><pl><gen>
third:third<num><pl>
third's:third<num><pl><gen>
Conjunctions (Conj)
Some classes for conjunctions:
albeit:albeit<cnjadv>
albeit:albeit<cnjsub>
although:although<cnjadv>
and:and<cnjcoo>
an if:an if<cnjadv>
because:because<cnjadv>
because:because<cnjsub>
both:both<cnjcoo>
but:but<cnjcoo>
either:either<cnjadv>
however:however<cnjadv>
if:if<cnjadv>
if:if<cnjsub>
lest:lest<cnjadv>
neither:neither<cnjcoo>
nor:nor<cnjcoo>
or:or<cnjcoo>
since:since<cnjadv>
than:than<cnjadv>
than:than<cnjsub>
that:that<cnjsub>
then:then<cnjadv>
though:though<cnjadv>
til:til<cnjadv>
till:till<cnjadv>
unless:unless<cnjadv>
until:until<cnjadv>
unto:unto<cnjadv>
what:what<cnjsub>
whenas:whenas<cnjadv>
whence:whence<cnjadv>
when:when<cnjadv>
wherealong:wherealong<cnjadv>
whereas:whereas<cnjadv>
whereat:whereat<cnjadv>
wherefore:wherefore<cnjadv>
whereinbefore:whereinbefore<cnjadv>
wherein:wherein<cnjadv>
whereof:whereof<cnjadv>
whereout:whereout<cnjadv>
whereover:whereover<cnjadv>
wheresoever:wheresoever<cnjadv>
whether:whether<cnjadv>
which:which<cnjsub>
while:while<cnjadv>
whilst:whilst<cnjadv>
Interjections (Prt)
Punctuations (Google pos: .)
These are more or less the same everywhere, apart from directionality and some orthographic variation.
':'<apos>
,:,<cm>
-:-<guio>
--:–<guio>
–:-<guio>
—:—<guio>
(:(<lpar>
[:[<lpar>
":"<lquot>
“:“<lquot>
«:«<lquot>
»:«<lquot>
):)<rpar>
]:]<rpar>
":"<rquot>
”:”<rquot>
»:»<rquot>
(:(<lpar>
):)<rpar>
_:_<sent>
:::<sent>
;:;<sent>
!:!<sent>
?:?<sent>
.:.<sent>
#:#<sent>
%:%<sent>
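The punctuation entries show why splitting on the first colon is not always enough: in `:::<sent>` the surface form and the lemma are themselves colons. A Python sketch of one way to cope with that, assuming exactly one colon in the entry has non-empty text on both sides; `split_surface_lemma` is a hypothetical helper, and the three-field determiner entries with `:>:` or `:<:` are deliberately rejected rather than guessed at:

```python
import re
from typing import Tuple

# Everything up to the trailing run of <tag>s, then the tags themselves.
ENTRY_RE = re.compile(r"(.*?)((?:<[^<>]+>)+)")


def split_surface_lemma(entry: str) -> Tuple[str, str, str]:
    """Return (surface, lemma, tags) for an entry such as ':::<sent>'.

    Assumption: exactly one ':' has non-empty text on both sides.
    Anything else is reported as ambiguous instead of being guessed.
    """
    match = ENTRY_RE.fullmatch(entry.strip())
    if match is None:
        raise ValueError(f"no trailing tags in {entry!r}")
    body, tags = match.group(1), match.group(2)
    # Candidate separators: colons that are neither first nor last.
    candidates = [i for i, ch in enumerate(body)
                  if ch == ":" and 0 < i < len(body) - 1]
    if len(candidates) != 1:
        raise ValueError(f"ambiguous or missing separator in {entry!r}")
    i = candidates[0]
    return body[:i], body[i + 1:], tags


if __name__ == "__main__":
    for sample in [":::<sent>", "(:(<lpar>", ",:,<cm>"]:
        print(split_surface_lemma(sample))
```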