Difference between revisions of "Constraint Grammar"
Revision as of 21:23, 8 May 2009
Constraint Grammar is a tool for part-of-speech (POS) tagging ambiguous text. Free constraint grammars developed outside the Apertium project exist for Norwegian (the Oslo-Bergen tagger), the Sámi languages (from Giellatekno) and Faroese (also from Giellatekno).
Terminology
- See also: Apertium stream format
- cohort: a surface form of a word together with all of its analyses (possible lexical units); in Apertium terms, an ambiguous lexical unit.
  - Apertium equivalent: ^words/word<n><pl>/word<vblex><pres><p3><sg>$
- baseform: the lemma of a word.
- reading: a single analysis of a word.
  - Apertium equivalent: ^word<n><pl>$
- wordform: a surface form of a word.
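
To make the mapping between the terms above and the Apertium stream format concrete, here is a minimal Python sketch that splits a lexical unit into a wordform and its readings. The names `Reading`, `Cohort`, `parse_reading` and `parse_cohort` are hypothetical and belong to neither Apertium nor any Constraint Grammar implementation. The sketch assumes the input contains no escaped characters, and that a unit without a `/` separator is a single reading with no wordform, as in the second example above.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Matches one tag such as <n> or <vblex> and captures the tag name.
TAG_RE = re.compile(r"<([^<>]+)>")


@dataclass
class Reading:
    """A single analysis: a baseform (lemma) plus its tags."""
    baseform: str
    tags: List[str]


@dataclass
class Cohort:
    """A wordform (surface form) together with all of its readings."""
    wordform: Optional[str]
    readings: List[Reading]


def parse_reading(text: str) -> Reading:
    """Split 'word<n><pl>' into the baseform 'word' and the tags ['n', 'pl']."""
    first_tag = text.find("<")
    if first_tag == -1:
        return Reading(baseform=text, tags=[])
    return Reading(baseform=text[:first_tag], tags=TAG_RE.findall(text[first_tag:]))


def parse_cohort(unit: str) -> Cohort:
    """Parse one '^...$' lexical unit (simplified: no escape handling)."""
    if not (unit.startswith("^") and unit.endswith("$")):
        raise ValueError("a lexical unit must start with '^' and end with '$'")
    parts = unit[1:-1].split("/")
    if len(parts) == 1:
        # Assumption: no '/' means the unit holds one reading and no wordform.
        return Cohort(wordform=None, readings=[parse_reading(parts[0])])
    wordform, *analyses = parts
    return Cohort(wordform=wordform, readings=[parse_reading(a) for a in analyses])


cohort = parse_cohort("^words/word<n><pl>/word<vblex><pres><p3><sg>$")
print(cohort.wordform)
for reading in cohort.readings:
    print(reading.baseform, reading.tags)
```

In this sketch the example cohort yields the wordform `words` and two readings that share the baseform `word`: one tagged as a plural noun and one as a third-person singular present-tense verb. This is the kind of ambiguity a constraint grammar is meant to resolve.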
See also
External links