Difference between revisions of "Constraint Grammar"
Revision as of 16:32, 29 May 2010
Constraint Grammar (CG) is a rule-based tool that can be used to disambiguate the part-of-speech analyses of ambiguous text. Free constraint grammars have been developed outside the Apertium project for Norwegian (the Oslo-Bergen tagger), the Sámi languages (from Giellatekno) and Faroese (also from Giellatekno).
Terminology
- See also: Apertium stream format
- cohort: a surface form of a word together with all of its analyses (possible lexical units); in other words, an ambiguous lexical unit.
  - Apertium equivalent:
    ^words/word<n><pl>/word<vblex><pres><p3><sg>$
- baseform: the lemma of a word.
- reading: a single analysis of a word.
  - Apertium equivalent:
    ^word<n><pl>$
- wordform: a surface form of a word.
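To make the terms above concrete, here is a minimal Python sketch that splits a cohort in Apertium stream format into its wordform and readings. The function name `parse_cohort` and the dictionary layout are illustrative assumptions, not part of Apertium or CG. The sketch also ignores escaped characters and multiword units, which the real stream format allows.

```python
import re

def parse_cohort(cohort):
    """Split '^wordform/reading1/reading2$' into (wordform, readings).

    Hypothetical helper for illustration only: it assumes no escaped
    '/', '^' or '$' characters inside the cohort.
    """
    text = cohort.strip()
    if not (text.startswith("^") and text.endswith("$")):
        raise ValueError("a cohort must start with '^' and end with '$'")
    # The first field is the surface form; every later field is one reading.
    wordform, *analyses = text[1:-1].split("/")
    readings = []
    for analysis in analyses:
        baseform = analysis.split("<", 1)[0]        # lemma before the first tag
        tags = re.findall(r"<([^>]+)>", analysis)   # tags in their original order
        readings.append({"baseform": baseform, "tags": tags})
    return wordform, readings

wordform, readings = parse_cohort("^words/word<n><pl>/word<vblex><pres><p3><sg>$")
print(wordform)
for reading in readings:
    print(reading["baseform"], reading["tags"])
```

A cohort with more than one reading is ambiguous. Disambiguation, in these terms, means reducing the list of readings, ideally to a single one.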
Note on parentheses
Parentheses, and the distinction between tags and lists/sets, seem to be the main point of confusion for people learning CG. Given the morphological tags tag1 and tag2, we can write rules like this:
LIST set1 = tag1 ;
LIST set2 = (tag1 tag2) ;  # matches a word with both tag1 and tag2
LIST set3 = tag1 tag2 ;    # matches a word with tag1 or tag2
LIST word = "hello" ;
SELECT:1a (tag1) (1 word) ;
SELECT:1b set1 (1 word) ;  # equivalent to 1a
SELECT:2a (tag1 tag2) (1 word) ;
SELECT:2b set2 (1 word) ;  # equivalent to 2a
SELECT:3a tag1 (1 word) ;
SELECT:3b tag2 (1 word) ;
SELECT:3c set3 (1 word) ;  # equivalent to 3a and 3b combined
SELECT:1c set1 (1 ("hello")) ;  # equivalent to 1a (or 1b)
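The "both" versus "or" distinction in the LIST definitions can be pictured with a small Python model. This is only a sketch of the matching logic described in the comments above, not how a CG compiler is implemented. Representing each LIST as a Python list of tuples, and the function name `matches`, are assumptions made for illustration.

```python
def matches(reading_tags, cg_list):
    """Return True if a reading matches a LIST.

    Each entry in cg_list is a tuple of tags. A parenthesised entry such as
    (tag1 tag2) becomes one tuple whose tags must ALL be present in the
    reading. Separate entries are alternatives: ANY one of them is enough.
    """
    tags = set(reading_tags)
    return any(set(entry) <= tags for entry in cg_list)

# Hypothetical encodings of the three LISTs from the rules above.
set1 = [("tag1",)]              # LIST set1 = tag1 ;
set2 = [("tag1", "tag2")]       # LIST set2 = (tag1 tag2) ;
set3 = [("tag1",), ("tag2",)]   # LIST set3 = tag1 tag2 ;

only_tag1 = ["tag1"]
both_tags = ["tag1", "tag2"]

print(matches(only_tag1, set2))  # False: set2 needs both tags
print(matches(only_tag1, set3))  # True: either tag is enough
print(matches(both_tags, set2))  # True
```

In this model the parentheses group tags into a single entry that must match as a whole, while whitespace between entries separates alternatives.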
Languages using CG in Apertium
See also
- Apertium and Constraint Grammar -- installation and use
- Introduksjon til føringsgrammatikk -- a HOWTO in Norwegian Bokmål