Constraint Grammar


Revision as of 16:31, 29 May 2010

Constraint Grammar (CG) is a rule-based tool for disambiguating text that has more than one possible analysis per word, for example in POS tagging. Free constraint grammars developed outside the Apertium project exist for Norwegian (the Oslo-Bergen tagger), the Sámi languages (from Giellatekno) and Faroese (also from Giellatekno).

Terminology

See also: Apertium stream format
  • cohort: a wordform together with all of its readings.
    Apertium equivalent: ^words/word<n><pl>/word<vblex><pres><p3><sg>$
  • baseform: the lemma of a word.
  • reading: a single analysis of a word.
    Apertium equivalent: ^word<n><pl>$
  • wordform: a surface form of a word.
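
To make the terms above concrete, the following Python sketch splits one `^...$` unit from the Apertium equivalents into its wordform and its readings (each reading being a baseform plus a list of tags). The function name `parse_unit` is hypothetical, and the sketch assumes a simplified stream format: it ignores escaped characters, multiword units and anything outside the `^...$` delimiters.

```python
import re

def parse_unit(unit: str):
    """Split one ^wordform/reading/reading$ unit into (wordform, readings).

    Simplifying assumption: no escaped '/', '<' or '$' characters.
    """
    if not (unit.startswith("^") and unit.endswith("$")):
        raise ValueError("expected a unit of the form ^...$")
    # The first field is the wordform; every further field is one reading.
    wordform, *analyses = unit[1:-1].split("/")
    readings = []
    for analysis in analyses:
        # The baseform is everything before the first tag.
        baseform = analysis.split("<", 1)[0]
        # Tags are the names enclosed in angle brackets.
        tags = re.findall(r"<([^>]+)>", analysis)
        readings.append((baseform, tags))
    return wordform, readings

wordform, readings = parse_unit("^words/word<n><pl>/word<vblex><pres><p3><sg>$")
print(wordform)   # words
for baseform, tags in readings:
    print(baseform, tags)
```

Here the wordform is "words", and both readings share the baseform "word": one is a plural noun, the other a third-person singular present verb. Choosing between such readings is the job of the CG rules.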

Note on parentheses

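Parentheses, and the distinction between tags and lists/sets, seem to be the main source of confusion for people learning CG. In a LIST, tags grouped inside parentheses must all be present on the same reading, while tags listed side by side are alternatives. Given the morphological tags tag1 and tag2, we can write rules like this: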

LIST set1 = tag1 ;
LIST set2 = (tag1 tag2) ; # matches a word with both tag1 and tag2
LIST set3 = tag1 tag2 ;   # matches a word with tag1 or tag2
LIST word = "hello" ;

SELECT:rule1a (tag1) (1 word) ;
SELECT:rule1b  set1  (1 word) ;   # equivalent to rule1a

SELECT:rule2a (tag1 tag2) (1 word) ;
SELECT:rule2b  set2       (1 word) ;   # equivalent to rule2a

# In a rule, a bare name refers to a set, so a tag used inline needs parentheses
SELECT:rule3a (tag1) (1 word) ;
SELECT:rule3b (tag2) (1 word) ;
SELECT:rule3c  set3  (1 word) ;   # equivalent to rule3a and rule3b combined

SELECT:rule1c  set1  (1 ("hello")) ; # equivalent to rule1a (or rule1b)
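
The AND/OR distinction between the three lists can be modelled with a small Python sketch. This is only a toy model of the matching semantics described above, not how a CG compiler is implemented: each LIST is represented as a list of entries, each entry is a tuple of tags, and the helper `matches` is a hypothetical name introduced for illustration.

```python
def matches(tag_list, reading_tags):
    """Return True if any entry of the list has ALL of its tags on the reading.

    Entries in the list are alternatives (OR); tags inside one entry
    must all be present together (AND).
    """
    reading = set(reading_tags)
    return any(set(entry) <= reading for entry in tag_list)

# Toy counterparts of the LIST definitions above
set1 = [("tag1",)]             # LIST set1 = tag1 ;
set2 = [("tag1", "tag2")]      # LIST set2 = (tag1 tag2) ;
set3 = [("tag1",), ("tag2",)]  # LIST set3 = tag1 tag2 ;

only_tag2 = ["tag2"]           # a reading carrying just tag2
both = ["tag1", "tag2"]        # a reading carrying both tags

print(matches(set1, only_tag2))  # False: tag1 is missing
print(matches(set2, only_tag2))  # False: set2 needs both tags
print(matches(set3, only_tag2))  # True: tag2 alone is enough
print(matches(set2, both))       # True: both tags are present
```

The reading with only tag2 is matched by set3 but not by set2, which is exactly the difference the parentheses make in the LIST definitions.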

Languages using CG in Apertium

See also

External links