Constructing a TSX file with a Constraint Grammar

From Apertium
Revision as of 11:10, 2 March 2008 by Francis Tyers

Constraint Grammar

Terminology

  • cohort — set of analyses for a given surface form.

Labels

Coarse tag "labels" in Constraint Grammar (CG) are specified either as a LIST or as a SET. Sometimes, however, these are not complete sets, so they may need to be combined.

For example:

LIST A-N-CC = A N CC ;
LIST A-pos = (A Pos) ;
LIST %etter/fram/opp% = ("etter" Pr) ("fram" Pr) ("frem" Pr) ("opp" Pr) ;

These are three lists, which can be expressed in TSX format as follows:

  <def-label name="A-N-CC">
    <tags-item tags="adj.*"/>
    <tags-item tags="n.*"/>
    <tags-item tags="cnjcoo"/>
  </def-label>
  <def-label name="A-pos">
    <tags-item tags="adj.pos.*"/>
  </def-label>
  <def-label name="%etter/fram/opp%">
    <tags-item lemma="etter" tags="pr"/>
    <tags-item lemma="fram" tags="pr"/>
    <tags-item lemma="frem" tags="pr"/>
    <tags-item lemma="opp" tags="pr"/>
  </def-label>

And so on for the remaining lists. Note that converting every list this way may cause some problems, so it might be best to start with only the labels for ambiguous tags.
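The LIST-to-label mapping shown above can be sketched as a small script. This is only an illustration under stated assumptions: the tag table `TAG_MAP`, the `OPEN_CLASSES` set that decides when a pattern gets a trailing `.*`, and the function names are all hypothetical and were read off the three examples on this page, not taken from any Apertium or CG tool. It handles only simple LIST lines (bare tags, parenthesised groups, and quoted lemmas without spaces).

```python
import re
from xml.sax.saxutils import quoteattr

# Assumed CG-to-Apertium tag mapping, read off the examples on this page;
# a real grammar would need its own table.
TAG_MAP = {"A": "adj", "N": "n", "CC": "cnjcoo", "Pr": "pr", "Pos": "pos"}
# Assumption: these classes carry further tags, so their patterns end in ".*".
OPEN_CLASSES = {"adj", "n"}


def parse_list(line):
    """Split 'LIST name = items ;' into (name, [[part, ...], ...])."""
    m = re.match(r"\s*LIST\s+(\S+)\s*=\s*(.*?)\s*;\s*$", line)
    if m is None:
        raise ValueError("not a LIST definition: " + line)
    name, body = m.groups()
    # An item is either a parenthesised group or a single bare tag.
    items = re.findall(r"\(([^)]*)\)|(\S+)", body)
    return name, [(group or bare).split() for group, bare in items]


def tags_item(parts):
    """Render one LIST item as a <tags-item/> element."""
    lemma = None
    if parts[0].startswith('"'):
        # A quoted first part is a lemma, e.g. ("etter" Pr).
        lemma = parts[0].strip('"')
        parts = parts[1:]
    tags = [TAG_MAP[p] for p in parts]
    pattern = ".".join(tags)
    if tags[0] in OPEN_CLASSES:
        pattern += ".*"
    attrs = ""
    if lemma is not None:
        attrs += " lemma=" + quoteattr(lemma)
    attrs += " tags=" + quoteattr(pattern)
    return "<tags-item" + attrs + "/>"


def to_def_label(line):
    """Convert one CG LIST line into a TSX <def-label> block."""
    name, items = parse_list(line)
    rows = ["  " + tags_item(parts) for parts in items]
    header = "<def-label name=" + quoteattr(name) + ">"
    return "\n".join([header] + rows + ["</def-label>"])


if __name__ == "__main__":
    print(to_def_label("LIST A-N-CC = A N CC ;"))
    print(to_def_label("LIST A-pos = (A Pos) ;"))
```

Keeping the tag table and the wildcard rule as explicit data makes the guesswork visible: any tag missing from `TAG_MAP` raises a `KeyError` instead of silently producing a wrong pattern.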

Constraints

Constraint Grammar uses a series of hand-written constraints in order to POS-tag ambiguous words.

Forbid rules

The CG operation analogous to a forbid rule is REMOVE.

Enforce rules

The CG operation analogous to an enforce rule is SELECT, which "selects a reading, if it contains a TARGETed tag. In practice, selection is equivalent to a removal of all other readings."
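The relationship between the two operations can be sketched in a few lines of Python. This is a toy model, not how a CG compiler works: a cohort is represented as a list of readings, each reading as a set of tags, and the contextual conditions of real rules are left out entirely. The function names and the example cohort are hypothetical. The sketch also assumes the usual CG convention that the last remaining reading of a cohort is never removed.

```python
def remove(cohort, target):
    """REMOVE: drop readings containing the target tag, but keep the
    cohort unchanged if that would leave no reading at all."""
    kept = [reading for reading in cohort if target not in reading]
    return kept if kept else cohort


def select(cohort, target):
    """SELECT: keep only readings containing the target tag; if no
    reading has it, the cohort is left unchanged."""
    kept = [reading for reading in cohort if target in reading]
    return kept if kept else cohort


# A hypothetical three-way ambiguous cohort: noun, verb, or adjective.
cohort = [
    frozenset({"N", "Sg"}),
    frozenset({"V", "Inf"}),
    frozenset({"A", "Pos"}),
]

selected = select(cohort, "N")
# Removing every other reading gives the same result as selecting N.
removed = remove(remove(cohort, "V"), "A")
assert selected == removed == [frozenset({"N", "Sg"})]
```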

# 2355
SELECT (N) IF
        (-1C N-gen)
        (NOT 1 A-N-CC)
;
  <enforce-after 

Prefer tags