Subreadings in Constraint Grammar
Current situation
Typical input with sub-readings:
^foobar/foo+bar/fubar/flue+barge$
Right now, only the last sub-reading is used. In the above example, vislcg3 treats the input as if it were
^foobar/bar/fubar/barge$
This works well for compounds, where the material before the + is mostly inconsequential, but not so well for other multiword expressions. (Also, mapping tags are currently only put on the last sub-reading.)
- Wait, can't we just split on the + with pretransfer before sending this to cg-proc?
- No, because we first have to disambiguate between the readings, e.g. in ^foobar/foo+bar/fubar/flue+barge$ (what would that even look like if split? It wouldn't work.)
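The current behaviour described above can be sketched in a few lines of Python. This is only an illustration of the effect, not how vislcg3 is implemented; the function names split_subreadings and keep_last_subreading are hypothetical, and stream-format escapes are ignored. The one subtlety is that a + inside a tag (e.g. <@+FMAINV>) must not be treated as a sub-reading separator.

```python
def split_subreadings(reading: str) -> list[str]:
    """Split one reading on '+' signs that occur outside <tag> brackets."""
    parts, current, in_tag = [], [], False
    for ch in reading:
        if ch == "<":
            in_tag = True
        elif ch == ">":
            in_tag = False
        if ch == "+" and not in_tag:
            # A '+' outside a tag closes the current sub-reading.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def keep_last_subreading(cohort: str) -> str:
    """Rewrite ^surface/r1/r2$ so that each reading keeps only its last sub-reading."""
    surface, *readings = cohort.strip("^$").split("/")
    reduced = [split_subreadings(r)[-1] for r in readings]
    return "^" + "/".join([surface] + reduced) + "$"


print(keep_last_subreading("^foobar/foo+bar/fubar/flue+barge$"))
# prints ^foobar/bar/fubar/barge$
```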
What we need
- We may need to refer to an earlier sub-reading in order to disambiguate
- We may want to put a mapping tag on an earlier sub-reading
- And of course we want to be able to refer to the last as in the current situation
Referring to the final sub-reading
Northern Sámi postpositions take genitive.
Input fragment:
^soahtefámu/soahti<N><Sg><Nom><Cmp>+fápmu<N><Sg><Acc>/soahti<N><Sg><Nom><Cmp>+fápmu<N><Sg><Gen>$ ^vuostá/vuostá<Po>/vuostá<Pr>/vuostá<N><Sg><Nom>$
Correct output:
^soahtefámu/soahti<N><Sg><Nom><Cmp>+fápmu<N><Sg><Gen><@→P>$ war.power.GEN ^vuostá/vuostá<Po><@←ADVL>$ against.PO
If the input noun were unambiguously nominative, the Po reading should not be selected, so we might have a rule somewhere with
REMOVE Po IF (-1 (Nom)) ;
but if this matched non-final sub-readings, we would get the wrong tagging here. Currently, non-final sub-readings are ignored, so the sme-nob CGs work fine (as do the nn-nb ones for compounding there).
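To see why the scope of the context test matters here, the following Python sketch mimics a non-careful contextual test such as (-1 (Nom)), which matches if any reading of the cohort carries the tag. It is a toy model under stated assumptions, not vislcg3 code: context_matches, tags and split_subreadings are hypothetical helper names, and the final_only flag stands in for the choice between looking only at the last sub-reading and looking at all of them.

```python
import re


def split_subreadings(reading: str) -> list[str]:
    """Split one reading on '+' signs that occur outside <tag> brackets."""
    parts, current, in_tag = [], [], False
    for ch in reading:
        if ch == "<":
            in_tag = True
        elif ch == ">":
            in_tag = False
        if ch == "+" and not in_tag:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def tags(subreading: str) -> set[str]:
    """Return the set of tag names, e.g. {'N', 'Sg', 'Nom'}."""
    return set(re.findall(r"<([^>]+)>", subreading))


def context_matches(tag: str, readings: list[str], final_only: bool) -> bool:
    """True if any reading carries `tag` in the sub-readings under consideration."""
    for reading in readings:
        subs = split_subreadings(reading)
        scope = subs[-1:] if final_only else subs
        if any(tag in tags(s) for s in scope):
            return True
    return False


# Readings of ^soahtefámu/...$ from the input fragment above.
readings = [
    "soahti<N><Sg><Nom><Cmp>+fápmu<N><Sg><Acc>",
    "soahti<N><Sg><Nom><Cmp>+fápmu<N><Sg><Gen>",
]

print(context_matches("Nom", readings, final_only=True))   # False: Po is kept
print(context_matches("Nom", readings, final_only=False))  # True: Po would wrongly be removed
```

With final_only=False the <Nom> on the compound's first part soahti triggers the rule, which is exactly the wrong tagging the text warns about.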
Referring to non-final sub-readings
Input:
^D'an/Da<pr>+an<det><def><sp>$ ^emgann/emgann<n><m><sg>$ ^ez/e<vpart><obj>/ael<n><m><pl>/mont<vblex><pri><p2><sg>/monet<vblex><pri><p2><sg>/e<pr>+da<det><pos><mf><sp>$ ^an/an<det><def><sp>/mont<vblex><pri><p1><sg>/monet<vblex><pri><p1><sg>$
Correct output:
^D'an/Da<pr><@ADVL→>+an<det><def><sp><@→N>$ to.the ^emgann/emgann<n><m><sg><@P←>$ battle ^ez/e<vpart><obj><@Pcle>$ PART ^an/mont<vblex><pri><p1><sg><@+FMAINV>$ I.go
- We want to refer to the <pr> sub-reading when mapping emgann as @P← (possibly also in disambiguation).
- We want to MAP an @ADVL→ tag on the <pr> sub-reading (also a @→N tag on the determiner). These sub-readings are split into two units by pretransfer.
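The two wishes above can be sketched in Python as follows: put a mapping tag on a chosen sub-reading, then split the sub-readings into separate units. This is a simplified illustration only. The helpers map_tag and split_into_units are hypothetical names, the index-based addressing of sub-readings is an assumption rather than existing CG syntax, and the splitting only imitates the basic idea of pretransfer (it ignores escapes and any special multiword handling, and the single space between units is an assumption).

```python
def split_subreadings(reading: str) -> list[str]:
    """Split one reading on '+' signs that occur outside <tag> brackets."""
    parts, current, in_tag = [], [], False
    for ch in reading:
        if ch == "<":
            in_tag = True
        elif ch == ">":
            in_tag = False
        if ch == "+" and not in_tag:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def map_tag(reading: str, index: int, tag: str) -> str:
    """Append <tag> to the sub-reading at `index` (0 is the first sub-reading)."""
    subs = split_subreadings(reading)
    subs[index] += f"<{tag}>"
    return "+".join(subs)


def split_into_units(reading: str) -> str:
    """Turn each sub-reading into its own ^...$ unit."""
    return " ".join(f"^{s}$" for s in split_subreadings(reading))


reading = "Da<pr>+an<det><def><sp>"
reading = map_tag(reading, 0, "@ADVL→")  # tag on the <pr> sub-reading
reading = map_tag(reading, 1, "@→N")     # tag on the determiner
print(reading)
# prints Da<pr><@ADVL→>+an<det><def><sp><@→N>
print(split_into_units(reading))
# prints ^Da<pr><@ADVL→>$ ^an<det><def><sp><@→N>$
```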
Some file
SECTION
SUBSTITUTE ("од") ("од:5") ("од") (-1 (adj));
^помладо/adj<pref><comp>+млад<adj><nt><sg><nom><ind>$ ^од/од<pr>$ ^30/30<num>$^./.<sent>$
MAP (@+FMAINV) TARGET VerbFin ;
^n'eus/ne<adv>+bezañ<vblex><pri><impers><sp>/ne<adv>+kaout<vblex><pri><p1><pl>$ ^kador/kador<n><f><sg>$ ^ebet/ebet<adv>$^./.<sent>$