Talk:Subreadings in Constraint Grammar
More discussion
<TinoDidriksen> +parts are hidden, currently...
<TinoDidriksen> Or rather, stuff before the + is hidden.
<|krvoje|> o_0
<TinoDidriksen> Also, the + is not part of the tag.
<TinoDidriksen> +htjeeti becomes a baseform "htjeeti"
<spectie> our thing is a horrible hack :((((
<TinoDidriksen> + is a horrible hack...
<spectie> it's great for apertium, but just doesn't fit into CG yet
<spectie> what cg-proc should do
<spectie> is treat the parts separated by '+' as separate cohorts
<TinoDidriksen> That can relatively easily be done, but sure about that? http://wiki.apertium.org/wiki/Subreadings_in_Constraint_Grammar did not mention that option.
- You can't treat clitics as separate cohorts, how would that even work? Say you have ^foo bar/foo<tags>+bar<tags>/foobar<tags>$, should it be treated as "foo<tags>/foobar<tags> followed by a single reading bar<tags>", or "foo<tags> followed by a bar<tags>/foobar<tags>"? That sounds like a type of complexity we don't want. unhammer 07:53, 12 October 2011 (UTC)
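The mismatch described above can be made concrete with a small sketch. It assumes a simplified view of the stream format (no escaped characters, `/` only between readings, `+` only between subreadings); `parse_cohort` is a hypothetical helper for illustration, not part of cg-proc or Apertium.

```python
def parse_cohort(cohort):
    """Split a cohort like ^surface/reading1/reading2$ into its parts.

    Simplifying assumption: no escaped '/', '+', '^' or '$' in the input.
    """
    body = cohort.strip()
    if not (body.startswith("^") and body.endswith("$")):
        raise ValueError("expected a cohort delimited by ^ and $")
    surface, *readings = body[1:-1].split("/")
    # Each reading becomes a list of '+'-separated subreadings.
    return surface, [reading.split("+") for reading in readings]


surface, readings = parse_cohort("^foo bar/foo<tags>+bar<tags>/foobar<tags>$")
part_counts = [len(parts) for parts in readings]

# The first reading has two parts and the second has one, so there is no
# single way to cut this cohort into two cohorts that keeps both readings.
print(surface, readings, part_counts)
```

Because the readings of one cohort can disagree on how many `+` parts they have, a split into separate cohorts would have to decide where the one-part reading goes.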
<spectie> the problem is that sometimes we don't want that :))
<spectie> but i think that's a problem with apertium
<TinoDidriksen> Ah
<spectie> we should distinguish + and #
<spectie> + should be for separate cohorts
<spectie> # for separate parts of the same cohort
<spectie> or something
<spectie> e.g. compound words get #
<spectie> and attached clitics get +
<|krvoje|> bbiab, lunch
<spectie> ok
<TinoDidriksen> So, + should be regarded as a soft cohort split thingy...
<spectie> yes
<spectie> but we should also check with unhammer
I really think it makes sense to be able to distinguish these two things, we can't use '+' for both:
- Attached clitics / joined words (each word needs to be referred to separately)
- Compound words (we're only interested in the head)
I suggest coming up with some other thing for compounds (perhaps ~, although I think we've discussed this before?)
- Francis Tyers 13:52, 11 July 2011 (UTC)
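As a rough illustration of the two-separator idea, here is a minimal sketch. Everything in it is an assumption made for the example: the helper name `split_reading`, the choice of `~` as the compound separator, the sample tags, and the guess that the head of a compound is its last part. None of this reflects how cg-proc actually behaves.

```python
def split_reading(reading, clitic_sep="+", compound_sep="~"):
    """Split a reading into units that would be addressed separately.

    Each unit is a list of compound parts. Both separators are
    parameters because the discussion has not settled on symbols.
    """
    return [unit.split(compound_sep) for unit in reading.split(clitic_sep)]


# Hypothetical reading: a two-part compound with one attached clitic.
units = split_reading("a<n>~b<n>+c<prn>")

# Assumption for illustration only: the head is the last compound part.
heads = [unit[-1] for unit in units]
print(units, heads)
```

With this shape, rules could refer to each clitic unit separately while only looking at one part of a compound.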