Talk:Apertium and Constraint Grammar
Latest revision as of 11:00, 18 September 2014
- Window = the whole of what we're looking at: several sentences at the same time.
- SingleWindow = one sentence (for want of a better term). There are usually 3 SingleWindows in a Window, but that is defined at runtime: it can be anywhere from 1 to hundreds, set with --num-windows.
- Cohort = one
Testing
Regression test status as of 22:07, 17 April 2008 (BST)
Running tests... T_AnyMinusSome: Fail. T_Barrier: Success. T_BasicAppend: Success. T_BasicContextTest: Success. T_BasicDelimit: Success. T_BasicIff: Success. T_BasicRemove: Success. T_BasicSelect: Success. T_BasicSubstitute: Success. T_CarefulBarrier: Fail. T_CompositeSelect: Success. T_DontMatchEmptySet: Fail. T_EndlessSelect: Fail. T_MapAdd_Different: Fail. T_MatchBaseform: Success. T_MatchWordform: Success. T_MultipleSections: Success. T_NegatedContextTest: Success. T_RegExp_Map: Fail. T_RegExp_Select: Fail. T_RemoveSingleTag: Fail. T_ScanningTests: Success. T_Sections: Fail. T_SetOp_FailFast: Success. T_SetOp_OR: Success. T_SpaceInWord: Success. T_SuperBlanks: Success. T_Unification: Fail. T_UnknownWord: Success.
Regression test status as of 10:46, 3 July 2008 (UTC)
T_AnyMinusSome: Fail. T_Barrier: Success. T_BasicAppend: Fail. T_BasicContextTest: Success. T_BasicDelimit: Success. T_BasicIff: Success. T_BasicRemove: Success. T_BasicSelect: Success. T_BasicSubstitute: Success. T_CarefulBarrier: Fail. T_CompositeSelect: Success. T_DontMatchEmptySet: Fail. T_EndlessSelect: Fail. T_MapAdd_Different: Fail. T_MatchBaseform: Success. T_MatchWordform: Success. T_MultipleSections: Success. T_MultiWords: Success. T_NegatedContextTest: Success. T_RegExp_Map: Fail. T_RegExp_Select: Fail. T_RemoveSingleTag: Fail. T_ScanningTests: Fail. T_Sections: Fail. T_SetOp_FailFast: Success. T_SetOp_OR: Success. T_SpaceInWord: Success. T_SuperBlanks: Success. T_SuperBlanksNewline: Success. T_Unification: Fail. T_UnknownWord: Success.
Regression test status as of 07:30, 17 July 2008 (UTC)
T_AnyMinusSome: Success. T_Barrier: Success. T_BasicAppend: Success. T_BasicContextTest: Success. T_BasicDelimit: Success. T_BasicIff: Success. T_BasicRemove: Success. T_BasicSelect: Success. T_BasicSubstitute: Success. T_CarefulBarrier: Success. T_CompositeSelect: Success. T_DontMatchEmptySet: Success. T_EndlessSelect: Fail. T_Joiner: Success. T_MapAdd_Different: Success. T_MatchBaseform: Success. T_MatchWordform: Success. T_MultipleSections: Success. T_MultiWords: Success. T_NegatedContextTest: Success. T_RegExp_Map: Success. T_RegExp_Select: Success. T_RegExp_Substitute: Success. T_RemoveSingleTag: Fail. T_ScanningTests: Success. T_Sections: Fail. T_SetOp_FailFast: Success. T_SetOp_OR: Success. T_SpaceInWord: Success. T_SuperBlanks: Success. T_SuperBlanksNewline: Success. T_Unification: Fail. T_UnknownWord: Success.
Current bugs
Wishlist
Ability to specify where a MAPPING tag should be added in the tag_list
Tags in vislcg3 are "unordered", but the input order is preserved, and MAPPING tags are added to the end. However, since Apertium matches the longest left-to-right strings, we may have to disambiguate between ganga# i<vblex> and ganga<vblex>+i<pr>. The first one is easy: "ganga# i" is seen as the baseform and there is just one tag, vblex, so we might get something like ganga# i<vblex><@FVMAIN>. The second one is worse. The + means that the multiword should be split into two before transfer, ^ganga<vblex>$ ^i<pr>$; but if the mapping tags go to the end, or even after the first word, we'll get ^ganga<vblex><@FVMAIN><@PART>+i<pr>$ or ^ganga<vblex>+i<pr><@FVMAIN><@PART>$, whereas we want ^ganga<vblex><@FVMAIN>+i<pr><@PART>$.
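To make the wanted placement concrete, here is a rough Python sketch of the post-processing step described above. It is not part of vislcg3 or Apertium: the function name attach_mapping_tags and its arguments are hypothetical, and it assumes that the sub-parts of a multiword are joined by a literal + that never occurs inside a lemma.

```python
def attach_mapping_tags(lexical_unit, tags_per_part):
    """Append each list of mapping tags to the matching '+'-joined sub-part.

    Hypothetical helper for illustration only; assumes '+' never
    appears inside a lemma or tag.
    """
    body = lexical_unit.strip()
    # Drop the ^...$ delimiters of the Apertium stream format, if present.
    if body.startswith("^") and body.endswith("$"):
        body = body[1:-1]

    parts = body.split("+")
    if len(parts) != len(tags_per_part):
        raise ValueError("need exactly one tag list per sub-part")

    # Put every part's mapping tags directly after that part's own tags.
    tagged = [
        part + "".join(f"<{tag}>" for tag in tags)
        for part, tags in zip(parts, tags_per_part)
    ]
    return "^" + "+".join(tagged) + "$"


if __name__ == "__main__":
    print(attach_mapping_tags("^ganga<vblex>+i<pr>$", [["@FVMAIN"], ["@PART"]]))
    print(attach_mapping_tags("ganga# i<vblex>", [["@FVMAIN"]]))
```

With the two-part reading, the first call gives each part its own mapping tag instead of piling both at the end; with the single baseform "ganga# i" there is only one part, so the tag simply goes last.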
CG Syntax change: We could say something like
 MAP (@FVMAIN) TARGET VPart:0 (1* FOO);
 MAP (@PART) TARGET VPart:1 (-1* BAR);
- This is better done using Subreadings.