Difference between revisions of "User:Irene/workplan"
(4 intermediate revisions by the same user not shown) | |||
Line 24: | Line 24: | ||
6/19: Still trying to get FST example to compile on my computer. Worked on the reordering module for English in c++ <br /> |
||
6/20: Spent a lot of time trying to fix hashing errors with the c++ prototype, and then gave up with c++. Switched everything to python because I've been spending too much time working out c++ API. Wrote documentation. Still trying to get [https://github.com/unittest-cpp/unittest-cpp unittest++] and the fst example to compile. <br /> |
||
6/21: Separated the prototype (now scripted in python) into organized files. |
|||
6/21: Separated the prototype (now scripted in python) into organized files. The reordering modules for English and Spanish are under control (there is still some tedious work to do with them, which I was not able to get to today). I think I am feeling much more confident in being able to get the prototype working by the evaluation deadline this week. I had listed a bunch of languages in my original proposal, but I couldn't find data on separable verbs for any of them except German. I feel a little out-of-place fiddling with languages that I am not familiar with. However, I am not confident at all that any of this will even be of use (other than the basic idea) when we try to integrate it into Apertium, because most of my code will probably be replaced with existing Lttoolbox functions. |
|||
6/23: video meeting. |
6/23: video meeting. |
||
6/24: migrated to SourceForge. added testing set. |
|||
|| |
|| |
||
|- |
|- |
||
| || || |
| || || |
||
| |
|||
|- |
|- |
||
Line 35: | Line 37: | ||
|- |
|- |
||
| 5 || 6/26 - 7/2 || finite-state acceptor for 'take out' |
| 5 || 6/26 - 7/2 || hard-coded finite-state acceptor for 'take out' |
||
|| |
|| |
||
6/26: trying to make the program backtrack when it gets to <ANY_CHAR> or <ANY_TAG> |
6/26: trying to make the program backtrack when it gets to <ANY_CHAR> or <ANY_TAG> <br /> |
||
6/27: debugging & supporting any number of tags <br /> |
|||
6/28: successfully reads and prints ^take<vblex><pres><tag1><tag2><tag3><tag4>$ ^the<det><tag1><tag2><tag3><tag4>$ ^thing<n><sg><tag><Tag>$ ^out<adv>$ <br /> |
|||
6/29: working on being selective about what middle words are accepted <br /> |
|||
7/1: python prototype for acceptor is pretty much working, just needs to be able to read from corpuses that don't put every sentence on a new line, and to assign numbers to states in a more elegant fashion. <br /> |
|||
|- |
|- |
||
| 6 || 7/3 - 7/9 || finite-state transducer for 'take out' |
| 6 || 7/3 - 7/9 || hard-coded finite-state transducer for 'take out' |
||
|| |
|||
7/3: tried to convert the python script to c++ code. trying to use lttoolbox's FST class. <br /> |
|||
7/4: still trying to convert to c++ and use lttoolbox <br /> |
|||
7/5: |
|||
7/6: |
|||
7/7: |
|||
7/8: |
|||
7/9: |
|||
|- |
|- |
||
| 7 || 7/10 - 7/16 || |
| 7 || 7/10 - 7/16 || xml format, working compiler and processor |
||
|- |
|- |
||
| 8 || 7/17 - 7/23 || |
| 8 || 7/17 - 7/23 || improving dictionary, compiler, and processor |
||
|- |
|- |
||
!'''Second evaluation''' !! 7/24 - 7/28 !! |
!'''Second evaluation''' !! 7/24 - 7/28 !! finite-state implementation |
||
|- |
|- |
||
Line 55: | Line 69: | ||
|- |
|- |
||
| 10 || 7/31 - 8/6 || |
| 10 || 7/31 - 8/6 || superblanks, integration, fstp |
||
|| |
|||
7/31: worked on superblanks, used fstp object <br /> |
|||
8/1: improvements <br /> |
|||
8/2: insert superblanks between the # in e.g. 'take# out' and between words in counterexamples by amending 'in' and 'out' strings, fixed error where ^ was not printing at the end of reordering a sep. multiword, updated dictionary to improve success rate <br /> |
|||
|- |
|- |
||
| 11 || 8/7 - 8/13 || (cont. support for individual language pairs) |
| 11 || 8/7 - 8/13 || (cont. support for individual language pairs) |
||
|| |
|||
testing and refining for beta testing languages: kaz/kir, deu, eng, fao-nor |
|||
|- |
|- |
||
| 12 || 8/14 - 8/20 || (cont. support for individual language pairs) |
| 12 || 8/14 - 8/20 || (cont. support for individual language pairs) |
||
|| |
|||
support for +thing, lsx-comp appends <j/> to the end of every entry, before <e/>, causes issues with paradigm, remove feature. <br/> |
|||
|- |
|- |
||
8/17: |
|||
| 13 || 8/21 - 8/27 || (cont. support for individual language pairs) |
| 13 || 8/21 - 8/27 || (cont. support for individual language pairs) |
||
|- |
|- |
Latest revision as of 18:45, 17 August 2017
Workplan[edit]
8/17:
Week | Dates | Goals | Progress/Notes | Evaluation |
---|---|---|---|---|
1 | 5/30 - 6/4 | some data, find test corpus | ||
2 | 6/5 - 6/11 | script to bootstrap separable multiwords from dictionaries, set up testing framework, support/preparing data for English separable verbs | ||
3 | 6/12 - 6/18 | preparing data, prototype script set up, read specifications of Lttoolbox API | ||
4 | 6/19 - 6/25 |
6/19: separate out the language-dependent functions in the c++ prototype, work on reordering module for Romance languages (Spanish, Portuguese) |
6/19: Still trying to get FST example to compile on my computer. Worked on the reordering module for English in c++ |
|
First evaluation | 6/26 - 6/30 | testing framework set up + prototype system in Python | ||
5 | 6/26 - 7/2 | hard-coded finite-state acceptor for 'take out' |
6/26: trying to make the program backtrack when it gets to <ANY_CHAR> or <ANY_TAG> | |
6 | 7/3 - 7/9 | hard-coded finite-state transducer for 'take out' |
7/3: tried to convert the python script to c++ code. trying to use lttoolbox's FST class. | |
7 | 7/10 - 7/16 | xml format, working compiler and processor | ||
8 | 7/17 - 7/23 | improving dictionary, compiler, and processor | ||
Second evaluation | 7/24 - 7/28 | finite-state implementation | ||
9 | 7/24 - 7/30 | integration with Apertium: fit module between pre-transfer and lt-proc-b | ||
10 | 7/31 - 8/6 | superblanks, integration, fstp |
7/31: worked on superblanks, used fstp object | |
11 | 8/7 - 8/13 | (cont. support for individual language pairs) |
testing and refining for beta testing languages: kaz/kir, deu, eng, fao-nor | |
12 | 8/14 - 8/20 | (cont. support for individual language pairs) |
support for +thing, lsx-comp appends <j/> to the end of every entry, before <e/>, causes issues with paradigm, remove feature. | |
13 | 8/21 - 8/27 | (cont. support for individual language pairs) | ||
Final evaluation | 8/29 - 9/5 | finite-state implementation in C++ with lttoolbox |
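
For illustration only, the hard-coded acceptor for 'take out' from week 5 could be sketched in Python roughly as follows. This is not the code from the project: the function names `parse_units` and `accepts_take_out`, the `max_middle` limit, and the choice to check only the first tag of each word are all assumptions made for this sketch. Accepting any number of trailing tags stands in for the <ANY_TAG> behaviour mentioned on 6/26 and 6/27.

```python
import re

# One lexical unit in the stream format used above: ^lemma<tag1><tag2>...$
LU_PATTERN = re.compile(r"\^([^<$^]+)((?:<[^<>]+>)*)\$")
TAG_PATTERN = re.compile(r"<([^<>]+)>")


def parse_units(line):
    """Return a list of (lemma, [tags]) pairs found in one line of text."""
    return [
        (lemma, TAG_PATTERN.findall(tags))
        for lemma, tags in LU_PATTERN.findall(line)
    ]


def accepts_take_out(units, max_middle=3):
    """Hypothetical acceptor: take<vblex>..., a short gap, then out<adv>.

    Only the first tag of 'take' and 'out' is checked, so any number of
    further tags is accepted. max_middle is an assumed limit on how many
    words may stand between the verb and the particle.
    """
    inside_gap = False  # True once a 'take' has been seen
    middle = 0          # words seen since that 'take'
    for lemma, tags in units:
        if lemma == "take" and tags[:1] == ["vblex"]:
            inside_gap, middle = True, 0
            continue
        if inside_gap:
            if lemma == "out" and tags[:1] == ["adv"]:
                return True
            middle += 1
            if middle > max_middle:
                inside_gap = False  # gap too long: drop this 'take'
    return False


if __name__ == "__main__":
    line = (
        "^take<vblex><pres><tag1><tag2><tag3><tag4>$ "
        "^the<det><tag1><tag2><tag3><tag4>$ "
        "^thing<n><sg><tag><Tag>$ ^out<adv>$"
    )
    print(accepts_take_out(parse_units(line)))
```

The example line is the one from the 6/28 note. Being selective about which middle words are accepted (6/29) would replace the simple `max_middle` counter with a check on the tags of each word in the gap.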