From Apertium
Jump to navigation Jump to search


Week Dates Goals Progress/Notes Evaluation
1 5/30 - 6/4 some data, find test corpus
2 6/5 - 6/11 script to bootstrap separable multiwords from dictionaries, set up testing framework, support/preparing data for English separable verbs
3 6/12 - 6/18 preparing data, prototype script set up, read specifications of Lttoolbox API
4 6/19 - 6/25

6/19: separate out the language-dependent functions in the c++ prototype, work on reordering module for Romance languages (Spanish, Portuguese)
6/20: work on reordering module for Germanic language (Swedish)
6/21: work on reordering module for Celtic language (welsh)
6/22: work on reordering module for
6/23: work on reordering module for & prototype should be finished

6/19: Still trying to get FST example to compile on my computer. Worked on the reordering module for English in c++
6/20: Spent a lot of time trying to fix hashing errors with the c++ prototype, and then gave up with c++. Switched everything to python because I've been spending too much time working out c++ API. Wrote documentation. Still trying to get unittest++ and the fst example to compile.
6/21: Separated the prototype (now scripted in python) into organized files. 6/23: video meeting. 6/24: migrated to source forge. added testing set.

First evaluation 6/26 - 6/30 testing framework set up + prototype system in Python
5 6/26 - 7/2 hard-coded finite-state acceptor for 'take out'

6/26: trying to make the program backtrack when it gets to <ANY_CHAR> or <ANY_TAG>
6/27: debugging & supporting any number of tags
6/28: successfully reads and prints ^take<vblex><pres><tag1><tag2><tag3><tag4>$ ^the<det><tag1><tag2><tag3><tag4>$ ^thing<n><sg><tag><Tag>$ ^out<adv>$
6/29: working on being selective about what middle words are accepted
7/1: python prototype for acceptor is pretty much working, just needs to be able to read from corpuses that don't put every sentence on a new line, and to assign numbers to states in a more elegant fashion.

6 7/3 - 7/9 hard-coded finite-state transducer for 'take out'

7/3: tried to convert the python script to c++ code. trying to use lttoolbox's FST class.
7/4: still trying to convert to c++ and use lttoolbox
7/5: 7/6: 7/7: 7/8: 7/9:

7 7/10 - 7/16 xml format, working compiler and processor
8 7/17 - 7/23 improving dictionary, compiler, and processor
Second evaluation 7/24 - 7/28 finite-state implementation
9 7/24 - 7/30 integration with Apertium: fit module between pre-transfer and lt-proc-b
10 7/31 - 8/6 superblanks, integration, fstp

7/31: worked on superblanks, used fstp object
8/1: improvements
8/2: insert superblanks between the # in e.g. 'take# out' and between words in counterexamples by amending 'in' and 'out' strings, fixed error where ^ was not printing at the end of reordering a sep. multiword, updated dictionary to improve success rate

11 8/7 - 8/13 (cont. support for individual language pairs)
12 8/14 - 8/20 (cont. support for individual language pairs)
13 8/21 - 8/27 (cont. support for individual language pairs)
Final evaluation 8/29 - 9/5 finite-state implementation in C++ with lttoolbox