

on github

Week Dates Goals Progress/Notes Evaluation
1 5/30 - 6/4 some data, find test corpus
2 6/5 - 6/11 script to bootstrap separable multiwords from dictionaries, set up testing framework, support/preparing data for English separable verbs
3 6/12 - 6/18 preparing data, prototype script set up, read specifications of Lttoolbox API
4 6/19 - 6/25

6/19: separate out the language-dependent functions in the C++ prototype, work on reordering module for Romance languages (Spanish, Portuguese)
6/20: work on reordering module for Germanic language (Swedish)
6/21: work on reordering module for Celtic language (Welsh)
6/22: work on reordering module for
6/23: work on reordering module for && prototype should be finished

6/19: Still trying to get the FST example to compile on my computer. Worked on the reordering module for English in C++.
6/20: Spent a lot of time trying to fix hashing errors in the C++ prototype, then gave up on C++. Switched everything to Python because I had been spending too much time working out the C++ API. Wrote documentation. Still trying to get UnitTest++ and the FST example to compile.
6/21: Separated the prototype (now scripted in Python) into organized files. The reordering modules for English and Spanish are under control (there is still some tedious work to do on them, which I was not able to get to today). I am feeling much more confident about getting the prototype working by the evaluation deadline this week. I had listed a bunch of languages in my original proposal, but I couldn't find data on separable verbs for any of them except German. I feel a little out of place fiddling with languages that I am not familiar with. However, I am not confident at all that any of this will even be of use (other than the basic idea) when we try to integrate it into Apertium, because most of my code will probably be replaced with existing Lttoolbox functions. :(
6/22: I think I accomplished very little today... Wrote a small handful of tests (Python's unittest is great). Slightly enhanced the module: put delimiters (commas, periods) in the right place when transferring to the output file; combined the multiword (e.g. 'take<>#out' rather than 'take<> out<>'). I'm really losing motivation to continue enhancing the module because I know none of the prototype script (whether it's in Python or C) will be of much actual use. I'm looking forward to next week, after the first deadline is over, when I will be able to spend more time understanding the Lttoolbox API. I found it frustrating on days when I had not written many useful lines of code by the end of the day, but I tried to keep in mind that a significant portion of any programming project is reading and modifying existing code.

First evaluation 6/26 - 6/30 testing framework set up + prototype system in Python
5 6/26 - 7/2
6 7/3 - 7/9
7 7/10 - 7/16
8 7/17 - 7/23
Second evaluation 7/24 - 7/28 XML representation, finite-state implementation
9 7/24 - 7/30 integration with Apertium: fit module between pre-transfer and lt-proc-b
10 7/31 - 8/6 support for individual language pairs
11 8/7 - 8/13 (cont. support for individual language pairs)
12 8/14 - 8/20 (cont. support for individual language pairs)
13 8/21 - 8/27 (cont. support for individual language pairs)
Final evaluation 8/29 - 9/5 finite-state implementation in C++ with lttoolbox