User:Ggregori
About me
Name: Gabriel Gregori Manzano
Email/Google chat: Email me
IRC nick: ggregori
GSoC 2011
VM for the transfer module - Application
GitHub repository: [1]
TODO list
- Research and experiment with the topics mentioned by my mentor:
- implementation of UNIX wildcards.
- Finish generating the code for the macros tests.
- Define and create the expected output for some rules tests.
- Develop code generation for the rules tests.
- Generate an entire transfer rules file for some pair to test the compiler.
Weekly reports
Community Bonding Period
Week 1 - (25/04 - 01/05): This week has basically been dedicated to researching and reviewing some topics (some of them suggested by my mentor).
- I have been reviewing NLP and Python using the book 'Natural Language Processing with Python'.
- I have been looking for a way to represent morphological labels in UCS/UTF and my mentor suggested using negative numbers as in Apertium internals. Anyway, I can worry about this later.
- Using UTF with Python: 'codecs' and 'unicodedata' may be useful modules.
- Testing the '-b' option of 'lt-proc', whose output is going to be the input of my compiler.
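The idea of representing morphological labels as negative numbers can be illustrated with a minimal sketch. It assumes that tags look like `<n>` or `<pl>` and that ordinary characters keep their Unicode code points; the class name `TagAlphabet` and its methods are hypothetical and are not taken from the Apertium sources.

```python
class TagAlphabet:
    """Hypothetical helper: maps tags such as '<n>' to negative integers."""

    def __init__(self):
        self._code_of = {}  # tag -> negative code
        self._tag_of = []   # position (-code - 1) -> tag

    def encode_tag(self, tag):
        # Assign the next free negative number the first time a tag is seen.
        if tag not in self._code_of:
            self._tag_of.append(tag)
            self._code_of[tag] = -len(self._tag_of)
        return self._code_of[tag]

    def encode(self, text):
        """Turn 'gat<n><pl>' into a list of code points and tag codes."""
        out = []
        i = 0
        while i < len(text):
            if text[i] == "<":
                end = text.find(">", i)
                if end == -1:
                    raise ValueError("unterminated tag in %r" % text)
                out.append(self.encode_tag(text[i:end + 1]))
                i = end + 1
            else:
                out.append(ord(text[i]))
                i += 1
        return out

    def decode(self, codes):
        """Inverse of encode(): negative codes become tags again."""
        return "".join(
            self._tag_of[-c - 1] if c < 0 else chr(c) for c in codes
        )
```

Because valid code points are never negative, a tag code cannot collide with a character, which is what makes this encoding attractive.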
Week 2 - (02/05 - 08/05): This week I finished all the review/research needed, although I couldn't do everything I wanted because I had to travel.
- Finished the introductory book reviewing NLP and Python.
- Started designing and redefining the compiler's architecture, following last year's work, and selected and tested some modules. Some of the changes or improvements:
- Use of pipes/command-line arguments for the input of the compiler (like the rest of Apertium).
- Configurable logging module for info and debugging purposes (module: logging).
- Refactoring some methods in the expatparser class (e.g. extracting common code of the callback method).
- Creating some additional classes in order to add some flexibility (e.g. a parent parser class with the common code).
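A minimal sketch of how the command-line and logging points above could fit together, using the standard `argparse` and `logging` modules. The option names (`-i`, `-o`, `-d`) and the logger name `compiler` are assumptions made for illustration, not the options of the actual compiler.

```python
import argparse
import logging
import sys


def build_arg_parser():
    """Command-line interface; all option names here are hypothetical."""
    parser = argparse.ArgumentParser(description="Transfer rules compiler (sketch)")
    parser.add_argument("-i", "--input", default=None,
                        help="input file (default: standard input)")
    parser.add_argument("-o", "--output", default=None,
                        help="output file (default: standard output)")
    parser.add_argument("-d", "--debug", action="store_true",
                        help="show debugging messages")
    return parser


def configure_logging(debug):
    """Send log messages to stderr so they never mix with the real output."""
    logger = logging.getLogger("compiler")
    logger.setLevel(logging.DEBUG if debug else logging.INFO)
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
        logger.addHandler(handler)
    return logger


def main(argv=None):
    args = build_arg_parser().parse_args(argv)
    log = configure_logging(args.debug)
    source = open(args.input, encoding="utf-8") if args.input else sys.stdin
    sink = open(args.output, "w", encoding="utf-8") if args.output else sys.stdout
    try:
        data = source.read()
        log.debug("read %d characters", len(data))
        sink.write(data)  # placeholder for the real compilation step
    finally:
        if source is not sys.stdin:
            source.close()
        if sink is not sys.stdout:
            sink.close()
    return 0
```

With no file options, `main()` reads standard input and writes standard output, so the program can sit in a pipeline like the other Apertium tools.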
Week 3 - (09/05 - 15/05): This week I had to redo some work because of the switch to Python 3, so I didn't accomplish what I wanted. Anyway, there are two weeks of university classes remaining until I can focus exclusively on this project.
- Switched to Python 3, reasons:
- I hope to get better UTF-8 support among other things.
- Had to test whether the modules I use were fully available/compatible in Python 3.
- Had to read and research (again...) about str/bytes and std{in,out}.buffer and, in general, everything related to Unicode, UTF-8...
- Started implementing the very basics of the compiler’s architecture:
- Command-line arguments and help, input and output, logging...
- Another thing I realized this week is that a lot of last week's thinking about making a flexible prototype that is easy to modify in the future doesn’t really apply to Python. For example, my design involved creating interfaces/abstract classes in order to be able to swap components easily, but that isn’t needed in Python. In conclusion: duck typing, although I will need my design in the C++ version.
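The two points about str/bytes and duck typing can be shown together in a small sketch. The helper names `read_utf8` and `write_utf8` and the class `FixedSource` are hypothetical; the only assumption is that the stream hands over bytes, as `sys.stdin.buffer` does in Python 3.

```python
import io


def read_utf8(stream):
    """Read all bytes from `stream` and decode them as UTF-8.

    No interface is declared: any object with a read() method that
    returns bytes is accepted (duck typing).
    """
    return stream.read().decode("utf-8")


def write_utf8(stream, text):
    """Encode `text` as UTF-8 and write it to a binary stream."""
    stream.write(text.encode("utf-8"))


class FixedSource:
    """Unrelated class that works with read_utf8() only because it has read()."""

    def __init__(self, data):
        self._data = data

    def read(self):
        return self._data


# An in-memory binary stream and a custom object are both valid inputs.
from_memory = read_utf8(io.BytesIO("canción<n>".encode("utf-8")))
from_custom = read_utf8(FixedSource(b"ni\xc3\xb1o"))
```

In a real pipeline the same functions could be called with `sys.stdin.buffer` and `sys.stdout.buffer`, with no common base class required.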
Coding Period
Week 1 - (16/05 - 29/05): These last days have been impossible because of university work; just this week I had about 4 class projects and 2 exams... On Tuesday next week I will finish everything and will be able to focus completely on my project.
Week 2 - (30/05 - 05/06): Finally I can focus completely on my project, and this week I have made a lot of progress on the compiler:
- Finished the structure of the project; now I am ready to start generating code from the transfer rules.
- Created the GitHub repository where I will submit my work (link is at the top).
- Implemented all the handling of the sections: def-cats, def-attrs, def-vars, def-lists and def-macros.
- Created some test macros with the desired output in pseudo-assembly.
- Implemented the generation of code for the elements: <choose>, <when>, <test>, <not>, <equal>, <b>, <lit>.
- Improved some of the code: creating a SymbolTable, separating debugging output from the actual output, etc.
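To make the code generation step more concrete, here is a minimal sketch that walks a `<choose>` fragment with `xml.parsers.expat` and emits instructions for a stack machine. The instruction names (`push`, `cmp`, `not`, `jz`, `jmp`) and the label scheme are invented for illustration and are not the project's actual pseudo-assembly; the class `CodeGenerator` is likewise hypothetical.

```python
import xml.parsers.expat


class CodeGenerator:
    """Emits hypothetical pseudo-assembly while expat walks the XML."""

    def __init__(self):
        self.code = []
        self._next_id = 0
        self._open = []  # stack of (element name, label id) for choose/when

    def _new_id(self):
        self._next_id += 1
        return self._next_id

    def start(self, name, attrs):
        if name in ("choose", "when"):
            self._open.append((name, self._new_id()))
        elif name == "lit":
            self.code.append('push "%s"' % attrs.get("v", ""))

    def end(self, name):
        if name == "equal":
            self.code.append("cmp")
        elif name == "not":
            self.code.append("not")
        elif name == "test":
            # The innermost open element is the enclosing <when>:
            # skip its body when the condition is false.
            self.code.append("jz when_%d_end" % self._open[-1][1])
        elif name == "when":
            _, when_id = self._open.pop()
            # After a taken branch, jump past the rest of the <choose>.
            self.code.append("jmp choose_%d_end" % self._open[-1][1])
            self.code.append("when_%d_end:" % when_id)
        elif name == "choose":
            _, choose_id = self._open.pop()
            self.code.append("choose_%d_end:" % choose_id)


def compile_fragment(xml_text):
    """Parse an XML fragment and return the generated instruction list."""
    gen = CodeGenerator()
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = gen.start
    parser.EndElementHandler = gen.end
    parser.Parse(xml_text, True)
    return gen.code


fragment = ('<choose><when><test><not><equal>'
            '<lit v="a"/><lit v="b"/>'
            '</equal></not></test></when></choose>')
for line in compile_fragment(fragment):
    print(line)
```

Operands are emitted when an element opens and operators when it closes, so the output comes out in postfix order, which suits a stack-based VM.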
Week 3 - (06/06 - 12/06): This week I intend to finish generating the pseudo-assembly and to deliver the finished compiler.