Difference between revisions of "User:N0nick/GSoC Journal"
==Bonding Period week 1: 4/25-5/1==
* Got the development environment ready. apertium, lttoolbox and other tools and tests all working properly.
* Filled the [[Maltese_and_Hebrew/Pending_tests|Pending Tests]] page with some translations (based on the ones in [[Maltese_and_English/Pending_tests|the mt-en page]]).
* Started working on a script to generate a Maltese monodix from external sources. Nothing to show yet.
** Some Maltese newspapers (suggested by spectie): [http://www.l-orizzont.com/] [http://www.it-torca.com/] [http://www.kullhadd.com/] [http://www.il-gensillum.com/]
* Notified 2 TAU professors (both specializing in CL) about my project, both agreed to offer help if necessary.
* Wrote to a contact related to the [http://staff.um.edu.mt/mros1/maltilex/ MaltiLex project], looking for better contact (perhaps through my university's faculty).
==Bonding Period week 2: 5/2 - 5/8==
* Picked up and started reading the [http://books.google.com/books?id=iCdjAAAAMAAJ Teach Yourself Maltese] grammar book.
* Wrote the framework for a script that generates fullform Maltese verb lists. [https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-mt-he/dev/mt_verbs]
** We worked on splitting verbs into categories and optional subclasses, writing rules (based on stem affixes, roots and vowels) in a Python script for each class.
** Found out how Wiktionary stores conjugation data for the verbs it has, which is very useful for creating new rule groups. [http://en.wiktionary.org/wiki/Category:Maltese_conjugation-table_templates] Finished converting these tables this week, except for [http://en.wiktionary.org/wiki/Special:WhatLinksHere/Template:mt-conj/sem/I/eaq-iaq] and [http://en.wiktionary.org/wiki/Special:WhatLinksHere/Template:mt-conj/sem/I/oo-oo] (which are identical to strong.py apart from a transformation in the imperfect forms).
* [[User:Francis Tyers|spectie]] contacted Prof. Adam Ussishkin, who provided us with Maltese verb lists that we need to look over. [https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-mt-he/dev/verb_lists]
* Contacted [http://johnjcamilleri.com/ John J. Camilleri] regarding his [http://www.grammaticalframework.org/doc/gfss/status-john.pdf Maltese morphology slideshow], asking for data on verb conjugation.
* Will add the verbs added so far to the Hebrew dix and bidix files.
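The per-class rules mentioned above (root consonants plus stem vowels and affixes) can be sketched roughly as follows. This is only an illustration for the perfect tense of one strong verb, kiteb; the function name, its parameters and the tag labels are hypothetical and are not taken from the mt_verbs script.

```python
# Hypothetical sketch of a rule for one verb class; not the project's code.
# Tag labels such as "p1.sg" are made up for this example.


def strong_perfect(root, v1, v2, short_vowel):
    """Build perfect-tense full forms for a strong triliteral verb.

    root        -- the three root consonants, e.g. ("k", "t", "b")
    v1, v2      -- stem vowels of the 3rd person masculine singular form
    short_vowel -- vowel of the stem used before consonant-initial suffixes
    """
    c1, c2, c3 = root
    full = c1 + v1 + c2 + v2 + c3        # kiteb
    reduced = c1 + v1 + c2 + c3          # kitb-, used before vowel-initial suffixes
    short = c1 + c2 + short_vowel + c3   # ktib-, used before consonant-initial suffixes
    return {
        "p1.sg": short + "t",
        "p2.sg": short + "t",
        "p3.m.sg": full,
        "p3.f.sg": reduced + "et",
        "p1.pl": short + "na",
        "p2.pl": short + "tu",
        "p3.pl": reduced + "u",
    }


if __name__ == "__main__":
    for tag, form in strong_perfect(("k", "t", "b"), "i", "e", "i").items():
        print(tag, form)
```

Other classes (weak, hollow, doubled and so on) would need their own rule functions, and the vowels passed in differ per verb, so a real stems file has to record them for each entry.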
==Bonding Period week 3: 5/9 - 5/15==
* Continued studying Maltese from the grammar book.
* Added verbs from John Camilleri's slideshow to the Maltese analyser program.
==Bonding Period week 4: 5/16 - 5/22==
* Added closed categories to the Maltese analyser (pronouns, prepositions, conjunctions, determiners, numerals)
* Added closed categories to the Hebrew dictionary
* Added closed categories to the bidix
==Week 1: 5/23 - 5/29==
* Fixed bugs in some of the environment tools I was using
* Added missing Hebrew determiners and pronouns
* Added missing closed categories to the bidix
==Week 2: 5/30 - 6/5==
* Generated a Hebrew verb speling file from hspell output
* Formatted the Hebrew verb speling file as an Apertium dix file
* Researched handling of attached/clitic pronouns in both Maltese & Hebrew
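The speling-to-dix step above can be pictured with a small sketch. It assumes each speling line has four semicolon-separated fields (lemma; surface form; dot-separated tags; part of speech) and emits one full-form dix entry per line. The function names and the sample data in the comments are hypothetical, and the actual conversion presumably groups forms into paradigms rather than listing every form.

```python
from collections import defaultdict
from xml.sax.saxutils import escape


def parse_speling(lines):
    """Group 'lemma; surface; tags; pos' lines by (lemma, pos).

    Assumption: tags are dot-separated, e.g. 'past.p1.sg'.
    """
    entries = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        lemma, surface, tags, pos = [field.strip() for field in line.split(";")]
        entries[(lemma, pos)].append((surface, [t for t in tags.split(".") if t]))
    return entries


def to_dix_entries(entries):
    """Return one full-form <e> element (as a string) per surface form."""
    out = []
    for (lemma, pos), forms in entries.items():
        for surface, tags in forms:
            symbols = "".join(
                '<s n="%s"/>' % escape(tag, {'"': "&quot;"}) for tag in [pos] + tags
            )
            out.append(
                '<e lm="%s"><p><l>%s</l><r>%s%s</r></p></e>'
                % (escape(lemma, {'"': "&quot;"}), escape(surface), escape(lemma), symbols)
            )
    return out


if __name__ == "__main__":
    # Placeholder data, not real hspell output.
    sample = ["sing; sang; past; vblex", "sing; sings; pres.p3.sg; vblex"]
    print("\n".join(to_dix_entries(parse_speling(sample))))
```

Listing full forms keeps the sketch short; a dictionary built this way grows quickly, which is one reason to factor shared endings into paradigms.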
==Week 3: 6/6 - 6/12==
* Fixed Hebrew noun paradigms (automatically generated from hspell)
* Added existing verbs to bidix
* Added missing determiners to the Maltese file (as per Fran's email)
==Week 4: 6/13 - 6/19==
* Studied for an exam; did not achieve much.
==Week 5: 6/20 - 6/26==
* Fully analysed all words in [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-mt-he/dev/paragraph1.txt sample paragraph 1]
* Added all paragraph words to bidix, then updated the Hebrew dix accordingly
* Added a list of Maltese proper nouns and adverbs, and updated dix and bidix
* Fixed bugs in hspell output for plural nouns and verbs
==Week 6: 6/27 - 7/3==
* Worked on mt.dix to achieve better coverage of the Maltese corpus
* Added lists of adjectives generated from 'suspected' lists (according to suffixes, etc.)
* Used a Wiktionary extraction script to load nouns and adjectives from Maltese items in English Wiktionary
* Used a wider Maltese corpus received from Kevin Scannell
* Added more Maltese nouns, adjectives and verbs from the top of the frequency list
* Contacted Michael Spagnol re [http://www.um.edu.mt/__data/assets/pdf_file/0006/123990/MayerEtAl-Broken_Plural.pdf Maltese Broken Plurals] to receive a list of nouns with broken plural forms.
* Fixed several noun paradigms
* Searched for documentation on the kien/ikun verb, its differences and forms. Contacted Adam Ussishkin & John Camilleri for help with this.
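Working from the top of a frequency list to improve coverage, as described above, can be sketched like this. It is a naive approximation: real coverage is measured by running the corpus through the analyser, whereas here a word counts as covered if it appears in a plain set of known surface forms. The tokenising regex and all names are assumptions made for the example.

```python
import re
from collections import Counter

# Letters, optionally joined by hyphens or apostrophes (an assumption about
# what counts as one Maltese token; the analyser may split these differently).
TOKEN_RE = re.compile(r"[^\W\d_]+(?:[-'][^\W\d_]+)*")


def frequency_list(text):
    """Count lower-cased tokens in a corpus string."""
    return Counter(token.lower() for token in TOKEN_RE.findall(text))


def coverage(freqs, known):
    """Fraction of running tokens whose form is in the known set."""
    total = sum(freqs.values())
    if total == 0:
        return 0.0
    covered = sum(count for word, count in freqs.items() if word in known)
    return covered / total


def top_unknown(freqs, known, limit=10):
    """Most frequent tokens that are not yet known, best candidates to add next."""
    return [(w, n) for w, n in freqs.most_common() if w not in known][:limit]


if __name__ == "__main__":
    freqs = frequency_list("kelb u qattus u kelb")
    known = {"kelb", "u"}
    print(coverage(freqs, known), top_unknown(freqs, known))
```

Adding the highest-ranked unknown words first gives the largest coverage gain per dictionary entry, which is why the frequency list drives the order of work.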
==Week 7: 7/4 - 7/10==
* Added more nouns, adjectives and adverbs from the frequency list
* Updated the verbs.py interface, added an option to set a dictionary restriction
* Added the -x negative suffix for all verbs (in verbs.py)
* Fixed all forms of kien/ikun (with help from Kevin & John)
* Added verbs and verb classes from the frequency list + corpus
* Added all 630 verified broken plural nouns & adjectives from Tamra Schembri's thesis with sg, pl and gender=GD
* Fixed the gender of existing words that we have in the mt-en dictionary
* Categorized the top 1900 words from the hitparade frequency list
* Finally acquired the Maltese Descriptive Grammar book!
==Week 8: 7/11 - 7/17==
* Added more verb paradigms and stems from grammar book
* Wrote gen_stems.py for updating the stems file with ones handled by the new verbs script (temporary solution)
* Added many Maltese nouns, adjectives from mt-en dictionary
* Added '@' terms to bidix by frequency: closed-cats, nouns, adjectives, toponyms and some verbs
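In Apertium output, a lemma that was analysed but has no bidix entry comes out prefixed with '@'. Ranking those lemmas by frequency, as in the last item above, could be done along these lines. This is a minimal sketch; the function name and the regex are assumptions, and it expects plain translated text with one sentence or line per list item.

```python
import re
from collections import Counter

# '@' at the start of a token, followed by the lemma; stops at whitespace,
# at a tag bracket, or at another marker character.
MISSING_RE = re.compile(r"(?<!\S)@([^\s@#*<]+)")


def missing_bidix_lemmas(translated_lines):
    """Return (lemma, count) pairs for '@'-marked lemmas, most frequent first."""
    counts = Counter()
    for line in translated_lines:
        counts.update(MISSING_RE.findall(line))
    return counts.most_common()


if __name__ == "__main__":
    output = ["@kelb kien @qattus", "*xyz @kelb"]
    for lemma, count in missing_bidix_lemmas(output):
        print(count, lemma)
```

Words marked with '*' are unknown to the analyser itself and are ignored here, since they need a monodix entry before a bidix entry is useful.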
==Week 9: 7/18 - 7/24==
* Added most determiners to bidix
* Added all nouns that have only a masc. form to bidix
* Fixed gender transfer for verbs (copied to pronouns)
* Added most (~550) adjectives to bidix
* Fixed some bad / wrong entries in mt.dix
==Week 10: 7/25 - 7/31==
* Fixed bugs in our modification to hspell that outputs the Hebrew verb dix
* Added most (~150) adverbs to bidix
* Added most (~480) proper nouns to bidix
* Added some determiners
* Fixed some bad entries in the bidix
[[Category:Maltese and Hebrew]]
Latest revision as of 09:06, 5 August 2011