Difference between revisions of "User:Popcorndude/Recursive Transfer/Progress"
Popcorndude (talk | contribs)
== Work Plan (from [[User:Popcorndude/Recursive_Transfer | proposal]]) ==
{| class="wikitable" border="1"
|-
! Time Period
! Goal
! Details
! Deliverable
|-
| Community Bonding Period
May 6-26
| Finalize formalism
|
* Read up on GLR parsers
* Decide variable semantics and syntax
* See if there's a good way to handle interpolation (e.g. inserting clitics after first word of phrase)
| Full description of planned formalism
|-
| Week 1
May 27-June 2
| Begin parser
|
* Get input
* Match and build trees based on literal tags and attribute categories
| Minimal parser
|-
| Week 2
June 3-9
| Add variables
|
* Agreement
* Passing variables up the tree
* Setting variables for child nodes
| Minimal parser with agreement
|-
| Week 3
June 10-16
| Test with eng->spa
|
* Noun phrases (this was started in the coding challenge)
* Basic verb phrases (some agreement, if time)
| Simple eng->spa parser
|-
| Week 4
June 17-23
| Continue parser
|
* Weights
* Conditionals
* Multiple output nodes
* Anything else deemed necessary during Community Bonding or testing
| Majority of initial specifications implemented
|-
| '''evaluation 1'''
| Basic parser done
|
| Parser-generator compliant with majority of initial specifications and rudimentary eng->spa instantiation
|-
| Week 5
June 24-30
| Finish parser and continue eng->spa
|
* Finish anything left over from week 4
* Finish verb phrases
| Fully implemented parser and working eng->spa for simple sentences
|-
| Week 6
July 1-7
| Finish eng->spa and write reverser
|
* Convert any remaining eng->spa rules
* Evaluate parser against chunking system
** Metrics: accuracy, speed of parser, compilation speed
* Write script to automatically reverse a ruleset
** All features currently described are at least in principle reversible
| System comparison and rule-reverser
|-
| Week 7
July 8-14
| Evaluation and testing
|
* Evaluate the output of the reverser against current spa->eng system
* Write tests for all features
* Begin adding error messages
| Test suite and report on the general effectiveness of direct rule-reversal
|-
| Week 8
July 15-21
| Optimization and interface
|
* Speed up the parser and compiler where possible
* Build interfaces for compiler, parser, and reverser
* Clean up code
* Re-evaluate speed
| Command-line interfaces and updated system comparison
|-
| '''evaluation 2'''
| Complete program
|
| Optimized and polished parser-generator compliant with initial specifications, and complete eng->spa transfer rules
|-
| Week 9
July 22-28
| Do spa->eng
|
* Identify differences between generated spa->eng and chunking spa->eng
* Fix generated spa->eng rules
* Report on effort required to correct reverser
| Working spa->eng rules and report on the usefulness of rule-reverser
|-
| Week 10
July 29-August 4
| Documentation
|
* Convert initial specifications to full documentation
* Write tutorial
* Write recipe book containing at least minimal examples of everything listed at [[User_talk:Popcorndude/Recursive_Transfer#Linguistic.2Ftransfer_phenomena]]
| Complete documentation of system
|-
| Weeks 11 and 12
August 5-18
| Buffer zone
|
These weeks will be used for one of the following, depending on preceding weeks and discussions with mentors:
* Make up for delays in prior weeks
* Convert another language pair
* Experiment with automated conversion of chunking rules
* Write a ruleset composer for generating a preliminary ruleset from two other pairs (e.g. combine eng->spa and spa->cat to get approximate rules for eng->cat)
| TBD
|-
| '''final evaluation'''
| Project done
|
| Complete, fully documented system with full ruleset for at least one language pair
|}
== Community Bonding ==
=== Todo list ===
* <s>Determine exact semantics of lexical unit tag-matching</s>
** <s>Are they ordered?</s>
** <s>Are they consecutive?</s>
* See if anyone has input on formalism syntax in general
* Mechanism for clitic-insertion
** e.g. V2, Wackernagel
* Read about GLR parser algorithms
** Find reading materials
** Is there anything that can be done to make this finite-state? (probably not)
** Should we just start with the naive implementation (what the Python script does) as a baseline?
* <s>Conjoined lexical units - just treat as consecutive elements with no blank between?</s>
* <s>Syntax for mapping between sets of tags (e.g. <o3pl> -> <p3><pl>, <o3sg> -> <p3><sg>)</s>
* <s>Conditional output (e.g. modal verbs in English)</s>
* Make sure all syntax is written down
* Begin writing tests
* Some way to match absence of a tag

May 25: LU tags are unordered (basically, every tag operation has the same semantics as <clip>, <equal><clip>..., or <let><clip>... in the chunker). Various other things have syntax but that syntax may not be properly documented yet.
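
As a rough illustration of the unordered-tag semantics described in the May 25 note, here is a minimal Python sketch. It is not the project's implementation: the function names <code>tags_match</code> and <code>expand_tags</code> and the <code>TAG_MAP</code> table are hypothetical, and the behaviour shown (set-based matching, an optional list of tags that must be absent, and a one-to-many tag mapping such as <o3pl> -> <p3><pl>) is only one possible reading of the todo items above.

```python
def tags_match(lu_tags, required, forbidden=()):
    """Return True if every required tag occurs somewhere in lu_tags
    (order and adjacency are ignored) and no forbidden tag occurs."""
    present = set(lu_tags)
    return set(required) <= present and present.isdisjoint(forbidden)


# Hypothetical mapping table, taken from the example in the todo list.
TAG_MAP = {
    "o3pl": ["p3", "pl"],
    "o3sg": ["p3", "sg"],
}


def expand_tags(lu_tags, mapping=TAG_MAP):
    """Replace each mapped tag by its expansion and keep all other tags."""
    out = []
    for tag in lu_tags:
        out.extend(mapping.get(tag, [tag]))
    return out
```

For example, <code>tags_match(["n", "m", "pl"], ["pl", "n"])</code> succeeds even though the required tags are listed in a different order, and passing <code>forbidden=["pl"]</code> gives one way to express "match only if the tag is absent".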
== Week 1 ==
Revision as of 22:04, 25 May 2019