Difference between revisions of "User:Popcorndude/Recursive Transfer/Progress"
Popcorndude (talk | contribs) |
Popcorndude (talk | contribs) |
||
Line 1: | Line 1: | ||
+ | == Work Plan (from [[User:Popcorndude/Recursive_Transfer | proposal]]) == |
||
+ | |||
+ | {| class="wikitable" border="1" |
||
+ | |- |
||
+ | ! Time Period |
||
+ | ! Goal |
||
+ | ! Details |
||
+ | ! Deliverable |
||
+ | |- |
||
+ | | Community Bonding Period |
||
+ | May 6-26 |
||
+ | | Finalize formalism |
||
+ | | |
||
+ | * Read up on GLR parsers |
||
+ | * Decide variable semantics and syntax |
||
+ | * See if there's a good way to handle interpolation (e.g. inserting clitics after first word of phrase) |
||
+ | | Full description of planned formalism |
||
+ | |- |
||
+ | | Week 1 |
||
+ | May 27-June 2 |
||
+ | | Begin parser |
||
+ | | |
||
+ | * Get input |
||
+ | * Match and build trees based on literal tags and attribute categories |
||
+ | | Minimal parser |
||
+ | |- |
||
+ | | Week 2 |
||
+ | June 3-9 |
||
+ | | Add variables |
||
+ | | |
||
+ | * Agreement |
||
+ | * Passing variables up the tree |
||
+ | * Setting variables for child nodes |
||
+ | | Minimal parser with agreement |
||
+ | |- |
||
+ | | Week 3 |
||
+ | June 10-16 |
||
+ | | Test with eng->spa |
||
+ | | |
||
+ | * Noun phrases (this was started in the coding challenge) |
||
+ | * Basic verb phrases (some agreement, if time) |
||
+ | | Simple eng->spa parser |
||
+ | |- |
||
+ | | Week 4 |
||
+ | June 17-23 |
||
+ | | Continue parser |
||
+ | | |
||
+ | * Weights |
||
+ | * Conditionals |
||
+ | * Multiple output nodes |
||
+ | * Anything else deemed necessary during Community Bonding or testing |
||
+ | | Majority of initial specifications implemented |
||
+ | |- |
||
+ | | '''evaluation 1''' |
||
+ | | Basic parser done |
||
+ | | |
||
+ | | Parser-generator compliant with majority of initial specifications and rudimentary eng->spa instantiation |
||
+ | |- |
||
+ | | Week 5 |
||
+ | June 24-30 |
||
+ | | Finish parser and continue eng->spa |
||
+ | | |
||
+ | * Finish anything left over from week 4 |
||
+ | * Finish verb phrases |
||
+ | | Fully implemented parser and working eng->spa for simple sentences |
||
+ | |- |
||
+ | | Week 6 |
||
+ | July 1-7 |
||
+ | | Finish eng->spa and write reverser |
||
+ | | |
||
+ | * Convert any remaining eng->spa rules |
||
+ | * Evaluate parser against chunking system |
||
+ | ** Metrics: accuracy, speed of parser, compilation speed |
||
+ | * Write script to automatically reverse a ruleset |
||
+ | ** All features currently described are at least in principle reversible |
||
+ | | System comparison and rule-reverser |
||
+ | |- |
||
+ | | Week 7 |
||
+ | July 8-14 |
||
+ | | Evaluation and testing |
||
+ | | |
||
+ | * Evaluate the output of the reverser against current spa->eng system |
||
+ | * Write tests for all features |
||
+ | * Begin adding error messages |
||
+ | | Test suite and report on the general effectiveness of direct rule-reversal |
||
+ | |- |
||
+ | | Week 8 |
||
+ | July 15-21 |
||
+ | | Optimization and interface |
||
+ | | |
||
+ | * Speed up the parser and compiler where possible |
||
+ | * Build interfaces for compiler, parser, and reverser |
||
+ | * Clean up code |
||
+ | * Re-evaluate speed |
||
+ | | Command-line interfaces and updated system comparison |
||
+ | |- |
||
+ | | '''evaluation 2''' |
||
+ | | Complete program |
||
+ | | |
||
+ | | Optimized and polished parser-generator compliant with initial specifications, and complete eng->spa transfer rules |
||
+ | |- |
||
+ | | Week 9 |
||
+ | July 22-28 |
||
+ | | Do spa->eng |
||
+ | | |
||
+ | * Identify differences between generated spa->eng and chunking spa->eng |
||
+ | * Fix generated spa->eng rules |
||
+ | * Report on effort required to correct reverser |
||
+ | | Working spa->eng rules and report on the usefulness of rule-reverser |
||
+ | |- |
||
+ | | Week 10 |
||
+ | July 29-August 4 |
||
+ | | Documentation |
||
+ | | |
||
+ | * Convert initial specifications to full documentation |
||
+ | * Write tutorial |
||
+ | * Write recipe book containing at least minimal examples of everything listed at [[User_talk:Popcorndude/Recursive_Transfer#Linguistic.2Ftransfer_phenomena]] |
||
+ | | Complete documentation of system |
||
+ | |- |
||
+ | | Weeks 11 and 12 |
||
+ | August 5-18 |
||
+ | | Buffer zone |
||
+ | | |
||
+ | These weeks will be used for one of the following, depending on preceding weeks and discussions with mentors: |
||
+ | * Make up for delays in prior weeks |
||
+ | * Converting another language pair |
||
+ | * Experimenting with automated conversion of chunking rules |
||
+ | * Writing a ruleset composer for generating a preliminary ruleset from two other pairs (e.g. combine eng->spa and spa->cat to get approximate rules for eng->cat) |
||
+ | | TBD |
||
+ | |- |
||
+ | | '''final evaluation''' |
||
+ | | Project done |
||
+ | | |
||
+ | | Complete, fully documented system with full ruleset for at least one language pair |
||
+ | |} |
||
== Community Bonding == |
== Community Bonding == |
||
=== Todo list === |
=== Todo list === |
||
− | * Determine exact semantics of lexical unit tag-matching |
+ | * <s>Determine exact semantics of lexical unit tag-matching</s> |
− | ** Are they ordered? |
+ | ** <s>Are they ordered?</s> |
− | ** Are they consecutive? |
+ | ** <s>Are they consecutive?</s> |
* See if anyone has input on formalism syntax in general |
* See if anyone has input on formalism syntax in general |
||
* Mechanism for clitic-insertion |
* Mechanism for clitic-insertion |
||
Line 11: | Line 146: | ||
** Is there anything that can be done to make this finite-state? (probably not) |
** Is there anything that can be done to make this finite-state? (probably not) |
||
** Should we just start with the naive implementation (what the Python script does) as a baseline? |
** Should we just start with the naive implementation (what the Python script does) as a baseline? |
||
− | * Conjoined lexical units - just treat as consecutive elements with no blank between? |
+ | * <s>Conjoined lexical units - just treat as consecutive elements with no blank between?</s> |
− | * Syntax for mapping between sets of tags (e.g. <o3pl> -> <p3><pl>, <o3sg> -> <p3><sg>) |
+ | * <s>Syntax for mapping between sets of tags (e.g. <o3pl> -> <p3><pl>, <o3sg> -> <p3><sg>)</s> |
− | * Conditional output (e.g. modal verbs in English) |
+ | * <s>Conditional output (e.g. modal verbs in English)</s> |
+ | * Make sure all syntax is written down |
||
+ | * Begin writing tests |
||
+ | * Some way to match absence of a tag |
||
+ | |||
+ | May 25: LU tags are unordered (basically, every tag operation has the same semantics as <clip>, <equal><clip>..., or <let><clip>... in the chunker). Various other things have syntax but that syntax may not be properly documented yet. |
||
== Week 1 == |
== Week 1 == |
Revision as of 22:04, 25 May 2019
Work Plan (from proposal)
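Time Period | Goal | Details | Deliverable |
---|---|---|---|
Community Bonding Period, May 6-26 | Finalize formalism | Read up on GLR parsers; decide variable semantics and syntax; see if there's a good way to handle interpolation (e.g. inserting clitics after first word of phrase) | Full description of planned formalism |
Week 1, May 27-June 2 | Begin parser | Get input; match and build trees based on literal tags and attribute categories | Minimal parser |
Week 2, June 3-9 | Add variables | Agreement; passing variables up the tree; setting variables for child nodes | Minimal parser with agreement |
Week 3, June 10-16 | Test with eng->spa | Noun phrases (this was started in the coding challenge); basic verb phrases (some agreement, if time) | Simple eng->spa parser |
Week 4, June 17-23 | Continue parser | Weights; conditionals; multiple output nodes; anything else deemed necessary during Community Bonding or testing | Majority of initial specifications implemented |
evaluation 1 | Basic parser done | | Parser-generator compliant with majority of initial specifications and rudimentary eng->spa instantiation |
Week 5, June 24-30 | Finish parser and continue eng->spa | Finish anything left over from week 4; finish verb phrases | Fully implemented parser and working eng->spa for simple sentences |
Week 6, July 1-7 | Finish eng->spa and write reverser | Convert any remaining eng->spa rules; evaluate parser against chunking system (metrics: accuracy, speed of parser, compilation speed); write script to automatically reverse a ruleset (all features currently described are at least in principle reversible) | System comparison and rule-reverser |
Week 7, July 8-14 | Evaluation and testing | Evaluate the output of the reverser against current spa->eng system; write tests for all features; begin adding error messages | Test suite and report on the general effectiveness of direct rule-reversal |
Week 8, July 15-21 | Optimization and interface | Speed up the parser and compiler where possible; build interfaces for compiler, parser, and reverser; clean up code; re-evaluate speed | Command-line interfaces and updated system comparison |
evaluation 2 | Complete program | | Optimized and polished parser-generator compliant with initial specifications, and complete eng->spa transfer rules |
Week 9, July 22-28 | Do spa->eng | Identify differences between generated spa->eng and chunking spa->eng; fix generated spa->eng rules; report on effort required to correct reverser | Working spa->eng rules and report on the usefulness of rule-reverser |
Week 10, July 29-August 4 | Documentation | Convert initial specifications to full documentation; write tutorial; write recipe book containing at least minimal examples of everything listed at User_talk:Popcorndude/Recursive_Transfer#Linguistic.2Ftransfer_phenomena | Complete documentation of system |
Weeks 11 and 12, August 5-18 | Buffer zone | These weeks will be used for one of the following, depending on preceding weeks and discussions with mentors: make up for delays in prior weeks; converting another language pair; experimenting with automated conversion of chunking rules; writing a ruleset composer for generating a preliminary ruleset from two other pairs (e.g. combine eng->spa and spa->cat to get approximate rules for eng->cat) | TBD |
final evaluation | Project done | | Complete, fully documented system with full ruleset for at least one language pair |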
Community Bonding
Todo list
- <s>Determine exact semantics of lexical unit tag-matching</s>
  - <s>Are they ordered?</s>
  - <s>Are they consecutive?</s>
- See if anyone has input on formalism syntax in general
- Mechanism for clitic-insertion
  - e.g. V2, Wackernagel
- Read about GLR parser algorithms
  - Find reading materials
  - Is there anything that can be done to make this finite-state? (probably not)
  - Should we just start with the naive implementation (what the Python script does) as a baseline?
- <s>Conjoined lexical units - just treat as consecutive elements with no blank between?</s>
- <s>Syntax for mapping between sets of tags (e.g. <o3pl> -> <p3><pl>, <o3sg> -> <p3><sg>)</s>
- <s>Conditional output (e.g. modal verbs in English)</s>
- Make sure all syntax is written down
- Begin writing tests
- Some way to match absence of a tag
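The tag-set mapping item in the list above (e.g. <o3pl> -> <p3><pl>) can be pictured as a lookup table applied to a lexical unit's tags. The following Python sketch is only an illustration of that idea, not the formalism's actual syntax; `TAG_MAP` and `expand_tags` are hypothetical names, and the `prn` tag in the usage note is just sample data.

```python
# Hypothetical illustration only: the names and data layout are assumptions,
# not part of the planned formalism.
TAG_MAP = {
    "o3pl": ["p3", "pl"],
    "o3sg": ["p3", "sg"],
}


def expand_tags(tags):
    """Replace each tag that has a mapping with its target tags; keep the rest."""
    out = []
    for tag in tags:
        # Unmapped tags pass through unchanged.
        out.extend(TAG_MAP.get(tag, [tag]))
    return out
```

For instance, `expand_tags(["prn", "o3pl"])` would turn the single object-agreement tag into separate person and number tags while leaving `prn` alone.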
May 25: LU tags are unordered (basically, every tag operation has the same semantics as <clip>, <equal><clip>..., or <let><clip>... in the chunker). Various other things have syntax but that syntax may not be properly documented yet.
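One way to read the May 25 note: if LU tags are unordered, a pattern matches when every required tag is present somewhere in the unit, regardless of position. Below is a minimal Python sketch under that assumption, also assuming a lexical unit is written as a lemma followed by angle-bracketed tags. `parse_lu` and `matches` are hypothetical helpers, and the `forbidden` parameter is only one possible answer to the open todo item about matching the absence of a tag.

```python
import re


def parse_lu(lu):
    """Split a unit such as 'casa<n><f><pl>' into (lemma, set of tags)."""
    lemma = lu.split("<", 1)[0]
    tags = set(re.findall(r"<([^<>]+)>", lu))
    return lemma, tags


def matches(lu, required, forbidden=()):
    """Order-insensitive match: all required tags present, no forbidden tag present."""
    _, tags = parse_lu(lu)
    return set(required) <= tags and not (set(forbidden) & tags)
```

Because the tags are held in a set, `matches("casa<n><f><pl>", ["pl", "n"])` succeeds even though the pattern lists the tags in a different order than the unit does.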