Difference between revisions of "User:Popcorndude/Recursive Transfer/Progress"
Popcorndude (→Week 1: update for yesterday)
Popcorndude (→Week 1: idea about multi-output rules)
Revision as of 13:27, 29 May 2019
Work Plan (from proposal)
| Time Period | Goal | Details | Deliverable |
|---|---|---|---|
| Community Bonding Period (May 6-26) | Finalize formalism | | Full description of planned formalism |
| Week 1 (May 27-June 2) | Begin parser | | Minimal parser |
| Week 2 (June 3-9) | Add variables | | Minimal parser with agreement |
| Week 3 (June 10-16) | Test with eng->spa | | Simple eng->spa parser |
| Week 4 (June 17-23) | Continue parser | | Majority of initial specifications implemented |
| evaluation 1 | Basic parser done | Parser-generator compliant with majority of initial specifications and rudimentary eng->spa instantiation | |
| Week 5 (June 24-30) | Finish parser and continue eng->spa | | Fully implemented parser and working eng->spa for simple sentences |
| Week 6 (July 1-7) | Finish eng->spa and write reverser | | System comparison and rule-reverser |
| Week 7 (July 8-14) | Evaluation and testing | | Test suite and report on the general effectiveness of direct rule-reversal |
| Week 8 (July 15-21) | Optimization and interface | | Command-line interfaces and updated system comparison |
| evaluation 2 | Complete program | Optimized and polished parser-generator compliant with initial specifications, and complete eng->spa transfer rules | |
| Week 9 (July 22-28) | Do spa->eng | | Working spa->eng rules and report on the usefulness of rule-reverser |
| Week 10 (July 29-August 4) | Documentation | | Complete documentation of system |
| Weeks 11 and 12 (August 5-18) | Buffer zone | These weeks will be used for one of the following, depending on preceding weeks and discussions with mentors: | TBD |
| final evaluation | Project done | Complete, fully documented system with full ruleset for at least one language pair | |
Community Bonding
Todo list
- Determine exact semantics of lexical unit tag-matching
  - Are they ordered?
  - Are they consecutive?
- See if anyone has input on formalism syntax in general
- Mechanism for clitic-insertion
  - e.g. V2, Wackernagel
- Read about GLR parser algorithms
  - Find reading materials
  - Is there anything that can be done to make this finite-state? (probably not)
  - Should we just start with the naive implementation (what the Python script does) as a baseline?
- Conjoined lexical units - just treat as consecutive elements with no blank between?
- Syntax for mapping between sets of tags (e.g. <o3pl> -> <p3><pl>, <o3sg> -> <p3><sg>)
- Conditional output (e.g. modal verbs in English)
- Make sure all syntax is written down
- Begin writing tests
- Some way to match absence of a tag
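
To make the tag-mapping item concrete, here is a minimal Python sketch of what such a mapping could do once a syntax for it exists. The table `TAG_MAP` and the function `expand_tags` are hypothetical names for illustration only. They are not part of the formalism, and the two entries are just the examples from the list above.

```python
# Hypothetical mapping table: one source tag expands to several target tags.
# The two entries are the examples from the todo list; nothing else is assumed.
TAG_MAP = {
    "o3pl": ("p3", "pl"),
    "o3sg": ("p3", "sg"),
}


def expand_tags(tags):
    """Replace each mapped tag by its expansion; leave other tags alone."""
    expanded = []
    for tag in tags:
        expanded.extend(TAG_MAP.get(tag, (tag,)))
    return expanded
```

For example, `expand_tags(["vblex", "o3pl"])` yields `["vblex", "p3", "pl"]`. How the real formalism will spell this is still an open item.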
 
May 25: LU tags are unordered (basically, every tag operation has the same semantics as <clip>, <equal><clip>..., or <let><clip>... in the chunker). Various other things have syntax but that syntax may not be properly documented yet.
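
A rough Python sketch of what "unordered" means in practice: a pattern matches a lexical unit if every required tag is present somewhere, regardless of position. The helpers `parse_lu` and `matches`, and the simplified `lemma<tag><tag>` input without stream delimiters, are assumptions made for this illustration and are not taken from the actual implementation. The `forbidden` parameter is one possible reading of the "match absence of a tag" todo item.

```python
def parse_lu(lu):
    """Split a simplified LU such as 'dog<n><pl>' into (lemma, tag set)."""
    lemma, _, rest = lu.partition("<")
    tags = frozenset(t for t in rest.rstrip(">").split("><") if t)
    return lemma, tags


def matches(lu, required, forbidden=()):
    """True if all required tags occur (in any order) and no forbidden tag does."""
    _, tags = parse_lu(lu)
    return set(required) <= tags and not (set(forbidden) & tags)
```

With this, `matches("dog<n><pl>", ["pl", "n"])` and `matches("dog<pl><n>", ["n", "pl"])` are both true, since only set membership is checked.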
Week 1
May 27: As it turns out, I should have accounted for building a compiler/parser-generator in the workplan. We can currently mostly parse the file. Todo for tomorrow: finish parsing reduction rules and add some more error messages.
May 28: Can now parse a basic test file and we're maybe 2/3 of the way through generating the LR parsing table (multiple output nodes may be more complicated than I thought).
May 29: Idea: in a rule like
clitic NP -> @adj @cli @n {2} {1 _1 _2 3};
The parser just has to treat NP as both a terminal and a non-terminal. When it applies this rule, it can reduce clitic and then pretend that NP is the next token in the stream.
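
The push-back idea can be sketched in Python as a toy shift-reduce loop. This is not the LR table generator itself. The node representation (`(kind, value)` tuples), the function names, and the choice to drop the blank positions `_1` and `_2` from the NP node are all assumptions made for the sake of a short example.

```python
from collections import deque


def apply_multi_output(stack, tokens):
    """Toy version of: clitic NP -> @adj @cli @n {2} {1 _1 _2 3};

    If the stack ends in adj, cli, n, reduce them to a clitic node and
    push the NP node back onto the front of the input, so that the
    parser sees NP as if it were the next token in the stream.
    """
    if [kind for kind, _ in stack[-3:]] != ["adj", "cli", "n"]:
        return False
    adj, cli, noun = stack[-3:]
    del stack[-3:]
    stack.append(("clitic", [cli]))         # first output node: {2}
    tokens.appendleft(("NP", [adj, noun]))  # second output node, blanks omitted
    return True


def parse(words):
    """Shift each token, trying the multi-output rule after every shift."""
    tokens = deque(words)
    stack = []
    while tokens:
        stack.append(tokens.popleft())  # shift (NP is shifted like a terminal)
        apply_multi_output(stack, tokens)
    return stack
```

Calling `parse([("adj", "big"), ("cli", "lo"), ("n", "house"), ("v", "is")])` leaves a `clitic` node, then an `NP` node, then the verb on the stack. In a full parser the re-shifted `NP` would then be available to any rule that expects `NP` on its right-hand side.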