Difference between revisions of "User:Khannatanmai/GSoC2020Progress"

Revision as of 20:48, 17 May 2020

Work Plan: http://wiki.apertium.org/wiki/User:Khannatanmai/GSoC2020Proposal_Trimming#Work_Plan

To Do

Community Bonding Period (May 4 - June 1)

  • Run some experiments with the new stream format
  • Check error handling of secondary tags in transfer (stream error, empty prefix, etc.)

Ongoing

Community Bonding Period (May 4 - June 1)

  • Write tests to prepare for test driven development
  • Testcase: lemq comes from variable
  • Create test t1x file which covers all test cases.


Completed

Application Review Period (March 31 - May 3)

  • Compile all the discussion about the modification to the stream format (in talk pages)
  • Create dedicated page for the development of the new stream format: User:Khannatanmai/New_Apertium_stream_format
  • Go through the documentation again and read the wiki page for each module, to make sure I haven't missed anything in the overall working of Apertium, since I have never actually built a language pair.
  • http://wiki.apertium.org/wiki/User:Khannatanmai/New_Apertium_stream_format : Document modification to Apertium stream format (see talk pages for relevant discussion)
  • Document how much change is needed in which parsers and what the change is
  • Proof of Concept for the new format
  • Document the change needed in tokeniser, bidix lookup, and generation to include surface form: User:Khannatanmai/Eliminating_Dictionary_Trimming
  • Document all the proposed benefits of including secondary information


Community Bonding Period (May 4 - June 1)

  • Create a suitable development and debugging environment for the pipe (Xcode)
  • Modify transfer to pass secondary tags ahead. Updates: https://wiki.apertium.org/wiki/User:Khannatanmai/New_Apertium_stream_format#Progress
  • Modify generator to ignore secondary tags while matching
  • Deal with MLUs in generator, and special characters in sectags, etc.
  • Analyse the code of the parsers of the modules
  • Fix transfer behaviour with LUs with invariable parts and MLUs
  • Deal with secondary tags appearing before lemq when lemq comes from a variable
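
As a rough illustration of the generator item above ("ignore secondary tags while matching"), here is a minimal Python sketch. It is not the actual change made to the Apertium generator. It assumes, purely for illustration, that a secondary tag is written as <prefix:value> inside a lexical unit; the real syntax is defined on the New_Apertium_stream_format page and may differ. The function names and the example tag <sf:Dogs> are hypothetical.

```python
import re
from typing import List, Tuple

# ASSUMPTION: a secondary tag looks like <prefix:value> with a non-empty prefix.
# This syntax is hypothetical; see the stream format page for the real definition.
SECONDARY_TAG = re.compile(r"<[^<>:]+:[^<>]*>")


def strip_secondary_tags(lexical_unit: str) -> str:
    """Return the lexical unit with every assumed secondary tag removed."""
    return SECONDARY_TAG.sub("", lexical_unit)


def split_secondary_tags(lexical_unit: str) -> Tuple[str, List[str]]:
    """Separate the part used for matching from the secondary tags.

    The first element is what a matcher would look up; the second is the
    list of secondary tags that would be carried along unchanged.
    """
    return strip_secondary_tags(lexical_unit), SECONDARY_TAG.findall(lexical_unit)


if __name__ == "__main__":
    match_key, secondary = split_secondary_tags("dog<n><pl><sf:Dogs>")
    print(match_key)
    print(secondary)
```

Under this assumption, matching is done on the stripped form only, while the secondary tags are kept aside so they can be passed on. A tag with an empty prefix is left untouched by this pattern, which is one of the error cases listed under To Do.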