Difference between revisions of "User:SilentFlame/Progress"

 
(12 intermediate revisions by one other user not shown)
Line 2: Line 2:


==Current task==
===Interchunk===
===lttoolbox===
* Make lt-proc correctly disperse inline blanks onto each lexical unit until the next <code><nowiki>[</nowiki></code>
# Apply the changes made in transfer.cc to interchunk.cc
#* Check <code>git clone -b blank-handling https://github.com/unhammer/apertium</code>
* Task: Create a pull request to https://github.com/unhammer/lttoolbox/ with tests in https://github.com/unhammer/lttoolbox/tree/master/tests/lt_proc

#* Apply the diff (between that branch and master) from transfer.cc to interchunk.cc
#* Try to make it compile and run – report things that didn't seem to have a 1-1 correspondence
#* Write tests for interchunk, like those for transfer at https://github.com/unhammer/apertium/tree/blank-handling/tests


==TODO==
===hfst===
* Make hfst-proc correctly disperse inline blanks onto each lexical unit until the next <code><nowiki>[</nowiki></code>
* Task: Create a pull request to https://github.com/hfst/hfst/ with tests in https://github.com/hfst/hfst/tree/master/test/tools/
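
The dispersal described above can be illustrated with a small Python sketch. This is not how lt-proc or hfst-proc are implemented; it assumes well-formed input, treats any <code>[...]</code> that does not start with <code>[{</code> as a superblank, and uses a hypothetical stand-in for analysis that merely wraps each word in <code>^...$</code>.

```python
import re

# Token classes (assumed for illustration): inline blank, superblank, word, other text.
TOKEN = re.compile(r"(\[\{.*?\}\])|(\[.*?\])|(\w+)|([^\[\w]+)", re.S)

def disperse(stream):
    """Copy the current inline blank onto every word until the next '['."""
    out = []
    current = ""  # inline blank currently in effect, if any
    for inline, superblank, word, other in TOKEN.findall(stream):
        if inline:
            current = inline          # a new inline blank replaces the old one
        elif superblank:
            current = ""              # any other '[' ends the inline blank's scope
            out.append(superblank)
        elif word:
            # Stand-in for real analysis: just wrap the word as a lexical unit.
            out.append(f"{current}^{word}$")
        else:
            out.append(other)         # spaces and punctuation pass through
    return "".join(out)

print(disperse("[{<i>}]foo bar[<p>]baz"))
# prints: [{<i>}]^foo$ [{<i>}]^bar$[<p>]^baz$
```

Both <code>foo</code> and <code>bar</code> receive a copy of the inline blank, while <code>baz</code>, which follows the next <code>[</code>, does not.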


===Deformatters===
===transfer (non-chunking)===
* Test whether the current transfer.cc handles non-chunking/single-stage transfer correctly; if not, fix it
* Complete prototype HTML deformatters
* Task: PR to https://github.com/unhammer/apertium/ with tests showing that transfer.cc works for single-stage/non-chunking transfer with both inline and block-level blank handling, including a test that rules using misnumbered/missing b-elements do not mess up formatting
** Current prototype code at https://github.com/junaidiiith/apertium / https://github.com/junaidiiith/Apertium_Code and https://github.com/SilentFlame/apertium/
** Task: Create a clean pull request to https://github.com/unhammer with HTML deformatter and reformatter, including tests

===Reformatters===

* Make reformat turn inline-blanks back into real tags
** <nowiki>[{&lt;i&gt;}]foo [{&lt;i&gt;&lt;b&gt;}]bar</nowiki> should become &lt;i&gt;foo&lt;/i&gt; &lt;i&gt;&lt;b&gt;bar&lt;/b&gt;&lt;/i&gt;
** prototypes exist for this in https://github.com/junaidiiith/apertium / https://github.com/junaidiiith/Apertium_Code

===lttoolbox===
* Make lt-proc correctly disperse inline blanks onto each lexical unit until the next <code><nowiki>[</nowiki></code>


===postchunk===
(Should be done after interchunk is complete)

* Task: PR to https://github.com/unhammer/apertium/ including tests showing working postchunk blank handling, with a test that rules using misnumbered/missing b-elements do not mess up formatting


===etc===
* Ensure all other modules are fine with the new format for inline blanks
* Ensure all other modules are fine with the new format for inline blanks (e.g. cg-proc)
* Work on other deformatters (mediawiki? latex?)


Line 53: Line 46:
#* uncommenting apertium/transfer.cc:1259 <code> // delete[] format;</code> in the blank handling branch leads to a double-free – find out why and ensure we're correctly releasing memory
#* Install valgrind from your package manager or http://valgrind.org/, then compile your program with -O0 -g3, then run <code>valgrind -v --leak-check=full apertium/apertium-transfer</code> and read the output

===Interchunk===
Interchunk needs to ignore the "pos" argument to b elements, and output each superblank exactly once, preferably where the rule has a b element (if there are not enough b's, output the rest at the end of the rule).
Interchunk shouldn't have to deal with wordblanks, since we can't look inside chunks when in interchunk.

# Apply the changes made in transfer.cc to interchunk.cc
#* Check <code>git clone -b blank-handling https://github.com/unhammer/apertium</code>
#* Apply the <code>git diff 4c7c4f8f1b..2025182991</code> from transfer.cc to interchunk.cc
#* Try to make it compile and run – report things that didn't seem to have a 1-1 correspondence
#* Write tests for interchunk, like those for transfer at https://github.com/unhammer/apertium/tree/blank-handling/tests

===Deformatters===
* Complete prototype HTML deformatters
** Current prototype code at https://github.com/junaidiiith/apertium / https://github.com/junaidiiith/Apertium_Code and https://github.com/SilentFlame/apertium/
** Task: Create a clean pull request to https://github.com/unhammer with HTML deformatter and reformatter, including tests

===Reformatters===

* Make reformat turn inline-blanks back into real tags
** <nowiki>[{&lt;i&gt;}]foo [{&lt;i&gt;&lt;b&gt;}]bar</nowiki> should become &lt;i&gt;foo&lt;/i&gt; &lt;i&gt;&lt;b&gt;bar&lt;/b&gt;&lt;/i&gt;
** prototypes exist for this in https://github.com/junaidiiith/apertium / https://github.com/junaidiiith/Apertium_Code
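
As a rough illustration of the reformatting step, the following Python sketch restores real tags from inline blanks. It is not the prototype code linked above. It assumes each inline blank applies to exactly the one word that follows it, and that tag names can be read from the text right after each <code>&lt;</code>; the names <code>reformat</code>, <code>INLINE_BLANK</code> and <code>TAG_NAME</code> are made up for this example.

```python
import re

# An inline blank "[{<tag>...}]" followed by the single word it applies to (assumption).
INLINE_BLANK = re.compile(r"\[\{((?:<[^<>]+>)+)\}\]([^\s\[]+)")
# The tag name is whatever follows '<' up to whitespace, '/' or '>'.
TAG_NAME = re.compile(r"<([^\s<>/]+)")

def reformat(text):
    """Turn each inline blank back into opening and closing tags around its word."""
    def restore(match):
        opening, word = match.group(1), match.group(2)
        names = TAG_NAME.findall(opening)
        # Close the tags in reverse order so the nesting stays well-formed.
        closing = "".join(f"</{name}>" for name in reversed(names))
        return opening + word + closing
    return INLINE_BLANK.sub(restore, text)

print(reformat("[{<i>}]foo [{<i><b>}]bar"))
# prints: <i>foo</i> <i><b>bar</b></i>
```

This reproduces the example above, but wraps every word separately; merging adjacent words that share the same tags into one element would need an extra pass.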

Latest revision as of 20:11, 16 July 2017

Progress on Automatic_blank_handling

Current task

lttoolbox


TODO

hfst

transfer (non-chunking)

  • Test whether the current transfer.cc handles non-chunking/single-stage transfer correctly; if not, fix it
  • Task: PR to https://github.com/unhammer/apertium/ with tests showing that transfer.cc works for single-stage/non-chunking transfer with both inline and block-level blank handling, including a test that rules using misnumbered/missing b-elements do not mess up formatting

postchunk

(Should be done after interchunk is complete)

  • Task: PR to https://github.com/unhammer/apertium/ including tests showing working postchunk blank handling, with a test that rules using misnumbered/missing b-elements do not mess up formatting

etc

  • Ensure all other modules are fine with the new format for inline blanks (e.g. cg-proc)
  • Work on other deformatters (mediawiki? latex?)

Done

(Some of these are from coding challenges)

deformatting prototypes

  1. Make the HTML format handler apertium-deshtml turn "<i>foo <b>bar</b></i>" into "[{<i>}]foo [{<i><b>}]bar"
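
A minimal Python sketch of this conversion, for illustration only. It is not the apertium-deshtml implementation; it uses Python's standard html.parser, assumes a small hypothetical set of inline tags, drops tag attributes, and ignores block-level tags entirely.

```python
from html.parser import HTMLParser

# Assumed set of inline tags; the real deformatter may use a different list.
INLINE_TAGS = {"i", "b", "em", "strong", "u"}

class InlineBlankDeformatter(HTMLParser):
    """Replace inline tags with an inline blank in front of each text run."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []  # inline tags currently open, outermost first
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in INLINE_TAGS:
            self.stack.append(tag)  # attributes are dropped in this sketch

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching opening tag.
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if self.stack and data.strip():
            prefix = "[{" + "".join(f"<{t}>" for t in self.stack) + "}]"
            self.out.append(prefix + data)
        else:
            self.out.append(data)

def deformat(html):
    parser = InlineBlankDeformatter()
    parser.feed(html)
    parser.close()  # flushes any text still buffered by the parser
    return "".join(parser.out)

print(deformat("<i>foo <b>bar</b></i>"))
# prints: [{<i>}]foo [{<i><b>}]bar
```

Each text run is prefixed with the full stack of inline tags open at that point, which is why "bar" carries both <i> and <b>.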

pretransfer

transfer (chunker)

  1. Fix a memory bug
    • Uncommenting the line "// delete[] format;" at apertium/transfer.cc:1259 in the blank-handling branch leads to a double free – find out why and ensure we're correctly releasing memory
    • Install valgrind from your package manager or http://valgrind.org/, compile your program with -O0 -g3, then run "valgrind -v --leak-check=full apertium/apertium-transfer" and read the output

Interchunk

Interchunk needs to ignore the "pos" argument to b elements, and output each superblank exactly once, preferably where the rule has a b element (if there are not enough b's, output the rest at the end of the rule). Interchunk shouldn't have to deal with wordblanks, since we can't look inside chunks when in interchunk.

  1. Apply the changes made in transfer.cc to interchunk.cc
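
The superblank placement rule described above can be sketched in Python as follows. This is only an illustration, not the interchunk.cc logic: the rule output is modelled as a list of strings with a hypothetical marker B standing for a b element (its "pos" argument is ignored), and emitting a single space for a b element when no superblanks remain is an assumption the text above does not specify.

```python
B = object()  # hypothetical marker for a b element in a rule's output; "pos" is ignored

def place_superblanks(rule_output, superblanks):
    """Emit each superblank exactly once, at b elements first, leftovers at the end."""
    pending = list(superblanks)  # superblanks seen between the matched chunks, in order
    out = []
    for item in rule_output:
        if item is B:
            # Assumption: a b element with no superblank left yields a plain space.
            out.append(pending.pop(0) if pending else " ")
        else:
            out.append(item)
    out.extend(pending)  # not enough b's: output the rest at the end of the rule
    return "".join(out)

print(place_superblanks(["^a$", B, "^b$", "^c$"], ["[<br/>]", "[<hr/>]"]))
# prints: ^a$[<br/>]^b$^c$[<hr/>]
```

Here the rule has one b element but two superblanks, so the second superblank is appended after the last chunk and nothing is lost or duplicated.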

Deformatters

Reformatters