Progress on Automatic_blank_handling


Current task

lttoolbox

TODO

hfst

transfer (non-chunking)

  • Test whether the current transfer.cc handles non-chunking (single-stage) transfer correctly; if it does not, fix it
  • Task: open a PR against https://github.com/unhammer/apertium/ with tests showing that transfer.cc works for single-stage (non-chunking) transfer. The tests should cover both inline and block-level blank handling, and should check that rules with misnumbered or missing b elements do not break formatting

postchunk

(Should be done after interchunk is complete)

  • Task: open a PR against https://github.com/unhammer/apertium/ with tests showing that postchunk blank handling works, including a check that rules with misnumbered or missing b elements do not break formatting

etc

  • Ensure that all other modules (e.g. cg-proc) accept the new format for inline blanks
  • Work on other deformatters (MediaWiki? LaTeX?)

Done

(Some of these are from coding challenges)

deformatting prototypes

  1. Make the HTML format handler apertium-deshtml turn "<i>foo <b>bar</b></i>" into "[{<i>}]foo [{<i><b>}]bar"
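
The conversion above can be sketched in a few lines of Python. This is only an illustration of the intended input/output mapping, not the apertium-deshtml implementation: the function name inline_blanks is hypothetical, and the sketch assumes that every whitespace-separated word gets its own inline blank listing the currently open tags, and that the input contains only properly nested, paired inline tags (no self-closing tags, comments or block-level elements).

```python
import re

# Split the input into tags and text, keeping the tags (capturing group).
TAG = re.compile(r"(<[^>]+>)")


def inline_blanks(html: str) -> str:
    """Prefix each word with an inline blank naming the open inline tags.

    Hypothetical helper for illustration; assumes well-nested paired tags.
    """
    open_tags = []  # currently open tags, outermost first
    out = []
    for piece in TAG.split(html):
        if not piece:
            continue
        if piece.startswith("</"):
            # Closing tag: drop the innermost open tag.
            if open_tags:
                open_tags.pop()
        elif piece.startswith("<"):
            # Opening tag: remember it for the words that follow.
            open_tags.append(piece)
        else:
            # Text: attach the blank to every word, leave whitespace as-is.
            prefix = "[{" + "".join(open_tags) + "}]" if open_tags else ""
            out.append(re.sub(r"\S+", lambda m: prefix + m.group(0), piece))
    return "".join(out)


print(inline_blanks("<i>foo <b>bar</b></i>"))  # [{<i>}]foo [{<i><b>}]bar
```

Text outside any inline tag passes through unchanged, since there is nothing to record for it.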

pretransfer

transfer (chunker)

  1. Fix a memory bug
    • Uncommenting "// delete[] format;" at apertium/transfer.cc:1259 (in the blank-handling branch) leads to a double free; find out why and ensure that memory is released correctly
    • Install valgrind from your package manager or from http://valgrind.org/, compile your program with -O0 -g3, then run valgrind -v --leak-check=full apertium/apertium-transfer and read the output

Interchunk

Interchunk needs to ignore the "pos" attribute of b elements and output each superblank exactly once, preferably where the rule has a b element; if the rule does not have enough b elements, the remaining superblanks are output at the end of the rule. Interchunk should not have to deal with wordblanks, since it cannot look inside chunks.

  1. Port the changes made in transfer.cc to interchunk.cc
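
The superblank placement described above can be sketched as follows. This is a minimal Python illustration, not the interchunk.cc code: the function name place_superblanks and the representation of rule output (plain strings for chunks, ("b", pos) tuples for b elements) are hypothetical. The source does not say what a b element should emit once the superblanks have run out; the sketch assumes a single space.

```python
from collections import deque


def place_superblanks(rule_output, superblanks):
    """Emit each superblank exactly once, in input order.

    rule_output: list of chunk strings and ("b", pos) tuples (hypothetical
                 representation of what a rule's <out> produces).
    superblanks: superblanks seen between the matched input chunks.
    """
    pending = deque(superblanks)
    out = []
    for item in rule_output:
        if isinstance(item, tuple) and item[0] == "b":
            # "pos" (item[1]) is deliberately ignored: take the next unused
            # superblank. ASSUMPTION: emit a plain space if none are left.
            out.append(pending.popleft() if pending else " ")
        else:
            out.append(item)
    # Not enough b elements: flush the remaining superblanks at the end.
    out.extend(pending)
    return "".join(out)


# Misnumbered pos values do not reorder or duplicate the blanks.
print(place_superblanks(["^A$", ("b", 2), "^B$", ("b", 2), "^C$"], ["[x]", "[y]"]))
# A rule with no b element still outputs the superblank, at the end.
print(place_superblanks(["^A$", "^B$"], ["[x]"]))
```

Because the blanks are consumed from a queue rather than looked up by position, a misnumbered or missing b element can move a superblank but can never lose or repeat it.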

Deformatters

Reformatters